UPDATE: To help clarify what I'm asking, I have posted a little Java code below that gets the idea across.
A while ago I asked a question about an algorithm to break down a set of numbers: the idea was to give it a list of numbers (1,2,3,4,5) and a total (10), and it would figure out all the multiples of each number that add up to the total ('1*10', or '1*1,1*2,1*3,1*4', or '2*5', etc.). It was the first programming exercise I ever did, so it took me a while, but I got it working and now I want to see if I can scale it. The person who answered the original question said it was scalable, but I'm a bit confused about how to do that. The recursive part is where I'm stuck: scaling the part that combines all the results. (The table it refers to is not scalable either, but by applying caching I am able to make that part fast.)
I have the following algorithm (pseudo code):

// generates table
for i = 1 to k:
    for z = 0 to sum:
        for c = 1 to z / x_i:
            if T[z - c * x_i][i - 1] is true:
                set T[z][i] to true

// uses table to bring all the parts together
function RecursivelyListAllThatWork(k, sum) // Using last k variables, make sum
    /* Base case: If we've assigned all the variables correctly, list this
     * solution.
     */
    if k == 0:
        print what we have so far
        return
    /* Recursive step: Try all coefficients, but only if they work. */
    for c = 0 to sum / x_k:
        if T[sum - c * x_k][k - 1] is true:
            mark the coefficient of x_k to be c
            call RecursivelyListAllThatWork(k - 1, sum - c * x_k)
            unmark the coefficient of x_k
I'm really at a loss as to how to thread/multiprocess the RecursivelyListAllThatWork function. I know that if I send it a smaller K (an int for the total number of items in the list) it will process that subset, but I don't know how to handle results that combine items across subsets. For example, if the list is [1,2,3,4,5,6,7,8,9,10] and I send K=3, then only 1, 2, 3 get processed, which is fine, but what about results that need to include both 1 and 10? I have tried to modify the table (variable T) so that only the subset I want is in it, but that still doesn't work because, like the approach above, it handles a subset but cannot produce answers that require a wider range.
I don't need any code, just an explanation of how to conceptually break up this recursive step so that other cores/machines can be used.
UPDATE: I still can't figure out how to turn RecursivelyListAllThatWork into a Runnable. (I know technically how to do it, but I don't understand how to change the RecursivelyListAllThatWork algorithm so it can be run in parallel. The other parts are just here to make the example work; I only need to implement Runnable on the RecursivelyListAllThatWork method.) Here's the Java code:
import java.awt.Point;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class main
{
public static void main(String[] args)
{
System.out.println("starting..");
int target_sum = 100;
int[] data = new int[] { 10, 5, 50, 20, 25, 40 };
List T = tableGeneator(target_sum, data);
List<Integer> coeff = create_coeff(data.length);
RecursivelyListAllThatWork(data.length, target_sum, T, coeff, data);
}
private static List<Integer> create_coeff(int i) {
// TODO Auto-generated method stub
Integer[] integers = new Integer[i];
Arrays.fill(integers, 0);
List<Integer> integerList = Arrays.asList(integers);
return integerList;
}
private static void RecursivelyListAllThatWork(int k, int sum, List T, List<Integer> coeff, int[] data) {
// TODO Auto-generated method stub
if (k == 0) {
//# print what we have so far
for (int i = 0; i < coeff.size(); i++) {
System.out.println(data[i] + " = " + coeff.get(i));
}
System.out.println("*******************");
return;
}
Integer x_k = data[k-1];
// Recursive step: Try all coefficients, but only if they work.
for (int c = 0; c <= sum/x_k; c++) { //the c variable caps the percent
if (T.contains(new Point((sum - c * x_k), (k-1))))
{
// mark the coefficient of x_k to be c
coeff.set((k-1), c);
RecursivelyListAllThatWork((k - 1), (sum - c * x_k), T, coeff, data);
// unmark the coefficient of x_k
coeff.set((k-1), 0);
}
}
}
public static List tableGeneator(int target_sum, int[] data) {
List T = new ArrayList();
T.add(new Point(0, 0));
float max_percent = 1;
int R = (int) (target_sum * max_percent * data.length);
for (int i = 0; i < data.length; i++)
{
for (int s = -R; s < R + 1; s++)
{
int max_value = (int) Math.abs((target_sum * max_percent)
/ data[i]);
for (int c = 0; c < max_value + 1; c++)
{
if (T.contains(new Point(s - c * data[i], i)))
{
Point p = new Point(s, i + 1);
if (!T.contains(p))
{
T.add(p);
}
}
}
}
}
return T;
}
}
The general answer to multi-threading is to de-recursify a recursive implementation by using an explicit stack (LIFO or FIFO). When implementing such an algorithm, the number of threads is a fixed parameter of the algorithm (the number of cores, for instance).
To implement it, the language call stack is replaced by a stack that stores a context as a checkpoint whenever the tested condition ends the recursion; in your case that is either k == 0 or the coeff values matching the targeted sum.
After de-recursifying, a first implementation is to run multiple threads that consume the stack, BUT the stack access becomes a contention point because it may require synchronization.
A better, more scalable solution is to dedicate a stack to each thread, but an initial production of contexts for the stacks is required.
I propose a mixed approach: a first thread works recursively for a limited number of k values, used as a maximum recursion depth (2 for the small data set in the example, but I recommend 3 for larger ones). This first part then delegates the generated intermediate contexts to a pool of threads, which process the remaining k values with a non-recursive implementation. This code is not based on the complex algorithm you use, but on a rather "basic" implementation:
import java.util.Arrays;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
public class MixedParallel
{
// pre-requisite: sorted values !!
private static final int[] data = new int[] { 5, 10, 20, 25, 40, 50 };
// Context to store intermediate computation or a solution
static class Context {
int k;
int sum;
int[] coeff;
Context(int k, int sum, int[] coeff) {
this.k = k;
this.sum = sum;
this.coeff = coeff;
}
}
// Thread pool for parallel execution
private static ExecutorService executor;
// Queue to collect solutions
private static Queue<Context> solutions;
static {
final int numberOfThreads = 2;
executor =
new ThreadPoolExecutor(numberOfThreads, numberOfThreads, 1000, TimeUnit.SECONDS,
new LinkedBlockingDeque<Runnable>());
// concurrent because of multi-threaded insertions
solutions = new ConcurrentLinkedQueue<Context>();
}
public static void main(String[] args)
{
int target_sum = 100;
// result vector, init to 0
int[] coeff = new int[data.length];
Arrays.fill(coeff, 0);
mixedPartialSum(data.length - 1, target_sum, coeff);
executor.shutdown();
// System.out.println("Over. Dumping results");
while(!solutions.isEmpty()) {
Context s = solutions.poll();
printResult(s.coeff);
}
}
private static void printResult(int[] coeff) {
StringBuffer sb = new StringBuffer();
for (int i = coeff.length - 1; i >= 0; i--) {
if (coeff[i] > 0) {
sb.append(data[i]).append(" * ").append(coeff[i]).append(" ");
}
}
System.out.println(sb.append("from ").append(Thread.currentThread()));
}
private static void mixedPartialSum(int k, int sum, int[] coeff) {
int x_k = data[k];
for (int c = sum / x_k; c >= 0; c--) {
coeff[k] = c;
int[] newcoeff = Arrays.copyOf(coeff, coeff.length);
if (c * x_k == sum) {
//printResult(newcoeff);
solutions.add(new Context(0, 0, newcoeff));
continue;
} else if (k > 0) {
if (data.length - k < 2) {
mixedPartialSum(k - 1, sum - c * x_k, newcoeff);
// for loop on "c" goes on with previous coeff content
} else {
// no longer recursive. delegate to thread pool
executor.submit(new ComputePartialSum(new Context(k - 1, sum - c * x_k, newcoeff)));
}
}
}
}
static class ComputePartialSum implements Callable<Void> {
// queue with contexts to process
private Queue<Context> contexts;
ComputePartialSum(Context request) {
contexts = new ArrayDeque<Context>();
contexts.add(request);
}
public Void call() {
while(!contexts.isEmpty()) {
Context current = contexts.poll();
int x_k = data[current.k];
for (int c = current.sum / x_k; c >= 0; c--) {
current.coeff[current.k] = c;
int[] newcoeff = Arrays.copyOf(current.coeff, current.coeff.length);
if (c * x_k == current.sum) {
//printResult(newcoeff);
solutions.add(new Context(0, 0, newcoeff));
continue;
} else if (current.k > 0) {
contexts.add(new Context(current.k - 1, current.sum - c * x_k, newcoeff));
}
}
}
return null;
}
}
}
You can check which thread found and output each result, and verify that all of them are involved: the main thread in recursive mode, and the two threads from the pool in context-stack mode.
Now this implementation is scalable when data.length is high:
- the maximum recursion depth keeps the work done in the main thread at a low level
- each thread from the pool works with its own context stack, without contention with the others
- the parameters to tune are now numberOfThreads and maxRecursionDepth
So the answer is yes, your algorithm can be parallelized. Here is a fully recursive implementation based on your code:
import java.awt.Point;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
public class OriginalParallel
{
static final int numberOfThreads = 2;
static final int maxRecursionDepth = 3;
public static void main(String[] args)
{
int target_sum = 100;
int[] data = new int[] { 50, 40, 25, 20, 10, 5 };
List T = tableGeneator(target_sum, data);
int[] coeff = new int[data.length];
Arrays.fill(coeff, 0);
RecursivelyListAllThatWork(data.length, target_sum, T, coeff, data);
executor.shutdown();
}
private static void printResult(int[] coeff, int[] data) {
StringBuffer sb = new StringBuffer();
for (int i = coeff.length - 1; i >= 0; i--) {
if (coeff[i] > 0) {
sb.append(data[i]).append(" * ").append(coeff[i]).append(" ");
}
}
System.out.println(sb.append("from ").append(Thread.currentThread()));
}
// Thread pool for parallel execution
private static ExecutorService executor;
static {
executor =
new ThreadPoolExecutor(numberOfThreads, numberOfThreads, 1000, TimeUnit.SECONDS,
new LinkedBlockingDeque<Runnable>());
}
private static void RecursivelyListAllThatWork(int k, int sum, List T, int[] coeff, int[] data) {
if (k == 0) {
printResult(coeff, data);
return;
}
Integer x_k = data[k-1];
// Recursive step: Try all coefficients, but only if they work.
for (int c = 0; c <= sum/x_k; c++) { //the c variable caps the percent
if (T.contains(new Point((sum - c * x_k), (k-1)))) {
// mark the coefficient of x_k to be c
coeff[k-1] = c;
if (data.length - k != maxRecursionDepth) {
RecursivelyListAllThatWork((k - 1), (sum - c * x_k), T, coeff, data);
} else {
// delegate to thread pool when reaching depth 3
int[] newcoeff = Arrays.copyOf(coeff, coeff.length);
executor.submit(new RecursiveThread(k - 1, sum - c * x_k, T, newcoeff, data));
}
// unmark the coefficient of x_k
coeff[k-1] = 0;
}
}
}
static class RecursiveThread implements Callable<Void> {
int k;
int sum;
int[] coeff;
int[] data;
List T;
RecursiveThread(int k, int sum, List T, int[] coeff, int[] data) {
this.k = k;
this.sum = sum;
this.T = T;
this.coeff = coeff;
this.data = data;
System.out.println("New job for k=" + k);
}
public Void call() {
RecursivelyListAllThatWork(k, sum, T, coeff, data);
return null;
}
}
public static List tableGeneator(int target_sum, int[] data) {
List T = new ArrayList();
T.add(new Point(0, 0));
float max_percent = 1;
int R = (int) (target_sum * max_percent * data.length);
for (int i = 0; i < data.length; i++) {
for (int s = -R; s < R + 1; s++) {
int max_value = (int) Math.abs((target_sum * max_percent) / data[i]);
for (int c = 0; c < max_value + 1; c++) {
if (T.contains(new Point(s - c * data[i], i))) {
Point p = new Point(s, i + 1);
if (!T.contains(p)) {
T.add(p);
}
}
}
}
}
return T;
}
}
1) Instead of
if k == 0:
print what we have so far
return
you can check to see how many coefficients are non-zero; if that count is greater than a certain threshold (3 in your example), then just don't print it. (Hint: this would be closely related to the
mark the coefficient of x_k to be c
line.)
2) Recursive functions are generally exponential in nature, which means that as you scale higher, the runtime will grow sharply larger.
With that in mind, you can apply multithreading to both calculating the table and the recursive function.
When considering the table, think about which parts of the loop affect each other and must be done in sequence; the converse, of course, is finding which parts don't affect each other and can be run in parallel.
As for the recursive function, your best bet would probably be to apply the multithreading to the branching part.
The key to making this multithreaded is just to make sure that you don't have unnecessary global data structures, like your "marks" on the coefficients.
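To illustrate that point, here is a minimal sketch (my own code, not the poster's; the table type, class name and fork depth are assumptions) of how giving each branch a private copy of the coefficient array removes the shared "marks" and lets branches be handed to an ExecutorService:

import java.awt.Point;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;

public class ForkSketch {
    // Each branch gets its own copy of coeff, so there is nothing to "mark"
    // and "unmark", and any branch can safely run on another thread.
    static void listAllThatWork(int k, int sum, List<Point> T, int[] coeff,
                                int[] data, ExecutorService pool, int forkDepth) {
        if (k == 0) {
            System.out.println(Arrays.toString(coeff));
            return;
        }
        int xk = data[k - 1];
        for (int c = 0; c <= sum / xk; c++) {
            if (!T.contains(new Point(sum - c * xk, k - 1))) continue;
            int[] branch = Arrays.copyOf(coeff, coeff.length);
            branch[k - 1] = c;
            int nextSum = sum - c * xk;
            if (forkDepth > 0) {
                // fork only the first few levels; deeper levels stay on the calling thread
                pool.submit(() -> listAllThatWork(k - 1, nextSum, T, branch, data, pool, forkDepth - 1));
            } else {
                listAllThatWork(k - 1, nextSum, T, branch, data, pool, 0);
            }
        }
    }
}

The table T is assumed to be built exactly as in the question; the sketch only shows how the recursive step can fork once the shared coefficient state is gone.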
Let's say you have K numbers n[0] ... n[K-1] in your table and the sum you want to reach is S. I assume below that the array n[] is sorted from smallest to largest number.
A simple enumeration algorithm is shown below; i is the index into the list of numbers, s is the current sum built so far, and cs is the list of coefficients for the numbers 0 .. i - 1:
function enumerate(i, s, cs):
    if (s == S):
        output_solution(cs)
    else if (i == K):
        return // dead end
    else if ((S - s) < n[i]):
        return // no solution can be found
    else:
        for c in 0 .. floor((S - s) / n[i]): // note: floor(...) > 0
            enumerate(i + 1, s + c * n[i], append(cs, c))
To run the process:
enumerate(0, 0, make_empty_list())
Now there are no global data structures anymore, except the table n[] (constant data), and 'enumerate' does not return anything either, so you can change any recursive call to run in its own thread at will. E.g. you can spawn a new thread for a recursive enumerate() call unless you have too many threads running already, in which case you wait.
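A hedged Java translation of that idea (my own sketch; the fixed-size pool and the choice to fork only the top level are assumptions standing in for "wait when too many threads are running"):

import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EnumerateSketch {
    static final int[] n = {5, 10, 20, 25, 40, 50}; // sorted, smallest first
    static final int S = 100;
    static final ExecutorService pool = Executors.newFixedThreadPool(4);

    // cs holds coefficients for n[0 .. i-1]; every call works on its own copy,
    // so there is no shared state between branches.
    static void enumerate(int i, int s, int[] cs) {
        if (s == S) { System.out.println(Arrays.toString(cs)); return; }
        if (i == n.length || S - s < n[i]) return; // dead end
        for (int c = 0; c <= (S - s) / n[i]; c++) {
            int[] next = Arrays.copyOf(cs, cs.length + 1);
            next[cs.length] = c;
            int ni = i + 1, ns = s + c * n[i];
            if (i == 0) {
                pool.submit(() -> enumerate(ni, ns, next)); // top-level branches run in the pool
            } else {
                enumerate(ni, ns, next);
            }
        }
    }

    public static void main(String[] args) {
        enumerate(0, 0, new int[0]);
        pool.shutdown(); // already submitted tasks still finish
    }
}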
Related
I have a task about finding the Fibonacci sequence for my independent project in Java. Here are the methods I use:
private static long getFibonacci(int n) {
switch (n) {
case 0:
return 0;
case 1:
return 1;
default:
return (getFibonacci(n-1)+getFibonacci(n-2));
}
}
private static long getFibonacciSum(int n) {
long result = 0;
while(n >= 0) {
result += getFibonacci(n);
n--;
}
return result;
}
private static boolean isInFibonacci(long n) {
long a = 0, b = 1, c = 0;
while (c < n) {
c = a + b;
a = b;
b = c;
}
return c == n;
}
Here is the main method:
long key = getFibonacciSum(n);
System.out.println("Sum of all Fibonacci Numbers until Fibonacci[n]: "+key);
System.out.println(getFibonacci(n)+" is Fibonacci[n]");
System.out.println("Is n2 in Fibonacci Sequence ?: "+isInFibonacci(n2));
The code is complete and working. But what if n or n2 is larger than usual (beyond roughly the 50th number in the Fibonacci sequence)? The code will effectively never finish. Are there any suggestions?
There is a way to calculate Fibonacci numbers instantaneously by using Binet's Formula
Algorithm:
function fib(n):
    root5 = squareroot(5)
    gr = (1 + root5) / 2
    igr = 1 - gr
    value = (power(gr, n) - power(igr, n)) / root5
    // round it to the closest integer since floating
    // point arithmetic cannot be trusted to give
    // perfect integer answers.
    return floor(value + 0.5)
Once you do this, you need to be aware of the programming language you're using and how it behaves. This will probably return a floating point decimal type, whereas integers are probably desired.
The complexity of this solution is O(1).
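For reference, a minimal Java sketch of that formula (my own code; it is only exact while double precision holds, roughly up to fib(70) or so):

public class BinetSketch {
    // Binet's formula with doubles; exact only for small n (roughly n <= 70).
    static long fib(int n) {
        double root5 = Math.sqrt(5);
        double gr = (1 + root5) / 2;  // golden ratio
        double igr = 1 - gr;          // its conjugate
        double value = (Math.pow(gr, n) - Math.pow(igr, n)) / root5;
        return (long) Math.floor(value + 0.5); // round to the closest integer
    }

    public static void main(String[] args) {
        System.out.println(fib(50)); // 12586269025
    }
}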
Yes, one improvement you can make is to getFibonacciSum(): instead of calling getFibonacci again and again, which recalculates everything from scratch each time, you can use the same kind of loop that isInFibonacci uses and accumulate the sum in one pass, something like:
private static int getFibonacciSum(int n) {
int a = 0, b = 1, c = 0, sum = 0;
while (c < n) {
c = a + b;
a = b;
sum += b;
b = c;
}
sum += c;
return sum;
}
Well, here goes my solution using a Map and some math formulas. (source:https://www.nayuki.io/page/fast-fibonacci-algorithms)
F(2k) = F(k)[2F(k+1)−F(k)]
F(2k+1) = F(k+1)^2+F(k)^2
It is also possible to implement it using lists instead of a map, but that is just reinventing the wheel.
With the iterative solution we don't have to worry about running out of memory, but it takes a lot of time to get fib(1000000), for example. With this solution we may run out of memory for very, very big inputs (like 10,000 billion), but it is much, much faster.
public BigInteger fib(BigInteger n) {
if (n.equals(BigInteger.ZERO))
return BigInteger.ZERO;
if (n.equals(BigInteger.ONE) || n.equals(BigInteger.valueOf(2)))
return BigInteger.ONE;
BigInteger index = n;
//we could have 2 Lists instead of a map
Map<BigInteger,BigInteger> termsToCalculate = new TreeMap<BigInteger,BigInteger>();
//add every index needed to calculate index n
populateMapWhitTerms(termsToCalculate, index);
termsToCalculate.put(n,null); //finally add n to map
Iterator<Map.Entry<BigInteger, BigInteger>> it = termsToCalculate.entrySet().iterator();//it
it.next(); //it = key number 1, contains fib(1);
it.next(); //it = key number 2, contains fib(2);
//map is ordered
while (it.hasNext()) {
Map.Entry<BigInteger, BigInteger> pair = (Entry<BigInteger, BigInteger>)it.next();//first it = key number 3
index = (BigInteger) pair.getKey();
if(index.remainder(BigInteger.valueOf(2)).equals(BigInteger.ZERO)) {
//index is divisible by 2
//F(2k) = F(k)[2F(k+1)−F(k)]
pair.setValue(termsToCalculate.get(index.divide(BigInteger.valueOf(2))).multiply(
(((BigInteger.valueOf(2)).multiply(
termsToCalculate.get(index.divide(BigInteger.valueOf(2)).add(BigInteger.ONE)))).subtract(
termsToCalculate.get(index.divide(BigInteger.valueOf(2)))))));
}
else {
//index is odd
//F(2k+1) = F(k+1)^2+F(k)^2
pair.setValue((termsToCalculate.get(index.divide(BigInteger.valueOf(2)).add(BigInteger.ONE)).multiply(
termsToCalculate.get(index.divide(BigInteger.valueOf(2)).add(BigInteger.ONE)))).add(
(termsToCalculate.get(index.divide(BigInteger.valueOf(2))).multiply(
termsToCalculate.get(index.divide(BigInteger.valueOf(2))))))
);
}
}
// fib(n) was calculated in the while loop
return termsToCalculate.get(n);
}
private void populateMapWhitTerms(Map<BigInteger, BigInteger> termsToCalculate, BigInteger index) {
if (index.equals(BigInteger.ONE)) { //stop
termsToCalculate.put(BigInteger.ONE, BigInteger.ONE);
return;
} else if(index.equals(BigInteger.valueOf(2))){
termsToCalculate.put(BigInteger.valueOf(2), BigInteger.ONE);
return;
} else if(index.remainder(BigInteger.valueOf(2)).equals(BigInteger.ZERO)) {
// index is divisible by 2
// FORMUMA: F(2k) = F(k)[2F(k+1)−F(k)]
// add F(k) key to termsToCalculate (the key is replaced if it is already there, we are working with a map here)
termsToCalculate.put(index.divide(BigInteger.valueOf(2)), null);
populateMapWhitTerms(termsToCalculate, index.divide(BigInteger.valueOf(2)));
// add F(k+1) to termsToCalculate
termsToCalculate.put(index.divide(BigInteger.valueOf(2)).add(BigInteger.ONE), null);
populateMapWhitTerms(termsToCalculate, index.divide(BigInteger.valueOf(2)).add(BigInteger.ONE));
} else {
// index is odd
// FORMULA: F(2k+1) = F(k+1)^2+F(k)^2
// add F(k+1) to termsToCalculate
termsToCalculate.put(((index.subtract(BigInteger.ONE)).divide(BigInteger.valueOf(2)).add(BigInteger.ONE)),null);
populateMapWhitTerms(termsToCalculate,((index.subtract(BigInteger.ONE)).divide(BigInteger.valueOf(2)).add(BigInteger.ONE)));
// add F(k) to termsToCalculate
termsToCalculate.put((index.subtract(BigInteger.ONE)).divide(BigInteger.valueOf(2)), null);
populateMapWhitTerms(termsToCalculate, (index.subtract(BigInteger.ONE)).divide(BigInteger.valueOf(2)));
}
}
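For comparison, here is a more compact hedged sketch of the same fast-doubling identities (my own code, not the answer above): it recurses directly on n/2 and memoizes in a HashMap instead of pre-populating the map of needed indices.

import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

public class FastDoublingSketch {
    private static final Map<Long, BigInteger> memo = new HashMap<>();

    // F(2k)   = F(k) * (2*F(k+1) - F(k))
    // F(2k+1) = F(k+1)^2 + F(k)^2
    static BigInteger fib(long n) {
        if (n == 0) return BigInteger.ZERO;
        if (n <= 2) return BigInteger.ONE;
        BigInteger cached = memo.get(n);
        if (cached != null) return cached;
        long k = n / 2;
        BigInteger fk = fib(k), fk1 = fib(k + 1);
        BigInteger result = (n % 2 == 0)
                ? fk.multiply(fk1.shiftLeft(1).subtract(fk))   // even index
                : fk1.multiply(fk1).add(fk.multiply(fk));      // odd index
        memo.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fib(1000)); // a 209-digit number, computed immediately
    }
}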
This method of solution is called dynamic programming. In this method we remember the previous results, so the CPU doesn't have to do any work to recompute the same value again and again.
class fibonacci
{
static int fib(int n)
{
/* Declare an array to store Fibonacci numbers. */
int f[] = new int[n+1];
int i;
/* 0th and 1st number of the series are 0 and 1*/
f[0] = 0;
f[1] = 1;
for (i = 2; i <= n; i++)
{
/* Add the previous 2 numbers in the series
and store it */
f[i] = f[i-1] + f[i-2];
}
return f[n];
}
public static void main (String args[])
{
int n = 9;
System.out.println(fib(n));
}
}
public static long getFib(final int index) {
long a=0,b=0,total=0;
for(int i=0;i<= index;i++) {
if(i==0) {
a=0;
total=a+b;
}else if(i==1) {
b=1;
total=a+b;
}
else if(i%2==0) {
total = a+b;
a=total;
}else {
total = a+b;
b=total;
}
}
return total;
}
I have checked all the solutions, and for me the quickest one is to use streams; this code can easily be modified to collect all Fibonacci numbers.
public static Long fibonaciN(long n){
return Stream.iterate(new long[]{0, 1}, a -> new long[]{a[1], a[0] + a[1]})
.limit(n)
.map(a->a[0])
.max(Long::compareTo)
.orElseThrow();
}
50, or just below 50, is as far as you can go with a straight recursive implementation. You can switch to iterative or dynamic programming (DP) approaches if you want to go much higher than that. I suggest learning about those from this: https://www.javacodegeeks.com/2014/02/dynamic-programming-introduction.html. And don't forget to look at the solution in the comment by David there; it is really efficient. The link shows how even n = 500000 can be computed almost instantly using the DP method. It also explains the concept of "memoization" for speeding up computation by storing intermediate (but later on re-callable) results.
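As a concrete illustration of the memoization idea mentioned above, here is a small hedged sketch (my own code): each value is stored the first time it is computed, so the naive recursion collapses to linear work.

import java.util.HashMap;
import java.util.Map;

public class MemoFibSketch {
    private static final Map<Integer, Long> cache = new HashMap<>();

    static long fib(int n) {
        if (n <= 1) return n;
        Long cached = cache.get(n);
        if (cached != null) return cached;   // reuse the stored result
        long value = fib(n - 1) + fib(n - 2);
        cache.put(n, value);                 // remember it for later calls
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fib(90)); // 2880067194370816120, still within the range of long
    }
}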
The Problem
Given a set of integers, find a subset of those integers which sum to 100,000,000.
Solution
I am attempting to build a tree containing all the combinations of the given set along with the sum. For example, if the given set looked like 0,1,2, I would build the following tree, checking the sum at each node:
{}
{} {0}
{} {1} {0} {0,1}
{} {2} {1} {1,2} {0} {2} {0,1} {0,1,2}
Since I keep both the array of integers at each node and the sum, I should only need the bottom (current) level of the tree in memory.
Issues
My current implementation will maintain the entire tree in memory and therefore uses way too much heap space.
How can I change my current implementation so that the GC will take care of my upper tree levels?
(At the moment I am just throwing a RuntimeException when I have found the target sum but this is obviously just for playing around)
public class RecursiveSolver {
static final int target = 100000000;
static final int[] set = new int[]{98374328, 234234123, 2341234, 123412344, etc...};
Tree initTree() {
return nextLevel(new Tree(null), 0);
}
Tree nextLevel(Tree currentLocation, int current) {
if (current == set.length) { return null; }
else if (currentLocation.sum == target) throw new RuntimeException(currentLocation.getText());
else {
currentLocation.left = nextLevel(currentLocation.copy(), current + 1);
Tree right = currentLocation.copy();
right.value = add(currentLocation.value, set[current]);
right.sum = currentLocation.sum + set[current];
currentLocation.right = nextLevel(right, current + 1);
return currentLocation;
}
}
int[] add(int[] array, int digit) {
if (array == null) {
return new int[]{digit};
}
int[] newValue = new int[array.length + 1];
for (int i = 0; i < array.length; i++) {
newValue[i] = array[i];
}
newValue[array.length] = digit;
return newValue;
}
public static void main(String[] args) {
RecursiveSolver rs = new RecursiveSolver();
Tree subsetTree = rs.initTree();
}
}
class Tree {
Tree left;
Tree right;
int[] value;
int sum;
Tree(int[] value) {
left = null;
right = null;
sum = 0;
this.value = value;
if (value != null) {
for (int i = 0; i < value.length; i++) sum += value[i];
}
}
Tree copy() {
return new Tree(this.value);
}
}
The time and space you need for building the tree here is zero: you don't need to build it at all.
The reason is that, given
- a node of the tree,
- the depth of the node, and
- the ordered array of input elements,
you can simply compute its parent, left, and right child nodes using O(1) operations. And you have access to each of those things while you're traversing the tree, so you don't need anything else.
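A hedged sketch of what that looks like (my own code and naming, with a tiny target and set so it runs as-is): a "node" is just the pair (depth, running sum) plus a bitmask of the chosen elements, and both children are computed on the fly, so no Tree objects are ever allocated.

public class ImplicitTreeSketch {
    static final int target = 100;                 // stand-in for 100,000,000
    static final int[] set = {5, 10, 20, 25, 40, 50};

    // A node is (depth, sum, mask); its children are computed in O(1) and
    // nothing but the O(depth) call stack is kept in memory.
    static void visit(int depth, int sum, long mask) {
        if (sum == target) {
            System.out.println("subset bits: " + Long.toBinaryString(mask));
            return;
        }
        if (depth == set.length) return;
        visit(depth + 1, sum, mask);                               // left child: exclude set[depth]
        visit(depth + 1, sum + set[depth], mask | (1L << depth));  // right child: include set[depth]
    }

    public static void main(String[] args) {
        visit(0, 0, 0L);
    }
}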
The problem is NP-complete.
If you really want to improve performance, then you have to forget about your tree implementation. You either have to just generate all the subsets and sum them up, or use dynamic programming.
The choice depends on the number of elements to sum and the sum you want to achieve. You know the sum is 100,000,000; the brute-force exponential algorithm runs in O(2^n * n) time, so it only makes sense for n below roughly 22.
In Python you can achieve this with a simple:

from itertools import chain, combinations

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
You can significantly improve this complexity (sacrificing memory) by using the meet-in-the-middle technique (read the wiki article). This decreases it to O(2^(n/2)), which means that it will perform better than the DP solution for n <~ 53.
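For completeness, a hedged sketch of the dynamic-programming option mentioned above, in Java (my own code): a boolean reachability table over all sums up to the target, which is exactly why it is a poor fit here, since the target is 100,000,000.

public class SubsetSumDP {
    // Classic O(t * n) subset-sum table: can[s] becomes true if some subset sums to s.
    static boolean reachable(int[] set, int target) {
        boolean[] can = new boolean[target + 1];
        can[0] = true;
        for (int value : set) {
            for (int s = target; s >= value; s--) { // go downwards so each value is used at most once
                if (can[s - value]) can[s] = true;
            }
        }
        return can[target];
    }

    public static void main(String[] args) {
        int[] set = {3, 34, 4, 12, 5, 2};
        System.out.println(reachable(set, 9)); // true: 4 + 5
    }
}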
After thinking more about erip's comments, I realized he is correct - I shouldn't be using a tree to implement this algorithm.
Brute force usually is O(n*2^n) because there are n additions for 2^n subsets. Because I only do one addition per node, the solution I came up with is O(2^n) where n is the size of the given set. Also, this algorithm is only O(n) space complexity. Since the number of elements in the original set in my particular problem is small (around 25) O(2^n) complexity is not too much of a problem.
The dynamic solution to this problem is O(t*n) where t is the target sum and n is the number of elements. Because t is very large in my problem, the dynamic solution ends up with a very long runtime and a high memory usage.
This completes my particular solution in around 311 ms on my machine, which is a tremendous improvement over the dynamic programming solutions I have seen for this particular class of problem.
public class TailRecursiveSolver {
public static void main(String[] args) {
final long starttime = System.currentTimeMillis();
try {
step(new Subset(null, 0), 0);
}
catch (RuntimeException ex) {
System.out.println(ex.getMessage());
final long endtime = System.currentTimeMillis();
System.out.println(endtime - starttime);
}
}
static final int target = 100000000;
static final int[] set = new int[]{ . . . };
static void step(Subset current, int counter) {
if (current.sum == target) throw new RuntimeException(current.getText());
else if (counter == set.length) {}
else {
step(new Subset(add(current.subset, set[counter]), current.sum + set[counter]), counter + 1);
step(current, counter + 1);
}
}
static int[] add(int[] array, int digit) {
if (array == null) {
return new int[]{digit};
}
int[] newValue = new int[array.length + 1];
for (int i = 0; i < array.length; i++) {
newValue[i] = array[i];
}
newValue[array.length] = digit;
return newValue;
}
}
class Subset {
int[] subset;
int sum;
Subset(int[] subset, int sum) {
this.subset = subset;
this.sum = sum;
}
public String getText() {
String ret = "";
for (int i = 0; i < (subset == null ? 0 : subset.length); i++) {
ret += " + " + subset[i];
}
if (ret.startsWith(" ")) {
ret = ret.substring(3);
ret = ret + " = " + sum;
} else ret = "null";
return ret;
}
}
EDIT -
The above code still runs in O(n*2^n) time, since the add method runs in O(n) time. The following code will run in true O(2^n) time, and is MUCH more performant, completing in around 20 ms on my machine.
It is limited to sets less than 64 elements due to storing the current subset as the bits in a long.
public class SubsetSumSolver {
static boolean found = false;
static final int target = 100000000;
static final int[] set = new int[]{ . . . };
public static void main(String[] args) {
step(0,0,0);
}
static void step(long subset, int sum, int counter) {
if (sum == target) {
found = true;
System.out.println(getText(subset, sum));
}
else if (!found && counter != set.length) {
step(subset + (1L << counter), sum + set[counter], counter + 1); // 1L: shift in 64 bits so sets of up to 64 elements work
step(subset, sum, counter + 1);
}
}
static String getText(long subset, int sum) {
String ret = "";
for (int i = 0; i < 64; i++) if((1 & (subset >> i)) == 1) ret += " + " + set[i];
if (ret.startsWith(" ")) ret = ret.substring(3) + " = " + sum;
else ret = "null";
return ret;
}
}
EDIT 2 -
Here is another version uses a meet in the middle attack, along with a little bit shifting in order to reduce the complexity from O(2^n) to O(2^(n/2)).
If you want to use this for sets with between 32 and 64 elements, you should change the int that represents the current subset in the step function to a long, although performance will obviously decrease drastically as the set size increases. If you want to use this for a set with an odd number of elements, you should add a 0 to the set to make its size even.
import java.util.ArrayList;
import java.util.List;
public class SubsetSumMiddleAttack {
static final int target = 100000000;
static final int[] set = new int[]{ ... };
static List<Subset> evens = new ArrayList<>();
static List<Subset> odds = new ArrayList<>();
static int[][] split(int[] superSet) {
int[][] ret = new int[2][superSet.length / 2];
for (int i = 0; i < superSet.length; i++) ret[i % 2][i / 2] = superSet[i];
return ret;
}
static void step(int[] superSet, List<Subset> accumulator, int subset, int sum, int counter) {
accumulator.add(new Subset(subset, sum));
if (counter != superSet.length) {
step(superSet, accumulator, subset + (1 << counter), sum + superSet[counter], counter + 1);
step(superSet, accumulator, subset, sum, counter + 1);
}
}
static void printSubset(Subset e, Subset o) {
String ret = "";
for (int i = 0; i < 32; i++) {
if (i % 2 == 0) {
if ((1 & (e.subset >> (i / 2))) == 1) ret += " + " + set[i];
}
else {
if ((1 & (o.subset >> (i / 2))) == 1) ret += " + " + set[i];
}
}
if (ret.startsWith(" ")) ret = ret.substring(3) + " = " + (e.sum + o.sum);
System.out.println(ret);
}
public static void main(String[] args) {
int[][] superSets = split(set);
step(superSets[0], evens, 0,0,0);
step(superSets[1], odds, 0,0,0);
for (Subset e : evens) {
for (Subset o : odds) {
if (e.sum + o.sum == target) printSubset(e, o);
}
}
}
}
class Subset {
int subset;
int sum;
Subset(int subset, int sum) {
this.subset = subset;
this.sum = sum;
}
}
Hi, I have the following method. It finds all the possible paths from the top left to the bottom right of an N x M matrix. I was wondering what the best way is to optimize it for speed, as it is a little slow right now. The resulting paths are then stored in a set.
EDIT: I forgot to clarify that you can only move down or right to an adjacent spot, with no diagonals from your current position.
For example
ABC
DEF
GHI
A path from the top left to bottom right would be ADEFI
static public void printPaths (String tempString, int i, int j, int m, int n, char [][] arr, HashSet<String> palindrome) {
String newString = tempString + arr[i][j];
if (i == m -1 && j == n-1) {
palindrome.add(newString);
return;
}
//right
if (j+1 < n) {
printPaths (newString, i, j+1, m, n, arr, palindrome);
}
//down
if (i+1 < m) {
printPaths (newString, i+1, j, m, n, arr, palindrome);
}
}
EDIT Here is the entirety of the code
public class palpath {
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new FileReader("palpath.in"));
PrintWriter pw = new PrintWriter(new BufferedWriter(new FileWriter("palpath.out")));
StringTokenizer st = new StringTokenizer(br.readLine());
int d = Integer.parseInt(st.nextToken());
char[][] grid = new char [d][d];
String index = null;
for(int i = 0; i < d; i++)
{
String temp = br.readLine();
index = index + temp;
for(int j = 0; j < d; j++)
{
grid[i][j] = temp.charAt(j);
}
}
br.close();
int counter = 0;
HashSet<String> set = new HashSet<String>();
printPaths ("", 0, 0, grid.length, grid[0].length, grid, set);
Iterator<String> it = set.iterator();
while(it.hasNext()){
String temp = it.next();
StringBuilder sb = new StringBuilder(temp).reverse();
if(temp.equals(sb.toString())) {
counter++;
}
}
pw.println(counter);
pw.close();
}
static public void printPaths (String tempString, int i, int j, int m, int n, char [][] arr, HashSet<String> palindrome) {
String newString = tempString + arr[i][j];
if (i == m -1 && j == n-1) {
palindrome.add(newString);
return;
}
//right
if (j+1 < n) {
printPaths (newString, i, j+1, m, n, arr, palindrome);
}
//down
if (i+1 < m) {
printPaths (newString, i+1, j, m, n, arr, palindrome);
}
}
Given a grid of size M x N, all paths from (0,0) to (M-1, N-1) that only involve rightward and downward moves are guaranteed to contain exactly (M-1) + (N-1) moves, a fixed number of them rightward and the rest downward.
This presents us with an interesting property: we can represent a path from (0,0) to (M-1, N-1) as a binary string (0 indicating a rightward move and 1 indicating a downward move).
So, the question becomes: how fast can we print out a list of permutations of that bit string?
Pretty fast.
public static void printPaths(char[][] arr) {
/* Get Smallest Bitstring (e.g. 0000...111) */
long current = 0;
for (int i = 0; i < arr.length - 1; i++) {
current <<= 1;
current |= 1;
}
/* Get Largest Bitstring (e.g. 111...0000) */
long last = current;
for (int i = 0; i < arr[0].length - 1; i++) {
last <<= 1;
}
while (current <= last) {
/* Print Path */
int x = 0, y = 0;
long tmp = current;
StringBuilder sb = new StringBuilder(arr.length + arr[0].length);
while (x < arr.length && y < arr[0].length) {
sb.append(arr[x][y]);
if ((tmp & 1) == 1) {
x++;
} else {
y++;
}
tmp >>= 1;
}
System.out.println(sb.toString());
/* Get Next Permutation */
tmp = (current | (current - 1)) + 1;
current = tmp | ((((tmp & -tmp) / (current & -current)) >> 1) - 1);
}
}
You spend a lot of time on string memory management.
Strings in Java are immutable, so you cannot change characters inside a String in place. Instead, use an array of char (or a StringBuilder) of length n + m - 1 as the single buffer, setting the (i+j)-th char at every step, and convert it to a String only at the end.
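Since Java strings are immutable, a hedged sketch of that suggestion looks like this (my own code): one shared char buffer of length m + n - 1 is filled in place during the recursion, and a String is only created when a complete path is reached.

import java.util.HashSet;
import java.util.Set;

public class CharBufferPaths {
    // Position i + j in the buffer is simply overwritten on each call, so no
    // intermediate String objects are built along the way.
    static void paths(char[][] arr, int i, int j, char[] buf, Set<String> out) {
        int m = arr.length, n = arr[0].length;
        buf[i + j] = arr[i][j];
        if (i == m - 1 && j == n - 1) {
            out.add(new String(buf));
            return;
        }
        if (j + 1 < n) paths(arr, i, j + 1, buf, out);
        if (i + 1 < m) paths(arr, i + 1, j, buf, out);
    }

    public static void main(String[] args) {
        char[][] grid = {{'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}};
        Set<String> out = new HashSet<>();
        paths(grid, 0, 0, new char[grid.length + grid[0].length - 1], out);
        out.forEach(System.out::println);
    }
}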
For a given size N x M of the array, all your paths have N+M-1 items (N+M-2 steps), so the first step of optimization is getting rid of the recursion: allocate an array and run the traversal with a while loop over an explicit stack.
Each partial path can be extended by one of two steps: right or down. So you can easily build an explicit stack holding the positions visited and the step taken at each position. Push the position (0,0) onto the stack with phase (step taken) 'none', then:
while stack not empty {
    if stack is full /* reached lower-right corner, path complete */ {
        print the path;
        pop;
    }
    else if stack.top.phase == none {
        stack.top.phase = right;
        try push right-neighbor with phase none;
    }
    else if stack.top.phase == right {
        stack.top.phase = down;
        try push down-neighbor with phase none;
    }
    else /* stack.top.phase == down */ {
        pop;
    }
}
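A hedged Java sketch of that explicit-stack walk (my own class and field names; it checks for the lower-right corner directly rather than for a "full" stack):

import java.util.ArrayDeque;
import java.util.Deque;

public class IterativePaths {
    enum Phase { NONE, RIGHT, DOWN }

    static final class Frame {
        final int i, j;
        Phase phase = Phase.NONE;
        Frame(int i, int j) { this.i = i; this.j = j; }
    }

    static void printPaths(char[][] arr) {
        int m = arr.length, n = arr[0].length;
        Deque<Frame> stack = new ArrayDeque<>();
        StringBuilder path = new StringBuilder();
        stack.push(new Frame(0, 0));
        path.append(arr[0][0]);
        while (!stack.isEmpty()) {
            Frame top = stack.peek();
            if (top.i == m - 1 && top.j == n - 1) {        // complete path
                System.out.println(path);
                stack.pop();
                path.deleteCharAt(path.length() - 1);
            } else if (top.phase == Phase.NONE) {          // try the right neighbor first
                top.phase = Phase.RIGHT;
                if (top.j + 1 < n) {
                    stack.push(new Frame(top.i, top.j + 1));
                    path.append(arr[top.i][top.j + 1]);
                }
            } else if (top.phase == Phase.RIGHT) {         // then the down neighbor
                top.phase = Phase.DOWN;
                if (top.i + 1 < m) {
                    stack.push(new Frame(top.i + 1, top.j));
                    path.append(arr[top.i + 1][top.j]);
                }
            } else {                                       // both directions explored
                stack.pop();
                path.deleteCharAt(path.length() - 1);
            }
        }
    }

    public static void main(String[] args) {
        printPaths(new char[][]{{'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}});
    }
}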
If you make a few observations about your requirements you can optimise this drastically.
There will be exactly (r-1)+(c-1) steps (where r = rows and c = columns).
There will be exactly (c-1) steps to the right and (r-1) steps down.
You therefore can use numbers where a zero bit could (arbitrarily) indicate a step across while a 1 bit indicates a step down. We can then merely iterate over all numbers of (r-1)+(c-1) bits containing just (r-1) bits set. There's a good algorithm for that at the Stanford Bit Twiddling site: Compute the lexicographically next bit permutation.
First, a BitPatternIterator I have used before. You could pull out the code in hasNext if you wish.
/**
* Iterates all bit patterns containing the specified number of bits.
*
* See "Compute the lexicographically next bit permutation" http://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation
*
* #author OldCurmudgeon
*/
public static class BitPattern implements Iterable<BigInteger> {
// Useful stuff.
private static final BigInteger ONE = BigInteger.ONE;
private static final BigInteger TWO = ONE.add(ONE);
// How many bits to work with.
private final int bits;
// Value to stop at. 2^max_bits.
private final BigInteger stop;
// All patterns of that many bits up to the specified number of bits.
public BitPattern(int bits, int max) {
this.bits = bits;
this.stop = TWO.pow(max);
}
@Override
public Iterator<BigInteger> iterator() {
return new BitPatternIterator();
}
/*
* From the link:
*
* Suppose we have a pattern of N bits set to 1 in an integer and
* we want the next permutation of N 1 bits in a lexicographical sense.
*
* For example, if N is 3 and the bit pattern is 00010011, the next patterns would be
* 00010101, 00010110, 00011001,
* 00011010, 00011100, 00100011,
* and so forth.
*
* The following is a fast way to compute the next permutation.
*/
private class BitPatternIterator implements Iterator<BigInteger> {
// Next to deliver - initially 2^n - 1 - i.e. first n bits set to 1.
BigInteger next = TWO.pow(bits).subtract(ONE);
// The last one we delivered.
BigInteger last;
@Override
public boolean hasNext() {
if (next == null) {
// Next one!
// t gets v's least significant 0 bits set to 1
// unsigned int t = v | (v - 1);
BigInteger t = last.or(last.subtract(BigInteger.ONE));
// Silly optimisation.
BigInteger notT = t.not();
// Next set to 1 the most significant bit to change,
// set to 0 the least significant ones, and add the necessary 1 bits.
// w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
// The __builtin_ctz(v) GNU C compiler intrinsic for x86 CPUs returns the number of trailing zeros.
next = t.add(ONE).or(notT.and(notT.negate()).subtract(ONE).shiftRight(last.getLowestSetBit() + 1));
if (next.compareTo(stop) >= 0) {
// Dont go there.
next = null;
}
}
return next != null;
}
@Override
public BigInteger next() {
last = hasNext() ? next : null;
next = null;
return last;
}
@Override
public void remove() {
throw new UnsupportedOperationException("Not supported.");
}
@Override
public String toString() {
return next != null ? next.toString(2) : last != null ? last.toString(2) : "";
}
}
}
Using that to iterate your solution:
public void allRoutes(char[][] grid) {
int rows = grid.length;
int cols = grid[0].length;
BitPattern p = new BitPattern(rows - 1, cols + rows - 2);
for (BigInteger b : p) {
//System.out.println(b.toString(2));
/**
* Walk all bits, taking a step right/down depending on it's set/clear.
*/
int x = 0;
int y = 0;
StringBuilder s = new StringBuilder(rows + cols);
for (int i = 0; i < rows + cols - 2; i++) {
s.append(grid[y][x]);
if (b.testBit(i)) {
y += 1;
} else {
x += 1;
}
}
s.append(grid[y][x]);
// That's a solution.
System.out.println("\t" + s);
}
}
public void test() {
char[][] grid = {{'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}};
allRoutes(grid);
char[][] grid2 = {{'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}, {'J', 'K', 'L'}};
allRoutes(grid2);
}
printing
ADGHI
ADEHI
ABEHI
ADEFI
ABEFI
ABCFI
ADGJKL
ADGHKL
ADEHKL
ABEHKL
ADGHIL
ADEHIL
ABEHIL
ADEFIL
ABEFIL
ABCFIL
which - to my mind - looks right.
I've just been looking at the following piece of code
package test;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class Main {
public static void main(final String[] args) {
final int sizeA = 3;
final int sizeB = 5;
final List<int[]> combos = getAllCombinations(sizeA-1, sizeB);
int counter = 1;
for(final int[] combo : combos) {
System.out.println("Combination " + counter);
System.out.println("--------------");
for(final int value : combo) {
System.out.print(value + " ");
}
System.out.println();
System.out.println();
++counter;
}
}
private static List<int[]> getAllCombinations(final int maxIndex, final int size) {
if(maxIndex >= size)
throw new IllegalArgumentException("The maximum index must be smaller than the array size.");
final List<int[]> result = new ArrayList<int[]>();
if(maxIndex == 0) {
final int[] array = new int[size];
Arrays.fill(array, maxIndex);
result.add(array);
return result;
}
//We'll create one array for every time the maxIndex can occur while allowing
//every other index to appear, then create every variation on that array
//by having every possible head generated recursively
for(int i = 1; i < size - maxIndex + 1; ++i) {
//Generating every possible head for the array
final List<int[]> heads = getAllCombinations(maxIndex - 1, size - i);
//Combining every head with the tail
for(final int[] head : heads) {
final int[] array = new int[size];
System.arraycopy(head, 0, array, 0, head.length);
//Filling the tail of the array with i maxIndex values
for(int j = 1; j <= i; ++j)
array[size - j] = maxIndex;
result.add(array);
}
}
return result;
}
}
I'm wondering, how do I eliminate recursion from this, so that it returns a single random combination, rather than a list of all possible combinations?
Thanks
If I understand your code correctly, your task is as follows: generate a random combination of the numbers 0 .. sizeA-1 of length sizeB where
the combination is sorted
each number occurs at least once
i.e. in your example e.g. [0,0,1,2,2].
If you want to have a single combination only I'd suggest another algorithm (pseudo-code):
Randomly choose the step-up positions (e.g. for sequence [0,0,1,1,2] it would be steps (1->2) & (3->4)) - we need sizeA-1 steps randomly chosen at sizeB-1 positions.
Calculate your target combination out of this vector
A quick-and-dirty implementation in Java looks as follows:
// Generate list 0,1,2,...,sizeB-2 of possible step-positions
List<Integer> steps = new ArrayList<Integer>();
for (int h = 0; h < sizeB-1; h++) {
steps.add(h);
}
// Randomly choose sizeA-1 elements
Collections.shuffle(steps);
steps = steps.subList(0, sizeA - 1);
Collections.sort(steps);
// Build result array
int[] result = new int[sizeB];
for (int h = 0, o = 0; h < sizeB; h++) {
result[h] = o;
if (o < steps.size() && steps.get(o) == h) {
o++;
}
}
Note: this can be optimized further; the first step generates a full random permutation and later strips it down to the desired size. It is therefore just for demonstration purposes, to show that the algorithm itself works as desired.
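A hedged sketch of that optimization (my own code): pick the sizeA-1 step positions directly with a partial Fisher-Yates shuffle instead of shuffling and then truncating the whole list.

import java.util.Arrays;
import java.util.Random;

public class RandomCombinationSketch {
    // Choose (sizeA - 1) distinct step positions out of (sizeB - 1) by running only
    // the first (sizeA - 1) swaps of a Fisher-Yates shuffle, then build the result.
    static int[] randomCombination(int sizeA, int sizeB, Random rnd) {
        int[] positions = new int[sizeB - 1];
        for (int i = 0; i < positions.length; i++) positions[i] = i;
        for (int i = 0; i < sizeA - 1; i++) {                  // partial shuffle
            int j = i + rnd.nextInt(positions.length - i);
            int tmp = positions[i]; positions[i] = positions[j]; positions[j] = tmp;
        }
        boolean[] stepAt = new boolean[sizeB - 1];
        for (int i = 0; i < sizeA - 1; i++) stepAt[positions[i]] = true;
        int[] result = new int[sizeB];
        int value = 0;
        for (int h = 0; h < sizeB; h++) {
            result[h] = value;
            if (h < sizeB - 1 && stepAt[h]) value++;           // step up after this position
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(randomCombination(3, 5, new Random())));
        // e.g. [0, 0, 1, 2, 2]
    }
}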
This appears to be homework. Without giving you code, here's an idea: call getAllCombinations, store the result in a List, and return a value from a random index in that list. As Howard pointed out in his comment to your question, eliminating recursion and returning a random combination are separate tasks.
I'm working on a puzzle that involves analyzing all size-k subsets and figuring out which one is optimal. I wrote a solution that works when the number of subsets is small, but it runs out of memory for larger problems. Now I'm trying to translate an iterative function written in Python to Java so that I can analyze each subset as it is created, keeping only the value that represents how optimal it is rather than the entire set, so that I won't run out of memory. Here is what I have so far, and it doesn't seem to finish even for very small problems:
public static LinkedList<LinkedList<Integer>> getSets(int k, LinkedList<Integer> set)
{
int N = set.size();
int maxsets = nCr(N, k);
LinkedList<LinkedList<Integer>> toRet = new LinkedList<LinkedList<Integer>>();
int remains, thresh;
LinkedList<Integer> newset;
for (int i=0; i<maxsets; i++)
{
remains = k;
newset = new LinkedList<Integer>();
for (int val=1; val<=N; val++)
{
if (remains==0)
break;
thresh = nCr(N-val, remains-1);
if (i < thresh)
{
newset.add(set.get(val-1));
remains --;
}
else
{
i -= thresh;
}
}
toRet.add(newset);
}
return toRet;
}
Can anybody help me debug this function or suggest another algorithm for iteratively generating size k subsets?
EDIT: I finally got this function working. I had to introduce a new variable, initialized to i, for the comparison against thresh, because Python handles for-loop indices differently: in the Java version, decrementing i inside the loop also corrupted the outer loop counter.
First, if you intend to do random access on a list, you should pick a list implementation that supports that efficiently. From the javadoc on LinkedList:
All of the operations perform as could be expected for a doubly-linked
list. Operations that index into the list will traverse the list from
the beginning or the end, whichever is closer to the specified index.
An ArrayList is both more space efficient and much faster for random access. Actually, since you know the length beforehand, you can even use a plain array.
Now to the algorithms. Let's start simple: how would you generate all subsets of size 1? Probably like this:
for (int i = 0; i < set.length; i++) {
int[] subset = {i};
process(subset);
}
Where process is a method that does something with the set, such as checking whether it is "better" than all subsets processed so far.
Now, how would you extend that to work for subsets of size 2? What is the relationship between subsets of size 2 and subsets of size 1? Well, any subset of size 2 can be turned into a subset of size 1 by removing its largest element. Put differently, each subset of size 2 can be generated by taking a subset of size 1 and adding a new element larger than all other elements in the set. In code:
void processSubset(int[] set) {
int[] subset = new int[2];
for (int i = 0; i < set.length; i++) {
subset[0] = set[i];
processLargerSets(set, subset, i);
}
}
void processLargerSets(int[] set, int[] subset, int i) {
for (int j = i + 1; j < set.length; j++) {
subset[1] = set[j];
process(subset);
}
}
For subsets of arbitrary size k, observe that any subset of size k can be turned into a subset of size k-1 by chopping off the largest element. That is, all subsets of size k can be generated by generating all subsets of size k - 1, and for each of these, and each value larger than the largest in the subset, adding that value to the set. In code:
static void processSubsets(int[] set, int k) {
int[] subset = new int[k];
processLargerSubsets(set, subset, 0, 0);
}
static void processLargerSubsets(int[] set, int[] subset, int subsetSize, int nextIndex) {
if (subsetSize == subset.length) {
process(subset);
} else {
for (int j = nextIndex; j < set.length; j++) {
subset[subsetSize] = set[j];
processLargerSubsets(set, subset, subsetSize + 1, j + 1);
}
}
}
Test code:
static void process(int[] subset) {
System.out.println(Arrays.toString(subset));
}
public static void main(String[] args) throws Exception {
int[] set = {1,2,3,4,5};
processSubsets(set, 3);
}
But before you invoke this on huge sets remember that the number of subsets can grow rather quickly.
You can use
org.apache.commons.math3.util.Combinations.
Example:
import java.util.Arrays;
import java.util.Iterator;
import org.apache.commons.math3.util.Combinations;
public class tmp {
public static void main(String[] args) {
for (Iterator<int[]> iter = new Combinations(5, 3).iterator(); iter.hasNext();) {
System.out.println(Arrays.toString(iter.next()));
}
}
}
Output:
[0, 1, 2]
[0, 1, 3]
[0, 2, 3]
[1, 2, 3]
[0, 1, 4]
[0, 2, 4]
[1, 2, 4]
[0, 3, 4]
[1, 3, 4]
[2, 3, 4]
Here is a combination iterator I wrote recently
package psychicpoker;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import static com.google.common.base.Preconditions.checkArgument;
public class CombinationIterator<T> implements Iterator<Collection<T>> {
private int[] indices;
private List<T> elements;
private boolean hasNext = true;
public CombinationIterator(List<T> elements, int k) throws IllegalArgumentException {
checkArgument(k<=elements.size(), "Impossible to select %d elements from hand of size %d", k, elements.size());
this.indices = new int[k];
for(int i=0; i<k; i++)
indices[i] = k-1-i;
this.elements = elements;
}
public boolean hasNext() {
return hasNext;
}
private int inc(int[] indices, int maxIndex, int depth) throws IllegalStateException {
if(depth == indices.length) {
throw new IllegalStateException("The End");
}
if(indices[depth] < maxIndex) {
indices[depth] = indices[depth]+1;
} else {
indices[depth] = inc(indices, maxIndex-1, depth+1)+1;
}
return indices[depth];
}
private boolean inc() {
try {
inc(indices, elements.size() - 1, 0);
return true;
} catch (IllegalStateException e) {
return false;
}
}
public Collection<T> next() {
Collection<T> result = new ArrayList<T>(indices.length);
for(int i=indices.length-1; i>=0; i--) {
result.add(elements.get(indices[i]));
}
hasNext = inc();
return result;
}
public void remove() {
throw new UnsupportedOperationException();
}
}
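A possible usage sketch for the iterator above (my own code; it assumes the class is visible on the classpath together with Guava, which provides the Preconditions import used in it):

import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class CombinationIteratorDemo {
    public static void main(String[] args) {
        List<Integer> elements = Arrays.asList(1, 2, 3, 4, 5);
        CombinationIterator<Integer> it = new CombinationIterator<>(elements, 3);
        while (it.hasNext()) {
            Collection<Integer> subset = it.next();
            System.out.println(subset); // prints each 3-element combination of 1..5
        }
    }
}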
I had the same problem today: generating all k-sized subsets of an n-sized set.
I had a recursive algorithm, written in Haskell, but the problem required that I wrote a new version in Java.
In Java, I thought I'd probably have to use memoization to optimize recursion. Turns out, I found a way to do it iteratively. I was inspired by this image, from Wikipedia, on the article about Combinations.
Method to calculate all k-sized subsets:
public static int[][] combinations(int k, int[] set) {
// binomial(N, K)
int c = (int) binomial(set.length, k);
// where all sets are stored
int[][] res = new int[c][Math.max(0, k)];
// the k indexes (from set) where the red squares are
// see image above
int[] ind = k < 0 ? null : new int[k];
// initialize red squares
for (int i = 0; i < k; ++i) { ind[i] = i; }
// for every combination
for (int i = 0; i < c; ++i) {
// get its elements (red square indexes)
for (int j = 0; j < k; ++j) {
res[i][j] = set[ind[j]];
}
// update red squares, starting by the last
int x = ind.length - 1;
boolean loop;
do {
loop = false;
// move to next
ind[x] = ind[x] + 1;
// if crossing boundaries, move previous
if (ind[x] > set.length - (k - x)) {
--x;
loop = x >= 0;
} else {
// update every following square
for (int x1 = x + 1; x1 < ind.length; ++x1) {
ind[x1] = ind[x1 - 1] + 1;
}
}
} while (loop);
}
return res;
}
Method for the binomial:
(Adapted from Python example, from Wikipedia)
private static long binomial(int n, int k) {
if (k < 0 || k > n) return 0;
if (k > n - k) { // take advantage of symmetry
k = n - k;
}
long c = 1;
for (int i = 1; i < k+1; ++i) {
c = c * (n - (k - i));
c = c / i;
}
return c;
}
Of course, combinations will always have the problem of space, as their number is likely to explode.
In the context of my own problem, the maximum possible is about 2,000,000 subsets. My machine calculated this in 1032 milliseconds.
Inspired by afsantos's answer :-)... I decided to write a C# .NET implementation to generate all subset combinations of a certain size from a full set. It doesn't need to calc the total number of possible subsets; it detects when it's reached the end. Here it is:
public static List<object[]> generateAllSubsetCombinations(object[] fullSet, ulong subsetSize) {
if (fullSet == null) {
throw new ArgumentException("Value cannot be null.", "fullSet");
}
else if (subsetSize < 1) {
throw new ArgumentException("Subset size must be 1 or greater.", "subsetSize");
}
else if ((ulong)fullSet.LongLength < subsetSize) {
throw new ArgumentException("Subset size cannot be greater than the total number of entries in the full set.", "subsetSize");
}
// All possible subsets will be stored here
List<object[]> allSubsets = new List<object[]>();
// Initialize current pick; will always be the leftmost consecutive x where x is subset size
ulong[] currentPick = new ulong[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
currentPick[i] = i;
}
while (true) {
// Add this subset's values to list of all subsets based on current pick
object[] subset = new object[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
subset[i] = fullSet[currentPick[i]];
}
allSubsets.Add(subset);
if (currentPick[0] + subsetSize >= (ulong)fullSet.LongLength) {
// Last pick must have been the final possible one; end of subset generation
break;
}
// Update current pick for next subset
ulong shiftAfter = (ulong)currentPick.LongLength - 1;
bool loop;
do {
loop = false;
// Move current picker right
currentPick[shiftAfter]++;
// If we've gotten to the end of the full set, move left one picker
if (currentPick[shiftAfter] > (ulong)fullSet.LongLength - (subsetSize - shiftAfter)) {
if (shiftAfter > 0) {
shiftAfter--;
loop = true;
}
}
else {
// Update pickers to be consecutive
for (ulong i = shiftAfter+1; i < (ulong)currentPick.LongLength; i++) {
currentPick[i] = currentPick[i-1] + 1;
}
}
} while (loop);
}
return allSubsets;
}
This solution worked for me:
private static void findSubsets(int array[])
{
int numOfSubsets = 1 << array.length;
for(int i = 0; i < numOfSubsets; i++)
{
int pos = array.length - 1;
int bitmask = i;
System.out.print("{");
while(bitmask > 0)
{
if((bitmask & 1) == 1)
System.out.print(array[pos]+",");
bitmask >>= 1;
pos--;
}
System.out.print("}");
}
}
Swift implementation:
Below are two variants on the answer provided by afsantos.
The first implementation of the combinations function mirrors the functionality of the original Java implementation.
The second implementation is a general case for finding all combinations of k values from the set [0, setSize). If this is really all you need, this implementation will be a bit more efficient.
In addition, they include a few minor optimizations and a smidgin logic simplification.
/// Calculate the binomial for a set with a subset size
func binomial(setSize: Int, subsetSize: Int) -> Int
{
if (subsetSize <= 0 || subsetSize > setSize) { return 0 }
// Take advantage of symmetry
var subsetSizeDelta = subsetSize
if (subsetSizeDelta > setSize - subsetSizeDelta)
{
subsetSizeDelta = setSize - subsetSizeDelta
}
// Early-out
if subsetSizeDelta == 0 { return 1 }
var c = 1
for i in 1...subsetSizeDelta
{
c = c * (setSize - (subsetSizeDelta - i))
c = c / i
}
return c
}
/// Calculates all possible combinations of subsets of `subsetSize` values within `set`
func combinations(subsetSize: Int, set: [Int]) -> [[Int]]?
{
// Validate inputs
if subsetSize <= 0 || subsetSize > set.count { return nil }
// Use a binomial to calculate total possible combinations
let comboCount = binomial(setSize: set.count, subsetSize: subsetSize)
if comboCount == 0 { return nil }
// Our set of combinations
var combos = [[Int]]()
combos.reserveCapacity(comboCount)
// Initialize the combination to the first group of set indices
var subsetIndices = [Int](0..<subsetSize)
// For every combination
for _ in 0..<comboCount
{
// Add the new combination
var comboArr = [Int]()
comboArr.reserveCapacity(subsetSize)
for j in subsetIndices { comboArr.append(set[j]) }
combos.append(comboArr)
// Update combination, starting with the last
var x = subsetSize - 1
while true
{
// Move to next
subsetIndices[x] = subsetIndices[x] + 1
// If crossing boundaries, move previous
if (subsetIndices[x] > set.count - (subsetSize - x))
{
x -= 1
if x >= 0 { continue }
}
else
{
for x1 in x+1..<subsetSize
{
subsetIndices[x1] = subsetIndices[x1 - 1] + 1
}
}
break
}
}
return combos
}
/// Calculates all possible combinations of subsets of `subsetSize` values within a set
/// of zero-based values for the set [0, `setSize`)
func combinations(subsetSize: Int, setSize: Int) -> [[Int]]?
{
// Validate inputs
if subsetSize <= 0 || subsetSize > setSize { return nil }
// Use a binomial to calculate total possible combinations
let comboCount = binomial(setSize: setSize, subsetSize: subsetSize)
if comboCount == 0 { return nil }
// Our set of combinations
var combos = [[Int]]()
combos.reserveCapacity(comboCount)
// Initialize the combination to the first group of elements
var subsetValues = [Int](0..<subsetSize)
// For every combination
for _ in 0..<comboCount
{
// Add the new combination
combos.append([Int](subsetValues))
// Update combination, starting with the last
var x = subsetSize - 1
while true
{
// Move to next
subsetValues[x] = subsetValues[x] + 1
// If crossing boundaries, move previous
if (subsetValues[x] > setSize - (subsetSize - x))
{
x -= 1
if x >= 0 { continue }
}
else
{
for x1 in x+1..<subsetSize
{
subsetValues[x1] = subsetValues[x1 - 1] + 1
}
}
break
}
}
return combos
}