Generating all subsets using Gosper's Hack (Bankers sequence) - java

I have a method that generates all subsets of an array. What I want to try to implement is the same sort of method, but done using binary. Gosper's Hack seems to be the best idea, but I have no idea how to implement it. The code below works and generates all subsets. The subsets can be unknown; http://imgur.com/KXflVjq shows the output after a couple of seconds of running. Thanks for any advice.
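int m = prop.length;
long list = 1L << m;                    // number of bit masks; long avoids overflow for m >= 31
for (long i = 1; i < list; i++) {
    final List<Integer> sub = new ArrayList<>();
    for (int j = 0; j < m; j++) {
        if ((i & (1L << j)) != 0) {     // bit j set: index j belongs to this subset
            sub.add(j);
        }
    }
    // indices are appended in increasing order, so no sort is needed
    System.out.println(sub);
}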
EDIT: As I have not worded the question correctly, what I need as output is:
2 1 0
0 0 1 = 0
0 1 0 = 1
etc.

First, I'd like to note that it's not clear what exactly you're trying to achieve; please consider clarifying the question. I'll assume that you'd like to generate all k-subsets of an n-set. The problem can easily be reduced to that of generating all k-subsets of {1,2,...,n} (i.e. it suffices to compute all k-subsets of indices).
An algorithm for generating k-subsets of an n-set
A while back I wrote this implementation of a method (which I rediscovered a few years ago) for generating all k-subsets of an n-set. Hope it helps. The algorithm essentially visits all binary sequences of length n containing exactly k ones in a clever way (without going through all 2^n sequences); see the accompanying note describing the algorithm, which contains a detailed description, pseudocode, and a small step-by-step example.
I think the time complexity is of the order O(k * C(n, k)), where C(n, k) is "n choose k". I do not yet have a formal proof for this. (It is obvious that any algorithm will have to take Omega(C(n, k)) time.)
The code in C:
#include <stdlib.h>
#include <stdio.h>

void subs(int n, int k);

int main(int argc, char **argv)
{
    if (argc != 3) return 1;
    int n, k;
    n = atoi(argv[1]); k = atoi(argv[2]);
    subs(n, k);
    return 0;
}

void subs(int n, int k)
{
    int *p = (int *)malloc(sizeof(int) * k);
    int i, j, r;
    for (i = 0; i < k; ++i) p[i] = i; // initialize our "set"
    // the algorithm
    while (1)
    {   // visit the current k-subset
        for (i = 0; i < k; ++i)
            printf("%d ", p[i] + 1);
        printf("\n");
        if (p[0] == n - k) break; // if this is the last k-subset, we are done
        for (i = k - 1; i >= 0 && p[i] + k - i == n; --i); // find the right element
        r = p[i]; ++p[i]; j = 2; // exchange them
        for (++i; i < k; ++i, ++j) p[i] = r + j; // move them
    }
    free(p);
}
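Since the question asks about Gosper's hack specifically, here is a minimal Java sketch of that bit trick. It is a different technique from the C routine above, but it also visits only the bit masks of length n that contain exactly k ones. The names GosperSubsets, next, kSubsets and indices are made up for illustration, and the sketch assumes 1 <= k <= n <= 62 so that every mask fits in a long.

```java
import java.util.ArrayList;
import java.util.List;

class GosperSubsets {
    // Gosper's hack: the next larger value with the same number of set bits.
    // Assumes x > 0 (for x == 0 the division below would divide by zero).
    static long next(long x) {
        long c = x & -x;                   // lowest set bit of x
        long r = x + c;                    // the carry ripples through the lowest run of ones
        return (((r ^ x) >>> 2) / c) | r;  // the rest of that run is moved to the bottom
    }

    // All masks of length n with exactly k ones, in increasing numeric order.
    // Returns an empty list outside the assumed range 1 <= k <= n <= 62.
    static List<Long> kSubsets(int n, int k) {
        List<Long> masks = new ArrayList<>();
        if (k < 1 || k > n || n > 62) {
            return masks;
        }
        long limit = 1L << n;
        for (long x = (1L << k) - 1; x < limit; x = next(x)) {
            masks.add(x);
        }
        return masks;
    }

    // Converts a mask into the list of indices whose bits are set.
    static List<Integer> indices(long mask) {
        List<Integer> out = new ArrayList<>();
        for (int j = 0; mask != 0; j++, mask >>>= 1) {
            if ((mask & 1L) != 0) {
                out.add(j);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 3;
        // Looping k from 1 to n lists the non-empty subsets ordered by size.
        for (int k = 1; k <= n; k++) {
            for (long mask : kSubsets(n, k)) {
                System.out.println(indices(mask));
            }
        }
    }
}
```

Looping k from 1 to n, as main does, lists all non-empty subsets grouped by size, which seems to be what the title's "Bankers sequence" refers to; the empty subset is left out, just as the loop in the question starts at i = 1.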
References
If this is not efficient enough, I highly recommend Knuth's Volume 4 of The Art of Computer Programming, where he deals with the problem extensively. It's probably the best reference out there (and fairly recent!).
You might even be able to find a draft of the fascicle, TAOCP Volume 4 Fascicle 3, Generating All Combinations and Partitions (2005), vi+150pp. ISBN 0-201-85394-9, on Knuth's homepage (see his news for 2011 or so).

Related

Given a binary string of all 0's, convert it to the target string

Given a binary string that represents the target state, find the minimum number of flips needed to convert a binary string of the same size (with all 0's) to the target state. Flipping a bit also flips all the bits to its right.
e.g.
Input : 00101 (Represents Target)
Output : 3
Explanation :
00000 -> 00111 -> 00100 -> 00101
Two observations:
1. Flips are commutative. You'll get the same result regardless of what order you do them in.
2. At some point you have to flip the most significant bit that doesn't match.
This gives us a handy greedy argument. We will always get the optimal solution by flipping the leftmost bit that needs to be flipped. At some point we have to flip that bit, and the order doesn't matter so we might as well do it first.
Implementing this to be O(N) can be tricky - if we flip everything naively we end up with an O(N) flip which gives an O(N^2) solution. We can note that in determining the true value of the current bit, we only care about the number of flips that have already occurred. If this number is odd then the value of that bit is flipped. Otherwise it is unchanged.
We can then make a final observation to make life a lot easier:
Flips cancel each other out. Instead of asking how many flips it takes to get from 0 to the target, let's ask how many flips it takes to get from the target to 0. Whenever the true value of a bit is not equal to zero, we simply add a flip.
Pseudocode:
result = 0
// most to least significant
for bit in bits:
    if result%2 == 0:
        if bit != 0: result += 1
    else:
        if bit == 0: result += 1
print(result)
And to be more succinct:
bits = [0, 0, 1, 0, 1]
result = 0
for bit in bits: result += (result%2)^bit
print(result)
Output:
3
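One way to read the succinct loop: after each step, result%2 equals the bit that was just processed, so a flip is counted exactly when the current bit differs from the previous one (with an imaginary 0 in front of the string). A minimal Java sketch of that reading follows; the names FlipCount and minFlips are made up for illustration, and the input is assumed to contain only the characters '0' and '1'.

```java
class FlipCount {
    // Counts positions where the bit differs from the bit before it,
    // treating the position before the string as '0'.
    static int minFlips(String target) {
        int flips = 0;
        char previous = '0';
        for (int i = 0; i < target.length(); i++) {
            char bit = target.charAt(i);
            if (bit != previous) {
                flips++;        // a change of value means one more flip is needed
            }
            previous = bit;
        }
        return flips;
    }

    public static void main(String[] args) {
        System.out.println(minFlips("00101")); // the example from the question
    }
}
```

For the example target 00101 the value changes three times (0 to 1, 1 to 0, 0 to 1), which agrees with the expected output of 3.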
Input : 00101 (Represents Target)
Output : 3
Explanation :
00000 -> 00111 -> 00100 -> 00101
static int minFlip(String arr) {
    char[] bits = arr.toCharArray();
    // start from an all-zero string of the same length as the target
    char[] original = new char[bits.length];
    java.util.Arrays.fill(original, '0');
    int result = 0;
    for (int i = 0; i < bits.length; i++) {
        if (bits[i] != original[i]) {
            // flip bit i and every bit to its right
            for (int j = i; j < original.length; j++) {
                original[j] = (original[j] == '0') ? '1' : '0';
            }
            result++;
        }
    }
    return result;
}
Thanks, @Primusa, for the answer. Below is the code in Java for @Primusa's answer:
public static int getFlipsCount(String s) {
    int result = 0;
    for (int i = 0; i < s.length(); i++) {
        int bit = s.charAt(i) - '0'; // '0' -> 0, '1' -> 1
        result += (result % 2) ^ bit;
    }
    return result;
}
In case someone wonders how to do it in C++, here's a minimal solution:
#include <iostream>
#include <string>
using namespace std;

int theFinalProblem(const string& target)
{
    int result = 0;
    for (size_t i = 0; i < target.size(); i++)
        // flip whenever the parity of the flips so far differs from the target bit
        if (result % 2 != target[i] - '0')
            ++result;
    return result;
}

int main() {
    cout << theFinalProblem("1010");
    return 0;
}

How to find the period in which the most calls were made, from a file

I need to create a Java project with the following:
I have an input file that holds call ranges in milliseconds over the whole year; the file is sorted by start time.
For example:
0-100 (call started at 0, ended at 100)
1-100
5-50
60-150
65-180
I need to return the period in the year when the network was the busiest.
In this example the result will be 65-100, because in this period there were 4 calls in the air.
I have limited heap memory (8 GB) and must run over the file only once.
What is the best logic to use here (in Java)?
I think the best way to approach this problem would be to define the smallest unit of call time. For example, in the given situation, you can say that 1 millisecond is the smallest unit. Now you start with an array-like data structure that contains one cell for each unit of time. In the given example you have values from 0 to 180, so we will have an array of length 181, indexed from 0 to 180.
Now, as you encounter the calls, you increment the relevant cells. For example, when you see a call 0-100, you increment cells 0 through 100. Similarly, you go through each of the calls. For efficiency's sake, you can also cache the indexes of the largest counts so that you can reach them quickly.
In your example, the array would look like below.
12222333333333333333333333333333333333333333333333
32222222223333344444444444444444444444444444444444
42222222222222222222222222222222222222222222222222
2111111111111111111111111111111
Here is a sample of C++ code that could compute the above array.
#include <algorithm>
#include <climits>
#include <iostream>
#include <vector>

struct Call {
    Call(int start, int end) : start(start), end(end) {}
    int start;
    int end;
};

int main(int argc, char* argv[]) {
    const Call calls[] = { Call(0, 100),
                           Call(1, 100),
                           Call(5, 50),
                           Call(60, 150),
                           Call(65, 180) };
    const int callCount = sizeof(calls) / sizeof(calls[0]);
    int minimum = INT_MAX;
    int maximum = INT_MIN;
    for (int i = 0; i < callCount; ++i) {
        minimum = std::min(minimum, calls[i].start);
        maximum = std::max(maximum, calls[i].end);
    }
    // one zero-initialized counter per millisecond in [0, maximum]
    std::vector<int> ongoingCalls(maximum + 1, 0);
    for (int i = 0; i < callCount; ++i) {
        for (int j = calls[i].start; j <= calls[i].end; ++j) {
            ++(ongoingCalls[j]);
        }
    }
    // print the counters, 50 per line
    for (int i = 0; i < maximum + 1; ++i) {
        if (i % 50 == 0) {
            std::cout << std::endl;
        }
        std::cout << ongoingCalls[i];
    }
}
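Because the question asks for Java, the same counting-array idea might be sketched as follows, extended with one extra scan that reports the first longest-running stretch with the highest count. The names BusiestPeriod and busiest, and the {start, end, count} result layout, are assumptions made for illustration. Note also that a full year is about 31.5 billion milliseconds, which is more than a single Java array can hold, so this literal version only fits small time ranges like the example.

```java
class BusiestPeriod {
    // calls[i] = {start, end}, both inclusive and non-negative.
    // Returns {periodStart, periodEnd, callCount} for the first stretch
    // of consecutive time units that has the highest number of ongoing calls.
    static int[] busiest(int[][] calls) {
        int maximum = 0;
        for (int[] call : calls) {
            maximum = Math.max(maximum, call[1]);
        }
        // one counter per time unit, as described in the answer
        int[] ongoing = new int[maximum + 1];
        for (int[] call : calls) {
            for (int t = call[0]; t <= call[1]; t++) {
                ongoing[t]++;
            }
        }
        int best = 0, bestStart = 0, bestEnd = 0;
        for (int t = 0; t <= maximum; t++) {
            if (ongoing[t] > best) {            // new highest count: start a new period
                best = ongoing[t];
                bestStart = t;
                bestEnd = t;
            } else if (ongoing[t] == best && bestEnd == t - 1) {
                bestEnd = t;                    // same count, directly adjacent: extend
            }
        }
        return new int[] {bestStart, bestEnd, best};
    }

    public static void main(String[] args) {
        int[][] calls = {{0, 100}, {1, 100}, {5, 50}, {60, 150}, {65, 180}};
        int[] result = busiest(calls);
        System.out.println(result[0] + "-" + result[1] + " with " + result[2] + " calls");
    }
}
```

On the example from the question this reports the period 65-100 with 4 calls, matching the expected answer.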

Why is my sorting algorithm faster than the one in "Introduction to Algorithms" by H.Cormen book?

I've been studying this book since yesterday, and after I understood and applied the first algorithm, I tried to go my own way and look at it differently. Here, in Java, is the algorithm shown in the book:
public static int[] sort(int[] array)
{
for(int i = 1; i < array.length; i++){
int value = array[i];
int j = i - 1;
while(j >= 0 && array[j] > value){
array[j + 1] = array[j];
j--;
}
array[j+1] = value;
}
return array;
}
And here is mine :
public static int[] sortb(int[] array)
{
for(int i = 0; i < array.length; i++){
int value = array[i];
int j = i;
while(j < array.length && value > array[j]){
array[j] = array[j + 1];
j++;
}
array[j] = value;
}
return array;
}
For 1 million function calls each, I got 32 ms for the first and 25 ms for the second. I'm just beginning with algorithms, so I have no idea what this means.
I found why your sort is so much faster than the original one: you are not sorting at all.
In your code
int value = array[i];
int j = i;
while(j < array.length && value > array[j]) { ... }
Because j = i, value == array[j] before you enter the while loop, so the condition value > array[j] is false and the loop body never executes. Your sort result will be wrong. That's the main reason why your code is so much faster.
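A quick way to confirm this is to run both methods on the same unsorted input and compare the results. The sketch below copies the two methods from the question unchanged; the class name SortCheck and the sample data are made up for illustration.

```java
import java.util.Arrays;

class SortCheck {
    // Insertion sort as shown in the question (the book's version).
    static int[] sort(int[] array) {
        for (int i = 1; i < array.length; i++) {
            int value = array[i];
            int j = i - 1;
            while (j >= 0 && array[j] > value) {
                array[j + 1] = array[j];
                j--;
            }
            array[j + 1] = value;
        }
        return array;
    }

    // The asker's variant: value > array[j] is never true when j == i,
    // so the while body never runs and the array is left as it was.
    static int[] sortb(int[] array) {
        for (int i = 0; i < array.length; i++) {
            int value = array[i];
            int j = i;
            while (j < array.length && value > array[j]) {
                array[j] = array[j + 1];
                j++;
            }
            array[j] = value;
        }
        return array;
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 6, 1, 3};
        System.out.println(Arrays.toString(sort(data.clone())));
        System.out.println(Arrays.toString(sortb(data.clone())));
    }
}
```

The first line printed is the sorted array, while the second is the input in its original order, which shows that sortb does no sorting work at all.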
In my experience (read: a student's experience), differences of this size have little meaning.
Maybe a background process took or released some resources between the two runs.
Maybe the specific case you tried to sort was better for one of the algorithms than for the other.
Maybe, if you used different random arrays, one of them was closer to being sorted than the other.
To get good measurements, you usually have to run a lot of tests, not only one. For example, generate 1k arrays of 10k elements each and sort each of these arrays with both algorithms.
Anyway, sometimes specific features of a language or a compiler can produce different results for algorithms with theoretically the exact same complexity (one example: I once noticed in C++ that if you traverse a 2-dimensional array first by columns and then by rows, you get a very different speed than if you do it the other way around; but I don't remember which one was faster tbh).

How would you find how many times one array is repeated in another one?

For example, if you were given {1,2} as the small array and {1,2,3,4,1,2,1,3} as the big one, then it would return 2.
This is probably horribly incorrect:
public static int timesOccur(int[] small, int big[]) {
int sum= 0;
for (int i=0; i<small.length; i++){
int currentSum = 0;
for (int j=0; j<big.length; j++){
if (small[i] == big[j]){
currentSum ++;
}
sum= currentSum ;
}
}
return sum;
}
As @AndyTurner mentioned, your task can be reduced to a well-known family of string matching algorithms.
As I understand it, you want a solution faster than O(n * m).
There are two main approaches. The first involves preprocessing the text (long array), the second involves preprocessing the search pattern (small array).
Preprocessing the text. By this I mean creating a suffix array or LCP array from your longer array. Once this data structure is constructed, you can perform a binary search to find your substring. The most efficient time you can achieve is O(n) to build the LCP array and O(m + log n) to perform the search. So the overall time is O(n + m).
Preprocessing the pattern. This means constructing a DFA from the pattern. Once the DFA is constructed, it takes one traversal of the string (long array) to find all occurrences of the substring (linear time). The hardest part here is to construct the DFA. Knuth-Morris-Pratt does this in O(m) time, so the overall running time will be O(m + n). Actually, the KMP algorithm is most probably the best available solution for this task in terms of efficiency and implementation complexity. Check @JuanLopes's answer for a concrete implementation.
You can also consider an optimized brute force, for example Boyer-Moore; it is good for practical cases, but it has O(n * m) running time in the worst case.
UPD:
In case you don't need fast approaches, I corrected your code from description:
public static int timesOccur(int[] small, int big[]) {
int sum = 0;
for (int i = 0; i < big.length - small.length + 1; i++) {
int j = 0;
while (j < small.length && small[j] == big[i + j]) {
j++;
}
if (j == small.length) {
sum++;
}
}
return sum;
}
Pay attention to the inner while loop. It stops as soon as elements don't match. It's an important optimization, as it makes the running time almost linear in the best case.
upd2: inner loop explanation.
The purpose of the inner loop is to find out whether the smaller array matches the bigger array starting from position i. To perform that check, index j is iterated from 0 to the length of the smaller array, comparing element j of the smaller array with the corresponding element i + j of the bigger array. The loop proceeds while both conditions are true at the same time: j < small.length and the corresponding elements of the two arrays match.
So the loop stops in two situations:
1. j < small.length is false. This means that j==small.length. It also means that for all j=0..small.length-1 the elements of the two arrays matched (otherwise the loop would have stopped earlier, see (2) below).
2. small[j] == big[i + j] is false. This means that no match was found. In this case the loop stops before j reaches small.length.
After the loop it's sufficient to check whether j==small.length to know which condition made the loop stop, and hence whether a match was found for the current position i.
This is a simple subarray matching problem. In Java you can use Collections.indexOfSubList, but you would have to box all the integers in your array. An alternative is to implement your own array matching algorithm. There are several options; most string searching algorithms can be adapted to this task.
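For completeness, the boxed variant could look roughly like the sketch below. It relies on the standard Collections.indexOfSubList(source, target) method, which returns the first index of target in source or -1; the class SubListCount and its helper box are made-up names, overlapping occurrences are counted, and an empty small array is treated as occurring zero times (an assumption, since the question does not say).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SubListCount {
    // Boxes an int[] into a List<Integer> so the Collections API can be used.
    static List<Integer> box(int[] values) {
        List<Integer> list = new ArrayList<>(values.length);
        for (int v : values) {
            list.add(v);
        }
        return list;
    }

    // Counts (possibly overlapping) occurrences of small inside big.
    static int timesOccur(int[] small, int[] big) {
        if (small.length == 0) {
            return 0; // assumption: an empty pattern is not counted
        }
        List<Integer> needle = box(small);
        List<Integer> haystack = box(big);
        int count = 0;
        int from = 0;
        while (from <= haystack.size() - needle.size()) {
            // search only the part of the big list that starts at 'from'
            int idx = Collections.indexOfSubList(haystack.subList(from, haystack.size()), needle);
            if (idx < 0) {
                break;
            }
            count++;
            from += idx + 1; // continue one position after the match start
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(timesOccur(new int[] {1, 2}, new int[] {1, 2, 3, 4, 1, 2, 1, 3}));
        System.out.println(timesOccur(new int[] {1, 1}, new int[] {1, 1, 1}));
    }
}
```

Both calls in main print 2. This version is short, but it pays for the boxing up front.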
Here is an optimized version based on the KMP algorithm. In the worst case it will be O(n + m), which is better than the trivial algorithm. But it has the downside of requiring extra space to compute the failure function (F).
public class Main {
public static class KMP {
private final int F[];
private final int[] needle;
public KMP(int[] needle) {
this.needle = needle;
this.F = new int[needle.length + 1];
F[0] = 0;
F[1] = 0;
int i = 1, j = 0;
while (i < needle.length) {
if (needle[i] == needle[j])
F[++i] = ++j;
else if (j == 0)
F[++i] = 0;
else
j = F[j];
}
}
public int countAt(int[] haystack) {
int count = 0;
int i = 0, j = 0;
int n = haystack.length, m = needle.length;
while (i - j <= n - m) {
while (j < m) {
if (needle[j] == haystack[i]) {
i++;
j++;
} else break;
}
if (j == m) count++;
else if (j == 0) i++;
j = F[j];
}
return count;
}
}
public static void main(String[] args) {
System.out.println(new KMP(new int[]{1, 2}).countAt(new int[]{1, 2, 3, 4, 1, 2, 1, 3}));
System.out.println(new KMP(new int[]{1, 1}).countAt(new int[]{1, 1, 1}));
}
}
Rather than posting a solution I'll provide some hints to get you moving.
It's worth breaking the problem down into smaller pieces. In general your algorithm should look like:
for each position in the big array
    check if the small array matches that position
    if it does, increment your counter
The smaller piece is then checking if the small array matches a given position:
first check if there's enough room to fit the smaller array
    if not then the arrays don't match
otherwise for each position in the smaller array
    check if the values in the arrays match
        if not then the arrays don't match
if you get to the end of the smaller array and they have all matched
    then the arrays match
Though not thoroughly tested, I believe this is a solution to your problem. I would highly recommend using Sprinter's pseudocode to try and figure this out yourself before using this.
public static void main(String[] args)
{
    int[] smallArray = {1,1};
    int[] bigArray = {1,1,1};
    int sum = 0;
    for(int i = 0; i < bigArray.length; i++)
    {
        boolean flag = true;
        if(bigArray[i] == smallArray[0])
        {
            for(int x = 0; x < smallArray.length; x++)
            {
                if(i + x >= bigArray.length)          // ran off the end of the big array
                    flag = false;
                else if(bigArray[i + x] != smallArray[x])
                    flag = false;
            }
            if(flag)
                sum += 1;
        }
    }
    System.out.println(sum);
}

Finding smallest element in an integer array in java using divide and conquer algorithm

I tried to find the smallest element in an integer array using what I understood about the divide and conquer algorithm.
I am getting correct results.
But I am not sure if it is a conventional way of using the divide and conquer algorithm.
If there is a smarter way of implementing divide and conquer than what I have tried, please let me know.
public static int smallest(int[] array){
int i = 0;
int array1[] = new int[array.length/2];
int array2[] = new int[array.length - (array.length/2)];
for(int index = 0; index < array.length/2 ; index++){
array1[index] = array[index];
}
for(int index = array.length/2; index < array.length; index++){
array2[i] = array[index];
i++;
}
if(array.length > 1){
if(smallest(array1) < smallest(array2)){
return smallest(array1);
}else{
return smallest(array2);
}
}
return array[0];
}
Your code is correct, but you can write less code using existing functions like Arrays.copyOfRange and Math.min:
public static int smallest(int[] array) {
if (array.length == 1) {
return array[0];
}
int array1[] = Arrays.copyOfRange(array, 0, array.length / 2);
int array2[] = Arrays.copyOfRange(array, array.length / 2, array.length);
return Math.min(smallest(array1), smallest(array2));
}
Another point: testing for length == 1 at the beginning is the more readable version. Functionally it is identical. From a performance point of view it creates fewer arrays, because it exits from the smallest function as soon as possible.
It is also possible to use a different form of recursion where it is not necessary to create new arrays.
private static int smallest(int[] array, int from, int to) {
if (from == to) {
return array[from];
}
int middle = from + (to - from) / 2;
return Math.min(smallest(array, from, middle), smallest(array, middle + 1, to));
}
public static int smallest(int[] array){
return smallest(array, 0, array.length - 1);
}
This second version is more efficient because it doesn't create new arrays.
I don't see any use for divide and conquer in this particular program.
Either way you search the whole array from 1 to N, only in two steps:
1. 1 to N / 2
2. N / 2 + 1 to N
This is equivalent to 1 to N.
Also, your program performs a few additional checks after the loops that aren't required when you do it directly.
int min = a[0];
for (int i = 1; i < a.length; i++)
    if (a[i] < min)
        min = a[i];
This is considered the most efficient way of finding the minimum value.
When do I use divide and conquer
A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems, until these become simple enough to be solved directly.
Consider the Merge Sort Algorithm.
Here, we divide the problem step by step until we get smaller problems, and then we combine their solutions to sort the whole input. In this case this is considered optimal: a simple quadratic sort runs in O(n * n), while merge sort runs in O(n log n).
But for finding the minimum, the direct loop is already O(n), so divide and conquer gains nothing here.
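To make the last point concrete, the sketch below counts element comparisons for a direct scan and for an index-based divide and conquer version that does not copy arrays. The class MinComparisons and its static counter are made up for illustration; the sample data is the array used elsewhere in this thread.

```java
class MinComparisons {
    static int comparisons; // incremented once per element comparison

    // Direct scan: one comparison per element after the first.
    static int linearMin(int[] a) {
        int min = a[0];
        for (int i = 1; i < a.length; i++) {
            comparisons++;
            if (a[i] < min) {
                min = a[i];
            }
        }
        return min;
    }

    // Divide and conquer on the index range [from, to], no array copies.
    static int divideMin(int[] a, int from, int to) {
        if (from == to) {
            return a[from];
        }
        int middle = from + (to - from) / 2;
        int left = divideMin(a, from, middle);
        int right = divideMin(a, middle + 1, to);
        comparisons++; // one comparison to combine the two halves
        return Math.min(left, right);
    }

    public static void main(String[] args) {
        int[] data = {4, -2, 8, 3, 56, 34, 67, 84};
        comparisons = 0;
        int direct = linearMin(data);
        System.out.println(direct + " found with " + comparisons + " comparisons");
        comparisons = 0;
        int divided = divideMin(data, 0, data.length - 1);
        System.out.println(divided + " found with " + comparisons + " comparisons");
    }
}
```

Both versions need n - 1 comparisons for n elements (7 for this array), so the recursion only adds call overhead without reducing the work.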
Divide And Conquer
The book
Data Structures and Algorithm Analysis in Java, 2nd edition, Mark Allen Weiss
says that a D&C algorithm should have two disjoint recursive calls, like QuickSort. The above algorithm does not have this, even if it can be implemented recursively.
What you did here is correct. But there are more efficient ways of solving this problem, which I'm sure you're aware of.
Although a divide and conquer algorithm can be applied to this problem, it is better suited to complex problems, or to understanding a difficult problem by dividing it into smaller fragments. One prime example would be 'Tower of Hanoi'.
As far as your code is concerned, it is correct. Here's another copy of same code-
public class SmallestInteger {
public static void main(String[] args) {
int small ;
int array[] = {4,-2,8,3,56,34,67,84} ;
small = smallest(array) ;
System.out.println("The smallest integers is = " + small) ;
}
public static int smallest(int[] array) {
int array1[] = new int[array.length/2];
int array2[] = new int[array.length - (array.length/2)];
for (int index = 0; index < array.length/2 ; index++) {
array1[index] = array[index];
}
for (int index = array.length/2; index < array.length; index++) {
array2[index - array.length/2] = array[index] ;
}
if (array.length > 1) {
if(smallest(array1) < smallest(array2)) {
return smallest(array1) ;
}
else {
return smallest(array2) ;
}
}
return array[0] ;
}
}
Result came out to be-
The smallest integers is = -2
