Finding higher values from arrays that are all closer than predefined distances - java

I have arrays a1 to an, each containing m elements. I have another symmetric n x n matrix b containing the maximum allowed index distance for each pair of arrays. I want to select one element from each array, x1 to xn, subject to the following constraints. (a1 is an array and x1 a single value taken from a1.)
For every xi (which was originally aiu) and xj (which was originally ajv), where i is not the same as j, and u and v are the original array indices, we have |u - v| <= bij.
The total sum of x1 to xn is the maximum over all such selections.
An example
a1 = [1, 2, 3, 8, -1, -1, 0, -1]
a2 = [1, 2, 4, 0, -1, 1, 10, 11]
b = |0, 2|
|2, 0|
The selected values are x1 = 8 and x2 = 4. One can notice that we didn't select 10 or 11 from the second array, because the best value within distance 2 of either of them in the first array is just 0.
Now when I have only two arrays I can do the following in Java in O(m^2) time, I guess, and find the maximum sum, which is 12 in this case. How can I achieve a better solution for more than 2 arrays?
int[][] a = new int[][]{{1, 2, 3, 8, -1, -1, 0, -1}, {1, 2, 4, 0, -1, 1, 10, 11}};
int[][] b = new int[][]{{0, 2}, {2, 0}};
int maxVal = Integer.MIN_VALUE;
for (int i = 0; i < a[0].length; i++) {
    // j covers every index within b[0][1] of i, both ends inclusive
    int from = Math.max(i - b[0][1], 0);
    int to = Math.min(a[1].length - 1, i + b[0][1]);
    for (int j = from; j <= to; j++) {
        maxVal = Math.max(maxVal, a[0][i] + a[1][j]);
    }
}
System.out.println("The max val: " + maxVal);

You can't use dynamic programming here, because there is no optimal substructure: the b_1n entry can ruin a highly valuable path from x_1 to x_{n-1}. So it's probably hard to avoid exponential time in general. However, for a set of b_ij that do reasonably restrict the choices, there is a straightforward backtracking approach that should have reasonable performance:
At each step, a value has been selected from some of the a_i, but no choice has yet been made from the others. (The arrays selected need not be a prefix of the list, or even contiguous.)
If a choice has been made for every array, return (from this recursive call) the score obtained.
Consider, for each pair of a chosen array and a remaining array, the interval of indices available for selection in the latter given the restriction on distance from the choice made in the former.
Intersect these intervals for each remaining array. If any intersection is empty, reject this proposed set of choices and backtrack.
Otherwise, select the remaining array with the smallest set of choices available. Add each choice to the set of proposed choices and recurse. Return the best score found and the choice made to obtain it, if any, or reject and backtrack.
The identification of the most-constrained array is critical to performance: it constitutes a form of fuzzy belief propagation, efficiently pruning future choices incompatible with present choices necessitated by prior choices. Depending on the sort of input you expect, there might be value in doing further prioritization/pruning based on achievable scores.
My 35-line Python implementation, given a 10x10 random matrix of small integers and b_ij a constant 2, ran in a few seconds. b_ij=3 (which allows up to 7 of the 10 values for each pair of arrays!) took about a minute.
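The steps above can be sketched in Java roughly as follows. This is only a minimal illustration, not the Python implementation mentioned above; the class and method names (ConstrainedPick, solve, search) are made up for this example, and it assumes b is symmetric and every array is non-empty.

```java
import java.util.Arrays;

class ConstrainedPick {
    private final int[][] a;    // a[i][u]: value at index u of array i
    private final int[][] b;    // b[i][j]: largest allowed |u - v| between the picks from arrays i and j
    private final int[] chosen; // chosen[i]: index picked in array i, or -1 if none yet
    private long best = Long.MIN_VALUE;

    private ConstrainedPick(int[][] a, int[][] b) {
        this.a = a;
        this.b = b;
        this.chosen = new int[a.length];
        Arrays.fill(chosen, -1);
    }

    /** Returns the best achievable sum, or Long.MIN_VALUE if no valid selection exists. */
    static long solve(int[][] a, int[][] b) {
        ConstrainedPick s = new ConstrainedPick(a, b);
        s.search(0, 0L);
        return s.best;
    }

    private void search(int picked, long score) {
        if (picked == a.length) {          // a choice has been made for every array
            best = Math.max(best, score);
            return;
        }
        int target = -1, targetLo = 0, targetHi = 0;
        for (int i = 0; i < a.length; i++) {
            if (chosen[i] >= 0) continue;
            // intersect the index intervals imposed by every array chosen so far
            int lo = 0, hi = a[i].length - 1;
            for (int j = 0; j < a.length; j++) {
                if (chosen[j] < 0) continue;
                lo = Math.max(lo, chosen[j] - b[i][j]);
                hi = Math.min(hi, chosen[j] + b[i][j]);
            }
            if (lo > hi) return;           // empty intersection: reject and backtrack
            if (target < 0 || hi - lo < targetHi - targetLo) {
                target = i;                // most-constrained remaining array so far
                targetLo = lo;
                targetHi = hi;
            }
        }
        for (int u = targetLo; u <= targetHi; u++) {
            chosen[target] = u;
            search(picked + 1, score + a[target][u]);
        }
        chosen[target] = -1;               // undo before returning to the caller
    }
}
```

Calling ConstrainedPick.solve with the two arrays and the matrix b from the question gives 12. The sketch does no score-based pruning, so the worst case is still exponential, as noted above.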

Related

Calculating all 1-off combinations of a set of numbers

The North Carolina Lottery offers several draw games, two of which are Pick 3 and Pick 4. You pick 3 or 4 digits, respectively, between 0 and 9 (inclusive), and the numbers can repeat (e.g., 9-9-9 is a valid combination). I'll use Pick 3 for this example, because it's easier to work with, but I am trying to make this a generic solution to work with any number of numbers.
One of the features of Pick 3 and Pick 4 is "1-OFF," which means you win a prize if at least one of the numbers drawn is 1 up or 1 down from the numbers you have on your ticket.
For example, let's say you played Pick 3 and you picked 5-5-5 for your numbers. At least one number must be 1-off in order to win (so 5-5-5 does not win any prize, if you played the game this way). Winning combinations would be:
1 Number   2 Numbers   3 Numbers
--------   ---------   ---------
4-5-5      4-4-5       4-4-4
5-4-5      5-4-4       6-6-6
5-5-4      4-5-4       4-4-6
6-5-5      6-6-5       4-6-6
5-6-5      5-6-6       4-6-4
5-5-6      6-5-6       6-4-4
           4-5-6       6-6-4
           6-5-4       6-4-6
           6-4-5
           5-6-4
           5-4-6
           4-6-5
(I think that's all the combinations, but you get the idea).
The most "efficient" solution I could come up with is to have arrays that define which numbers are altered, and how:
int[][] alterations = {
// 1 digit
{-1, 0, 0}, {0, -1, 0}, {0, 0, -1}, {1, 0, 0}, {0, 1, 0}, {0, 0, 1},
// 2 digits
{-1, -1, 0}, ...
};
And then modify the numbers according to each of the alteration arrays:
int[] numbers = {5, 5, 5};
for(int i = 0; i < alterations.length; i++) {
int[] copy = Arrays.copyOf(numbers, numbers.length);
for(int j = 0; j < alterations[i].length; j++) {
// note: this logic does not account for the numbers 0 and 9:
// 1 down from 0 translates to 9, and 1 up from 9 translates
// to 0, but you get the gist of how this is supposed to work
copy[j] += alterations[i][j];
}
printArray(copy);
}
...
private static void printArray(int[] a) {
String x = "";
for(int i : a)
x += i + " ";
System.out.println(x.trim());
}
But I'm wondering if there's a better way to do this. Has anyone come across something like this and has any better ideas?
Sounds like you're looking for backtracking, since constructing the alterations array by hand is quite tedious. In your backtracking algorithm you'd construct your candidates, apply the alteration, and check whether the resulting combination is valid; if so, you'd print it. I suggest you read Chapter 7 of Steven Skiena's The Algorithm Design Manual for some background on backtracking and how it can be applied to a combinatorial problem.
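As a rough illustration of that idea, the sketch below builds each candidate digit by digit instead of listing the alteration arrays by hand. The names (OneOff, combinations, build) are invented for this example, and it assumes the wrap-around rule from the question's code comment (1 down from 0 is 9, 1 up from 9 is 0).

```java
import java.util.ArrayList;
import java.util.List;

class OneOff {
    /** Every combination with each digit shifted by -1, 0 or +1, excluding the unchanged original. */
    static List<int[]> combinations(int[] numbers) {
        List<int[]> out = new ArrayList<>();
        build(numbers, new int[numbers.length], 0, false, out);
        return out;
    }

    private static void build(int[] numbers, int[] current, int pos,
                              boolean changed, List<int[]> out) {
        if (pos == numbers.length) {
            if (changed) out.add(current.clone()); // skip the all-zero alteration
            return;
        }
        for (int delta = -1; delta <= 1; delta++) {
            // floorMod wraps -1 to 9 and 10 to 0
            current[pos] = Math.floorMod(numbers[pos] + delta, 10);
            build(numbers, current, pos + 1, changed || delta != 0, out);
        }
    }
}
```

For 5-5-5 this yields 3^3 - 1 = 26 combinations, matching the 6 + 12 + 8 entries in the table, and it works unchanged for Pick 4 or any other length.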

Java Recursion to find MAX sum in array

I am trying to figure out a solution to calculate the highest sum of numbers in an array. However, my limitation is that I cannot use adjacent values in the array.
If I am given the array int [] blocks = new int[] {15, 3, 6, 17, 2, 1, 20}; the highest sum calculated is 52 (15+17+20).
My goal is to go from a recursive solution to a solution that uses dynamic programming, however, I am having trouble with the recursive solution.
The base cases that I have initialized:
if(array.length == 0)
return 0;
if(array.length == 1)
return array[0];
After creating the base cases, I am unsure of how to continue the recursive process.
I initially tried to say that if the array was of certain length, then I can calculate the max(Math.max) of the calculations:
e.g. if array.length = 3
return Math.max(array[0], array[1], array[2], array[0]+ array[2])
The problem I then run into is that I could be given an array of length 100.
How can I use recursion in this problem?
I think a recursive solution to your problem is (in pseudocode):
maxsum(A,0) = 0
maxsum(A,1) = A[0]
maxsum(A,k) = max(maxsum(A,k-2)+A[k-1], maxsum(A,k-1)), for k >= 2
Where maxsum(A,k) means the maximal sum for a subarray of array A starting from 0 and having length k. I'm sure you'll easily translate that into Java, it translates almost literally.
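A possible Java rendering of that recurrence is shown below, first as plain recursion and then bottom-up, which is the dynamic-programming form the question is aiming for. The class and method names are placeholders chosen for this sketch.

```java
class MaxNonAdjacent {
    /** Direct translation of the recurrence: best sum using only the first k elements of a. */
    static int maxSumRecursive(int[] a, int k) {
        if (k == 0) return 0;
        if (k == 1) return a[0];
        return Math.max(maxSumRecursive(a, k - 2) + a[k - 1],
                        maxSumRecursive(a, k - 1));
    }

    /** Same recurrence evaluated bottom-up, keeping only the last two results. */
    static int maxSumDp(int[] a) {
        if (a.length == 0) return 0;
        int prev2 = 0;      // maxsum(A, k-2)
        int prev1 = a[0];   // maxsum(A, k-1)
        for (int k = 2; k <= a.length; k++) {
            int cur = Math.max(prev2 + a[k - 1], prev1);
            prev2 = prev1;
            prev1 = cur;
        }
        return prev1;
    }
}
```

For the blocks array from the question, both maxSumRecursive(blocks, blocks.length) and maxSumDp(blocks) give 52. The recursive version recomputes the same subproblems many times, so only the bottom-up one is practical for an array of length 100.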

Why to return low, but not high in this binary search?

Given an unsorted array nums containing n + 1 integers where each
integer is between 1 and n (inclusive), prove that at least one
duplicate number must exist. Assume that there is only one duplicate
number, find the duplicate one.
Note:
You must not modify the array (assume the array is read only).
You must use only constant, O(1) extra space.
There is only one duplicate number in the array, but it could be repeated more than once.
For note 1, we cannot sort the array, for note 2, we cannot use hashing. I think we can use binary search here.
Suppose we have this array with duplicate number 4:
[1, 4(was 2), 3, 4, 5, 6, 4(was 7), 8, 9, 4]
The idea is we are looking at the array through a range filter (like [7,9]), 2 cases could happen:
Case 1: The range contains the duplicated element. In that case, the number of elements we can find in the filter must be larger than the number of elements it should have. For example, if we look at [3, 4], we will find 5 elements. If no duplication occurred, there should be only two: [3, 4].
This is true because some other elements could be renamed into this group, but none can be renamed out. In this case, the expected elements are [3, 4], but we have one extra 4 (as the duplicate) and then two 4s renamed in; that's why we have 5.
Case 2: The range does not contain the duplicated element. In that case, the number of elements we can find in the filter must be equal to or less than the number of elements it should have.
Below is my updated code. I wasn't sure which one to return at last line. Though I tested and found low is correct, I still don't know the reason.
public int findDuplicate(int[] nums) {
int low = 1, high = nums.length - 1;
while(low <= high){
int mid = low + (high - low) / 2;
int count = 0;
//count the number of elements in the filter [low,mid]
for(int i = 0; i < nums.length; i++){
if(nums[i] <= mid && nums[i]>=low){
count++;
}
}
if(count > mid-low+1){ //the duplicate would be in the left half
high = mid;
} else { //the duplicate would be in the right half
low = mid + 1;
}
}
return low; // Why we should return low here, not high?
}
It's not clear why you think you must return low instead of high. I suspect you didn't test this with many different kinds of inputs. For input 1, 1, 2, both high and low will be 0. Whether you return high or low, the answer will be incorrect.
In other words:
the implementation doesn't solve the problem correctly and gives an incorrect result
the question "to return low or high" is the wrong question to ask
The explanation of your algorithm sounds about right. The problem is, you haven't actually implemented what you explained there. You talk about counting elements within a range, adjusting the lower and upper bounds of the range as you go, but in your implementation, you count nums[i] <= mid, so only the upper bound changes (mid), the lower bound is always (implicitly) 0. This doesn't match your explanation. You did not implement your idea correctly.
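For comparison, here is a hedged sketch of one way the range-counting idea from the question could be implemented. It is not taken from the question or from this answer; it assumes the array holds n + 1 values in 1..n with exactly one duplicated value, and the class name FindDuplicate is made up.

```java
class FindDuplicate {
    static int findDuplicate(int[] nums) {
        int low = 1, high = nums.length - 1;
        // Invariant: more than (high - low + 1) elements have a value in [low, high],
        // so by the pigeonhole principle the duplicate lies in that range.
        while (low < high) {
            int mid = low + (high - low) / 2;
            int count = 0;
            for (int v : nums) {
                if (v >= low && v <= mid) count++;
            }
            if (count > mid - low + 1) {
                high = mid;      // too many values in [low, mid]: duplicate is there
            } else {
                low = mid + 1;   // otherwise the excess must be in [mid + 1, high]
            }
        }
        return low;              // low == high when the loop exits
    }
}
```

With the loop condition low < high, the range shrinks on every iteration and the loop ends with low equal to high, so the choice between returning low or high no longer matters.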

Find how many times each number between N and M can be expressed as a sum of a pair of primes

Consider this method:
public static int[] countPairs(int min, int max) {
int lastIndex = primes.size() - 1;
int i = 0;
int howManyPairs[] = new int[(max-min)+1];
for(int outer : primes) {
for(int inner : primes.subList(i, lastIndex)) {
int sum = outer + inner;
if(sum > max)
break;
if(sum >= min && sum <= max)
howManyPairs[sum - min]++;
}
i++;
}
return howManyPairs;
}
As you can see, I have to count how many times each number between min and max can be expressed as a sum of two primes.
primes is an ArrayList with all primes between 2 and 2000000. In this case, min is 1000000 and max is 2000000, that's why primes goes until 2000000.
My method works fine, but the goal here is to do something faster.
My method takes two loops, one inside the other, and it makes my algorithm an O(n²). It sucks like bubblesort.
How can I rewrite my code to accomplish the same result with a better complexity, like O(nlogn)?
One last thing: I'm coding in Java, but your reply can also be in Python, VB.Net, C#, Ruby, C or even just an explanation in English.
For each number x between min and max, we want to compute the number of ways x can be written as the sum of two primes. This number can also be expressed as
sum(prime(n)*prime(x-n) for n in xrange(x+1))
where prime(x) is 1 if x is prime and 0 otherwise. Instead of counting the number of ways that two primes add up to x, we consider all ways two nonnegative integers add up to x, and add 1 to the sum if the two integers are prime.
This isn't a more efficient way to do the computation. However, putting it in this form helps us recognize that the output we want is the discrete convolution of two sequences. Specifically, if p is the infinite sequence such that p[x] == prime(x), then the convolution of p with itself is the sequence such that
convolve(p, p)[x] == sum(p[n]*p[x-n] for n in xrange(x+1))
or, substituting the definition of p,
convolve(p, p)[x] == sum(prime(n)*prime(x-n) for n in xrange(x+1))
In other words, convolving p with itself produces the sequence of numbers we want to compute.
The straightforward way to compute a convolution is pretty much what you were doing, but there are much faster ways. For n-element sequences, a fast Fourier transform-based algorithm can compute the convolution in O(n*log(n)) time instead of O(n**2) time. Unfortunately, this is where my explanation ends. Fast Fourier transforms are kind of hard to explain even when you have proper mathematical notation available, and as my memory of the Cooley-Tukey algorithm isn't as precise as I'd like it to be, I can't really do it justice.
If you want to read more about convolution and Fourier transforms, particularly the Cooley-Tukey FFT algorithm, the Wikipedia articles I've just linked would be a decent start. If you just want to use a faster algorithm, your best bet would be to get a library that does it. In Python, I know scipy.signal.fftconvolve would do the job; in other languages, you could probably find a library pretty quickly through your search engine of choice.
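To make the convolution idea concrete for the Java code in the question, here is a self-contained sketch with a hand-rolled iterative FFT. It is only an illustration under several assumptions: the names (PrimePairConvolution, fft, countPrimePairs) are invented, the primes come from a simple sieve rather than the question's list, and the floating-point rounding has not been checked at the 2000000 scale. Note that the convolution counts ordered pairs, so the result is converted to unordered pairs (p <= q) to match what the nested loops in the question count.

```java
class PrimePairConvolution {
    /** In-place iterative Cooley-Tukey FFT; re.length must be a power of two. */
    static void fft(double[] re, double[] im, boolean invert) {
        int n = re.length;
        for (int i = 1, j = 0; i < n; i++) {          // bit-reversal permutation
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        for (int len = 2; len <= n; len <<= 1) {      // butterflies, doubling the block size
            double ang = 2 * Math.PI / len * (invert ? -1 : 1);
            double stepRe = Math.cos(ang), stepIm = Math.sin(ang);
            for (int i = 0; i < n; i += len) {
                double wRe = 1, wIm = 0;
                for (int k = 0; k < len / 2; k++) {
                    int u = i + k, v = i + k + len / 2;
                    double xRe = re[v] * wRe - im[v] * wIm;
                    double xIm = re[v] * wIm + im[v] * wRe;
                    re[v] = re[u] - xRe; im[v] = im[u] - xIm;
                    re[u] += xRe;        im[u] += xIm;
                    double nextRe = wRe * stepRe - wIm * stepIm;
                    wIm = wRe * stepIm + wIm * stepRe;
                    wRe = nextRe;
                }
            }
        }
        if (invert) {
            for (int i = 0; i < n; i++) { re[i] /= n; im[i] /= n; }
        }
    }

    /** counts[x] = number of unordered prime pairs p <= q with p + q == x, for 0 <= x <= max. */
    static long[] countPrimePairs(int max) {
        boolean[] isPrime = new boolean[max + 1];     // sieve of Eratosthenes
        for (int i = 2; i <= max; i++) isPrime[i] = true;
        for (int i = 2; (long) i * i <= max; i++)
            if (isPrime[i])
                for (int j = i * i; j <= max; j += i) isPrime[j] = false;

        int size = 1;
        while (size < 2 * max + 1) size <<= 1;        // large enough to avoid wrap-around
        double[] re = new double[size], im = new double[size];
        for (int p = 2; p <= max; p++) if (isPrime[p]) re[p] = 1.0;

        fft(re, im, false);
        for (int i = 0; i < size; i++) {              // pointwise square = self-convolution
            double r = re[i] * re[i] - im[i] * im[i];
            im[i] = 2 * re[i] * im[i];
            re[i] = r;
        }
        fft(re, im, true);

        long[] counts = new long[max + 1];
        for (int x = 0; x <= max; x++) {
            long ordered = Math.round(re[x]);         // counts (p, q) and (q, p) separately
            long diagonal = (x % 2 == 0 && isPrime[x / 2]) ? 1 : 0; // the pair p + p
            counts[x] = (ordered + diagonal) / 2;
        }
        return counts;
    }
}
```

To get the question's howManyPairs array, one would copy counts[min..max] out of the result. For the large range in the question, a tested FFT library is likely a safer choice than this hand-written transform.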
What you're searching for is the count of Goldbach partitions for each number
in your range, and imho there is no efficient algorithm for it.
Odd numbers have at most 1 (2 plus an odd prime), even numbers from 4 up to 4*10^18 have been verified to have more than 0,
but other than that... to start with, whether even numbers (bigger than 4*10^18) with 0 partitions exist
has been an unsolved problem since 1700-something, and such things as exact numbers are even more complicated.
There are some asymptotic and heuristic solutions, but if you want the exact number,
other than getting more CPU and RAM, there isn't much you can do.
The other answers have an outer loop that goes from N to M. It's more efficient, however, for the outer loop (or loops) to be pairs of primes, used to build a list of numbers between N and M that equal their sums.
Since I don't know Java I'll give a solution in Ruby for a specific example. That should allow anyone interested to implement the algorithm in Java, regardless of whether they know Ruby.
I initially assume that two primes whose sum equals a number between M and N must be distinct. In other words, 4 cannot be expressed as 4 = 2+2.
Use Ruby's prime number library.
require 'prime'
Assume M and N are 5 and 50.
lower = 5
upper = 50
Compute the prime numbers up to upper-2 #=> 48, the 2 being the first prime number.
primes = Prime.each.take_while { |p| p < upper-2 }
#=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
Construct an enumerator of all combinations of two primes.
enum = primes.combination(2)
#=> #<Enumerator: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]:combination(2)>
We can see the elements that will be generated by this enumerator by converting it to an array.
enum.to_a
#=> [[2, 3], [2, 5],..., [2, 47], [3, 5],..., [43, 47]] (105 elements)
Just think of enum as an array.
Now construct a counting hash whose keys are numbers between lower and upper for which there is at least one pair of primes that sum to that number and whose values are the numbers of pairs of primes that sum to the value of the associated key.
enum.each_with_object(Hash.new(0)) do |(x,y),h|
sum = x+y
h[sum] += 1 if (lower..upper).cover?(sum)
end
#=> {5=>1, 7=>1, 9=>1, 13=>1, 15=>1, 19=>1, 21=>1, 25=>1, 31=>1, 33=>1,
# 39=>1, 43=>1, 45=>1, 49=>1, 8=>1, 10=>1, 14=>1, 16=>2, 20=>2, 22=>2,
# 26=>2, 32=>2, 34=>3, 40=>3, 44=>3, 46=>3, 50=>4, 12=>1, 18=>2, 24=>3,
# 28=>2, 36=>4, 42=>4, 48=>5, 30=>3, 38=>1}
This shows, for example, that there are two ways that 16 can be expressed as the sum of two primes (3+13 and 5+11), three ways for 34 (3+31, 5+29 and 11+23) and no way for 6.
If the two primes being summed need not be unique (e.g., 4=2+2 is to be included), only a slight change is needed.
arr = primes.combination(2).to_a.concat(primes.zip(primes))
whose sorted values are
a = arr.sort
#=> [[2, 2], [2, 3], [2, 5], [2, 7],..., [3, 3],..., [5, 5],.., [47, 47]] (120 elements)
then
a.each_with_object(Hash.new(0)) do |(x,y),h|
sum = x+y
h[sum] += 1 if (lower..upper).cover?(sum)
end
#=> {5=>1, 7=>1, 9=>1, 13=>1, 15=>1, 19=>1, 21=>1, 25=>1, 31=>1, 33=>1,
# 39=>1, 43=>1, 45=>1, 49=>1, 6=>1, 8=>1, 10=>2, 14=>2, 16=>2, 20=>2,
# 22=>3, 26=>3, 32=>2, 34=>4, 40=>3, 44=>3, 46=>4, 50=>4, 12=>1, 18=>2,
# 24=>3, 28=>2, 36=>4, 42=>4, 48=>5, 30=>3, 38=>2}
In real code, a should be replaced by arr. I used the sorted copy a here merely to order the elements of the resulting hash so that it would be easier to read.
Since I just wanted to describe the approach, I used a brute force method to enumerate the pairs of elements of primes, throwing away 44 of the 120 pairs of primes because their sums fall outside the range 5..50 (a.count { |x,y| !(lower..upper).cover?(x+y) } #=> 44). Clearly, there's considerable room for improvement.
A sum of two primes means N = A + B, where A and B are primes, and A < B, which means A < N / 2 and B > N / 2. Note that they can't be equal to N / 2.
So, your outer loop should only loop from 1 to floor((N - 1) / 2). In integer math, the floor is automatic.
Your inner loop can be eliminated if the primes are stored in a Set. Assuming your array is sorted (fair assumption), use a LinkedHashSet, such that iterating the set in the outer loop can stop at (N - 1) / 2.
I'll leave it up to you to code this.
Update
Sorry, the above is an answer to the problem of finding A and B for a particular N. Your question was to find all N between min and max (inclusive).
If you follow the logic of the above, you should be able to apply that to your problem.
Outer loop should be from 1 to max / 2.
Inner loop should be from min - outer to max - outer.
To find the starting point of the inner loop, you can keep some extra index variables around, or you can rely on your prime array being sorted and use Arrays.binarySearch(primes, min - outer). First option is likely a little bit faster, but second option is definitely simpler.
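A sketch of the second option (binary search for the start of the inner loop) might look like this. It assumes primes is a sorted int[] containing every prime up to max, and, like the loops in the question, it lets the inner prime equal the outer one; use i + 1 instead of i as the lower limit for start if pairs such as 5 + 5 should be excluded. The class name is made up.

```java
import java.util.Arrays;

class PrimePairLoop {
    /** primes must be sorted ascending and contain every prime <= max. */
    static int[] countPairs(int[] primes, int min, int max) {
        int[] howManyPairs = new int[max - min + 1];
        // the smaller prime of a pair can never exceed max / 2
        for (int i = 0; i < primes.length && primes[i] <= max / 2; i++) {
            int outer = primes[i];
            // index of the first prime >= min - outer; on a miss,
            // binarySearch returns -(insertionPoint) - 1
            int start = Arrays.binarySearch(primes, min - outer);
            if (start < 0) start = -start - 1;
            start = Math.max(start, i);   // keep inner >= outer so each pair is counted once
            for (int j = start; j < primes.length; j++) {
                int sum = outer + primes[j];
                if (sum > max) break;     // primes are sorted, later sums only grow
                howManyPairs[sum - min]++;
            }
        }
        return howManyPairs;
    }
}
```

This skips every inner prime whose sum would fall below min, but the total work is still proportional to the number of qualifying pairs, so it is a constant-factor improvement rather than a change in complexity.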

Check Adjacency in Triangle Array

For a school project I had to code the Cracker Barrel triangle peg game; http://www.joenord.com/puzzles/peggame/3_mid_game.jpg is a link to what it looks like. I made a triangular symmetric matrix to represent the board:
|\
|0\
|12\
|345\
|6789\....
public int get( int row, int col )
{
if (row >= col) // prevents array out of bounds
return matrix[row][col];
else
return matrix[col][row];
} //
And here is my get() function, which reflects the form of the matrix. If I try to access get(row, column) and row < column, I access get(column, row) instead; it's set up that way in all my methods. This way it's easier to prevent out-of-bounds accesses. Empty spots in the triangle are set to 0, all pegs are set to 1. There's an unrelated reason why I didn't use a boolean array. The project is an AI project, and to develop a heuristic search algorithm I need access to the number of pegs adjacent to each other. I can easily prevent most duplicates by simply dividing the total by 2, since it will count every adjacency in both directions. I don't know how to prevent duplicate checks when I cross that middle line. It only matters on the 0, 2, 5 and 9 positions. If I really wanted to, I could write a separate set of rules for those positions, but that doesn't feel like good coding and doesn't work for different-sized triangles. Any input is welcome, and if you need more information feel free to ask.
0, 2, 5, 9 is not an arithmetic progression. The finite differences 2-0 = 2, 5-2 = 3, 9 - 5 = 4 are in arithmetic progression. So the sequence is 0, 0 + 2 = 2, 2 + 3 = 5, 5 + 4 = 9, 9 + 5 = 14, 14 + 6 = 20, etc. They are one less than the triangle numbers 1, 3, 6, 10, 15, 21, etc. The nth triangle number has a short-cut expression, n(n+1)/2 (where n starts at 1, not 0). So your numbers are n(n+1)/2 - 1 for n = 1, 2, 3, ...
Anyway, the situation you are experiencing should tell you that setting it up so get(row,col) == get(col,row) is a bad idea. What I would do instead is to set it up so that your puzzle starts at index 1,1 and increases from there; then put special values -1 in the matrix entries 0,y and x,0 and anything with col > row. You can check for out of bounds conditions just by checking for the value -1 in a cell. Then to count the number of pegs surrounding a position you always do the same thing: check all four adjacent cells for 1's.
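A minimal sketch of that sentinel layout could look like the following. The names (PegBoard, OFF_BOARD, adjacentPegs, adjacentPairs) are invented here, and the extra sentinel row and column past the last row are an assumption added so that neighbour lookups never leave the array. It checks the four neighbours mentioned above; if the game's diagonal neighbours also count as adjacent, the offsets {-1, -1} and {1, 1} could be added to the table.

```java
class PegBoard {
    static final int EMPTY = 0, PEG = 1, OFF_BOARD = -1;

    // Offsets of the cells treated as adjacent: up, down, left, right.
    private static final int[][] NEIGHBOURS = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};

    // (size + 2) x (size + 2) grid; playable cells satisfy 1 <= col <= row <= size.
    final int[][] cells;

    PegBoard(int size) {
        cells = new int[size + 2][size + 2];
        for (int r = 0; r < size + 2; r++) {
            for (int c = 0; c < size + 2; c++) {
                boolean playable = r >= 1 && r <= size && c >= 1 && c <= r;
                cells[r][c] = playable ? PEG : OFF_BOARD;  // start with a full board
            }
        }
    }

    /** Number of pegs next to a playable cell; sentinels make bounds checks unnecessary. */
    int adjacentPegs(int row, int col) {
        int count = 0;
        for (int[] d : NEIGHBOURS) {
            if (cells[row + d[0]][col + d[1]] == PEG) count++;
        }
        return count;
    }

    /** Number of adjacent peg pairs; each pair is seen from both ends, hence the division by 2. */
    int adjacentPairs() {
        int total = 0;
        for (int r = 1; r < cells.length - 1; r++) {
            for (int c = 1; c <= r; c++) {
                if (cells[r][c] == PEG) total += adjacentPegs(r, c);
            }
        }
        return total / 2;
    }
}
```

Because every position is handled by the same loop, no special rules are needed for the cells on the diagonal edge, and the same code works for any triangle size.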
