What's wrong when my matrix is "not positive definite"? - java

I'm using the Jahmm Java library for classification. I want to do some testing, so I generate some random data sets.
I create data sets in this format:
[val_1.1 val_1.2 val_1.3];[val_2.1 val_2.2 val_2.3]; [val_3.1 val_3.2 val_3.3] etc...
I use a random function so that
val_1.1 == val_1.2 == val_1.3
and
val_2.1 == val_2.2 == val_2.3
and so on.
When I call the following function with this data set, it throws an IllegalArgumentException:
static double[][] decomposeCholesky(double[][] m)
{
    if (!isSquare(m))
        throw new IllegalArgumentException("Matrix is not square");

    double[][] l = matrix(nbRows(m), nbColumns(m));

    for (int j = 0; j < nbRows(m); j++)
    {
        double[] lj = l[j];
        double d = 0.;

        // Fill in the sub-diagonal entries of row j
        for (int k = 0; k < j; k++) {
            double[] lk = l[k];
            double s = 0.;

            for (int i = 0; i < k; i++)
                s += lk[i] * lj[i];

            lj[k] = s = (m[j][k] - s) / l[k][k];
            d = d + s * s;
        }

        // The diagonal entry requires m[j][j] minus the accumulated
        // sum of squares to be strictly positive
        if ((d = m[j][j] - d) <= 0.)
            throw new IllegalArgumentException("Matrix is not positive " +
                    "defined");
        l[j][j] = Math.sqrt(d);

        // Zero out the entries above the diagonal
        for (int k = j + 1; k < nbRows(m); k++)
            l[j][k] = 0.;
    }

    return l;
}
So my sequence matrix is not "positive defined", but what does that mean? And what should I do to my data set to avoid it?
I am not good at math! Thanks in advance.

I think the author of the code meant "positive definite." A matrix must be positive definite in order to be factored with a Cholesky decomposition. The formal definition: a square matrix A is positive definite if and only if, for every nonzero vector x:
x'Ax > 0
Positive definite matrices (in the usual real, symmetric sense) are square and symmetric about the diagonal, so a good start would be to use only square, symmetric matrices in the test and see how that works. To absolutely ensure that a matrix is positive definite, you can test all its eigenvalues to see if they are each > 0. I don't know if JAHMM has methods for taking the eigenvalues of a matrix, but if so, you could do that.
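If no eigenvalue routine is available, note that the Cholesky decomposition itself is a standard positive-definiteness test for symmetric matrices: it succeeds exactly when the matrix is positive definite. A minimal sketch, reusing the question's own decomposeCholesky:

static boolean isPositiveDefinite(double[][] m) {
    try {
        decomposeCholesky(m); // throws if not square or not positive definite
        return true;
    } catch (IllegalArgumentException e) {
        return false;
    }
}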

I do not know that library,
but I think you mean "positive definite", not "positive defined".
Here's the thing: for an ordinary number, you can easily tell whether it's positive or negative (or zero) by looking at its sign. Definiteness is an extension of that idea to matrices, where just looking at the sign no longer works because some entries might be positive and some might be negative.
There are many different characterizations of definiteness (which can be proven equivalent); you can find them neatly listed here:
http://en.wikipedia.org/wiki/Positive-definite_matrix#Characterizations
Now, the problem is that making your rows equal does not guarantee
positive definiteness; in fact, such a 3x3 matrix will always be positive semidefinite and never positive definite.
I've looked around a bit; here are a few hints on how to generate positive definite matrices:
https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/04LinearAlgebra/posdef/
( rand(n, n) + (n - 1)*eye( n ) )
So you generate an n x n matrix where all entries are random between 0 and 1, then add the identity matrix multiplied by n-1; in your case that is [2,0,0];[0,2,0];...
Hope that helps.
P.S. I forgot: your matrix has to be symmetric as well, because you want to do a Cholesky decomposition on it. But that's easy: just generate the matrix A as mentioned above, then choose B = 1/2 * ( A + A.transposed() ).
This matrix B will still be positive definite, and it'll be symmetric as well :)
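Since the recipe above is MATLAB-style, here is a minimal Java sketch of the same construction (randomSpd is my own helper name, not part of Jahmm):

static double[][] randomSpd(int n) {
    java.util.Random rnd = new java.util.Random();
    // rand(n, n) + (n - 1) * eye(n): random entries plus a dominant diagonal
    double[][] a = new double[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            a[i][j] = rnd.nextDouble() + (i == j ? n - 1 : 0);
    // B = 1/2 * (A + A'): symmetrize; B stays positive definite
    double[][] b = new double[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            b[i][j] = 0.5 * (a[i][j] + a[j][i]);
    return b;
}

Feeding the result to decomposeCholesky should no longer throw.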

Related

Finding minimal "factorization" of an int to square-numbers

The problem I am trying to solve:
Given an int n, return the minimal "factorization" of this int to numbers which are all squares.
We define factorization here not in the usual manner: a factorization of k to m numbers (m1, m2, m3...) will be such that: m1 + m2 + m3 + ... = k.
For example: let n = 12. The optimal solution is: [4,4,4] since 4 is the square of 2 and 4 + 4 + 4 = 12. There is also [9,1,1,1] though it is not minimal since it's 4 numbers instead of 3 in the former.
My attempt to solve this:
My idea was given the number n we will perform the following algorithm:
First we will find the closest square number to n (for example, if n = 82 we will find 81).
Then we will compute, recursively, the number we got minus the square closest to it.
Here is a flow example: assume n = 12 and our function is f. We compute f(12 - 9) UNION {9}, then f(12 - 4) UNION {4}, and then f(12 - 1) UNION {1}. From each we get a list of square combinations, and we take the minimal list among those. We save the results in a HashMap to avoid recomputation (dynamic-programming style).
Code attempt in Java (incomplete):
public List<Integer> getShortestSquareList(int n) {
    HashMap<Integer, List<Integer>> map = new HashMap<Integer, List<Integer>>();
    map.put(0, new ArrayList<Integer>());      // base case: nothing left to sum
    map.put(1, new ArrayList<Integer>(Arrays.asList(1)));
    List<Integer> squareList = getSquareList(n);
    return internalGetShortestSquareList(n, map, squareList);
}

List<Integer> getSquareList(int n) {
    List<Integer> result = new ArrayList<Integer>();
    int i = 1;
    while (i * i <= n) {
        result.add(i * i);
        i++;
    }
    return result;
}

public int getClosestSquare(int n, List<Integer> squareList) {
    // getting the closestSquareIndex (index of the largest square <= n)
}

public List<Integer> internalGetShortestSquareList(int n,
        HashMap<Integer, List<Integer>> map, List<Integer> squareList) {
    if (map.containsKey(n)) { return map.get(n); }
    int closestSquareIndex = getClosestSquare(n, squareList);
    List<Integer> minSquareList = null;
    int minSize = Integer.MAX_VALUE;
    for (int i = closestSquareIndex; i > -1; i--) {
        int square = squareList.get(i);
        List<Integer> tempSquares = new ArrayList<Integer>();
        tempSquares.add(square);
        tempSquares.addAll(internalGetShortestSquareList(n - square, map, squareList));
        if (tempSquares.size() < minSize) {
            minSize = tempSquares.size();
            minSquareList = tempSquares;
        }
    }
    map.put(n, minSquareList);
    return map.get(n);
}
My question:
It seems that my solution is not optimal (imo). I think that the time complexity for my solution is O(n)*O(Sqrt(n)) since the maximal recursion depth is n and the maximum number of children is Sqrt(n). My solution is probably full of bugs - which doesn't matter to me at the moment. I will appreciate any guidance to find a more optimal solution (pseudo-code or otherwise).
Based on @trincot's link, I would suggest a simple O(n sqrt n) algorithm. The idea is:
Use exhaustive search on the squares smaller than or equal to n to find out whether n is a square itself, or a sum of any two or three squares less than n. This can be done in sqrt(n)^3 time, which is O(n sqrt n).
If this fails, then find a "factorization" of n into four squares.
To recursively find the 4-factorization of a number m, there are three cases:
m is a prime number and m mod 4 = 1. According to the math, we know that m is then a sum of two squares. Either simple exhaustive search or more "mathy" methods should give an easy answer.
m is a prime number and m mod 4 = 3. This case still requires working out the details, but could be implemented using the math described in the link.
m is a composite number. This is the recursive case. First factorize m into two factors, i.e. integers u and v such that u*v = m. For performance reasons they should be as close as possible, but this is a minor detail.
Afterwards, recursively find the 4-factorization of u and v.
Then, using the formula:
(a^2+b^2+c^2+d^2) (A^2+B^2+C^2+D^2) = (aA+bB+cC+dD)^2 + (aB-bA+cD-dC)^2 + (aC-bD-cA+dB)^2 + (aD-dA+bC-cB)^2
find the 4-factorization of m. Here I denoted u = (a^2+b^2+c^2+d^2) and v = (A^2+B^2+C^2+D^2), as their 4-factorization is known at this point.
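The identity translates directly into code. As a sketch, combineFourSquares below is a hypothetical helper that merges the known 4-factorizations of u and v into one for m = u*v (components may come out negative, which is harmless since only their squares matter):

static int[] combineFourSquares(int[] u, int[] v) {
    // u = {a, b, c, d}, v = {A, B, C, D}, with u*v = m
    int a = u[0], b = u[1], c = u[2], d = u[3];
    int A = v[0], B = v[1], C = v[2], D = v[3];
    return new int[] {
        a*A + b*B + c*C + d*D,
        a*B - b*A + c*D - d*C,
        a*C - b*D - c*A + d*B,
        a*D - d*A + b*C - c*B
    };
}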
Much simpler solution:
This is a version of the Coin Change problem.
You can call the following method with coins as the list of the square numbers smaller than or equal to amount (n in your example).
Example: amount=12, coins={1,4,9}
public int coinChange(int[] coins, int amount) {
    int max = amount + 1;
    int[] dp = new int[amount + 1];
    Arrays.fill(dp, max);
    dp[0] = 0;
    for (int i = 1; i <= amount; i++) {
        for (int j = 0; j < coins.length; j++) {
            if (coins[j] <= i) {
                dp[i] = Math.min(dp[i], dp[i - coins[j]] + 1);
            }
        }
    }
    return dp[amount] > amount ? -1 : dp[amount];
}
The complexity is O(n*m) where m is the number of coins, so in your example it has the same complexity you mention, O(n*sqrt(n)).
It is solved with dynamic programming, using a bottom-up approach.
The code has been taken from here.
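To tie the two pieces together, a small usage sketch (squaresUpTo is my own helper name):

static int[] squaresUpTo(int n) {
    // all perfect squares <= n, e.g. n = 12 gives {1, 4, 9}
    int count = (int) Math.sqrt(n);
    int[] squares = new int[count];
    for (int i = 1; i <= count; i++)
        squares[i - 1] = i * i;
    return squares;
}

// coinChange(squaresUpTo(12), 12) == 3, corresponding to 12 = 4 + 4 + 4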

Reduce the processing time of the FFT

I'm currently working in Java for Android. I'm trying to implement the FFT in order to build a kind of frequency viewer.
I actually got it working, but the display is not fluid at all.
I added some traces to check the processing time of each part of my code, and it turns out the FFT takes about 300ms to be applied to my complex array of 4096 elements. And I need it to take less than 100ms, as my thread (that displays the frequencies) is refreshed every 100ms. I reduced the initial array so that the FFT works on only 1024 elements, and that is fast enough, but the result is degraded.
Does someone have an idea?
I used the default fft.java and Complex.java classes that can be found on the internet.
For information, my code computing the FFT is the following:
int bytesPerSample = 2; // 16 bits = 2 bytes
Complex[] x = new Complex[bufferSize / 2];
for (int index = 0; index < bufferReadResult - bytesPerSample + 1; index += bytesPerSample)
{
    // assemble one little-endian 16-bit sample from the byte buffer
    double sample = 0;
    for (int b = 0; b < bytesPerSample; b++) {
        int v = buffer[index + b];
        if (b < bytesPerSample - 1 || bytesPerSample == 1) {
            v &= 0xFF; // mask all but the most significant (sign-carrying) byte
        }
        sample += v << (b * 8);
    }
    double sample32 = 100 * (sample / 32768.0); // don't know the use of this computation...
    x[index / bytesPerSample] = new Complex(sample32, 0);
}

Complex[] tx = new Complex[1024];
// reduction of the size of the signal (keep every 4th sample) in order to improve the FFT processing time
for (int i = 0; i < x.length / 4; i++)
{
    tx[i] = new Complex(x[i * 4].re(), 0);
}
// Signal retrieval thanks to the FFT
fftRes = FFT.fft(tx);
I don't know Java, but your way of converting between your input data and an array of complex values seems very convoluted. You're building two arrays of complex data where only one is necessary.
Also it smells like your complex real and imaginary values are doubles. That's way over the top for what you need, and ARMs are veeeery slow at double arithmetic anyway. Is there a complex class based on single precision floats?
Thirdly you're performing a complex fft on real data by filling the imaginary part of your complexes with zero. Whilst the result will be correct it is twice as much work straight off (unless the routine is clever enough to spot that, which I doubt). If possible perform a real fft on your data and save half your time.
And then as Simon says there's the whole issue of avoiding garbage collection and memory allocation.
Also it looks like your FFT has no preparatory step. This mean that the routine FFT.fft() is calculating the complex exponentials every time. The longest part of the FFT calculation is working out the complex exponentials, which is a shame because for any given FFT length the exponentials are constants. They don't depend on your input data at all. In the real time world we use FFT routines where we calculate the exponentials once at the start of the program and then the actual fft itself takes that const array as one of its inputs. Don't know if your FFT class can do something similar.
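To illustrate that preparatory step (this is not the FFT class's actual API, just a sketch), the twiddle factors can be computed once for a fixed size N and reused on every frame:

final class TwiddleTable {
    final float[] cos;
    final float[] sin;

    // Precompute e^(-2*pi*i*k/N) for k = 0 .. N/2-1; these depend only on N,
    // not on the input data, so the table can be built once at startup.
    TwiddleTable(int n) {
        cos = new float[n / 2];
        sin = new float[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double kth = -2 * Math.PI * k / n;
            cos[k] = (float) Math.cos(kth);
            sin[k] = (float) Math.sin(kth);
        }
    }
}

The combine loop then indexes into the table (for a sub-FFT of length N' the table index is k * N / N') instead of calling Math.cos and Math.sin at every level of the recursion.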
If you do end up going to something like FFTW then you're going to have to get used to calling C code from your Java. Also make sure you get a version that supports (I think) NEON, ARM's answer to SSE, AVX and Altivec. It's worth ploughing through their release notes to check. Also I strongly suspect that FFTW will only be able to offer a significant speed up if you ask it to perform an FFT on single precision floats, not doubles.
Google luck!
--Edit--
I meant of course 'good luck'. Give me a real keyboard quick, these touchscreen ones are unreliable...
First, thanks for all your answers.
I followed them and made two tests:
First, I replaced the double used in my Complex class by float. The result is just a bit better, but not enough.
Then I rewrote the fft method so as not to use Complex anymore, but a two-dimensional float array instead. For each row of this array, the first column contains the real part, and the second one the imaginary part.
I also changed my code to instantiate the float array only once, in the onCreate method.
And the result... is worse!! Now it takes a little more than 500ms instead of 300ms.
I don't know what to do now.
You can find below the initial fft function, and then the one I rewrote.
Thanks for your help.
// compute the FFT of x[], assuming its length is a power of 2
public static Complex[] fft(Complex[] x) {
    int N = x.length;

    // base case
    if (N == 1) return new Complex[] { x[0] };

    // radix 2 Cooley-Tukey FFT
    if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2 : " + N); }

    // fft of even terms
    Complex[] even = new Complex[N/2];
    for (int k = 0; k < N/2; k++) {
        even[k] = x[2*k];
    }
    Complex[] q = fft(even);

    // fft of odd terms
    Complex[] odd = even; // reuse the array
    for (int k = 0; k < N/2; k++) {
        odd[k] = x[2*k + 1];
    }
    Complex[] r = fft(odd);

    // combine
    Complex[] y = new Complex[N];
    for (int k = 0; k < N/2; k++) {
        double kth = -2 * k * Math.PI / N;
        Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
        y[k] = q[k].plus(wk.times(r[k]));
        y[k + N/2] = q[k].minus(wk.times(r[k]));
    }
    return y;
}
public static float[][] fftf(float[][] x) {
    /**
     * x[][0] = real part
     * x[][1] = imaginary part
     */
    int N = x.length;

    // base case
    if (N == 1) return new float[][] { x[0] };

    // radix 2 Cooley-Tukey FFT
    if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2 : " + N); }

    // fft of even terms
    float[][] even = new float[N/2][2];
    for (int k = 0; k < N/2; k++) {
        even[k] = x[2*k];
    }
    float[][] q = fftf(even);

    // fft of odd terms
    float[][] odd = even; // reuse the array
    for (int k = 0; k < N/2; k++) {
        odd[k] = x[2*k + 1];
    }
    float[][] r = fftf(odd);

    // combine
    float[][] y = new float[N][2];
    double kth, wkcos, wksin;
    for (int k = 0; k < N/2; k++) {
        kth = -2 * k * Math.PI / N;
        // Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
        wkcos = Math.cos(kth); // real part
        wksin = Math.sin(kth); // imaginary part
        // y[k] = q[k].plus(wk.times(r[k]));
        y[k][0] = (float) (q[k][0] + wkcos * r[k][0] - wksin * r[k][1]);
        y[k][1] = (float) (q[k][1] + wkcos * r[k][1] + wksin * r[k][0]);
        // y[k + N/2] = q[k].minus(wk.times(r[k]));
        y[k + N/2][0] = (float) (q[k][0] - (wkcos * r[k][0] - wksin * r[k][1]));
        y[k + N/2][1] = (float) (q[k][1] - (wkcos * r[k][1] + wksin * r[k][0]));
    }
    return y;
}
Actually, I think I don't understand everything.
First, about Math.cos and Math.sin: how do you want me not to compute them each time? Do you mean that I should compute all the values only once (e.g. store them in an array) and reuse them for each computation?
Second, about the N % 2 check: indeed it's not very useful; I could do the test before calling the function.
Third, about Simon's advice: I mixed what he said and what you said; that's why I replaced Complex with a two-dimensional float[][]. If that was not what he suggested, then what was it?
Last, I'm not an FFT expert, so what do you mean by performing a "real FFT"? Do you mean that my imaginary part is useless? If so, I'm not sure, because later in my code I compute the magnitude of each frequency, i.e. sqrt(real[i]*real[i] + imag[i]*imag[i]). And I think that my imaginary part is not equal to zero...
Thanks!

Calculating Euler's Totient Function for very large numbers in Java

I've managed to get a version of Euler's Totient Function working, albeit one that only works for smaller numbers ("smaller" here meaning small compared to the 1024-bit numbers I need it to calculate).
My version is here -
public static BigInteger eulerTotientBigInt(BigInteger calculate) {
    BigInteger count = new BigInteger("0");
    for (BigInteger i = new BigInteger("1"); i.compareTo(calculate) < 0; i = i.add(BigInteger.ONE)) {
        BigInteger check = GCD(calculate, i);
        if (check.compareTo(BigInteger.ONE) == 0) { // coprime
            count = count.add(BigInteger.ONE);
        }
    }
    return count;
}
While this works for smaller numbers, it works by iterating through every possible value from 1 up to the number being calculated. With large BigIntegers, this is totally infeasible.
I've read that it's possible to divide the number on each iteration, removing the need to go through them one by one. I'm just not sure what I'm supposed to divide by what (some of the examples I've looked at are in C and use longs and a square root; as far as I know I can't calculate an accurate square root of a BigInteger). I'm also wondering whether, for modular arithmetic such as this, the function needs an argument stating what the modulus is. I'm totally unsure on that, so any advice is much appreciated.
Can anyone point me in the right direction here?
PS: I deleted this question when I found "modifying Euler Totient Function"; I adapted it to work with BigIntegers -
public static BigInteger etfBig(BigInteger n) {
    BigInteger result = n;
    BigInteger i;
    for (i = new BigInteger("2"); (i.multiply(i)).compareTo(n) <= 0; i = i.add(BigInteger.ONE)) {
        if ((n.mod(i)).compareTo(BigInteger.ZERO) == 0)
            result = result.divide(i);
        while (n.mod(i).compareTo(BigInteger.ZERO) == 0)
            n = n.divide(i);
    }
    if (n.compareTo(BigInteger.ONE) > 0)
        result = result.subtract((result.divide(n)));
    return result;
}
And it does give an accurate result, but when passed a 1024-bit number it runs forever (I'm still not sure it will ever finish; it's been running for 20 minutes).
There is a formula for the totient function, which requires the prime factorization of n.
Look here.
The formula is:
phi(n) = n * (p1 - 1) / p1 * (p2 - 1) / p2 ...
where p1, p2, etc. are all the prime divisors of n.
Note that you only need BigInteger, not floating point, because the division is always exact.
So now the problem is reduced to finding all prime factors, which is better than iteration.
Here is the whole solution:
int n;       // this is the number you want to find the totient of
int tot = n; // this will be the totient at the end of the sample
for (int p = 2; p*p <= n; p++)
{
    if (n % p == 0)
    {
        tot /= p;
        tot *= (p - 1);
        while (n % p == 0)
            n /= p;
    }
}
if (n > 1) { // now n is the largest prime divisor
    tot /= n;
    tot *= (n - 1);
}
The algorithm you are trying to write is equivalent to factoring the argument n, which means you should expect it to run forever, practically speaking until either your computer dies or you die. See this post in mathoverflow for more information: How hard is it to compute the Euler totient function?.
If, on the other hand, you want the value of the totient for some large number for which you have the factorization, pass the argument as a sequence of (prime, exponent) pairs.
The etfBig method has a problem.
Euler's product formula is n*((factor-1)/factor) for all factors.
Note: Petar's code has it as:
tot /= p;
tot *= (p-1);
In the etfBig method, replace result = result.divide(i);
with
result = result.multiply(i.subtract(BigInteger.ONE)).divide(i);
Testing from 2 to 200 then produces the same results as the regular algorithm.
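For completeness, a sketch of etfBig with that one line replaced:

public static BigInteger etfBig(BigInteger n) {
    BigInteger result = n;
    for (BigInteger i = new BigInteger("2");
            i.multiply(i).compareTo(n) <= 0; i = i.add(BigInteger.ONE)) {
        if (n.mod(i).compareTo(BigInteger.ZERO) == 0)
            // result *= (i - 1) / i; the division is exact here
            result = result.multiply(i.subtract(BigInteger.ONE)).divide(i);
        while (n.mod(i).compareTo(BigInteger.ZERO) == 0)
            n = n.divide(i);
    }
    if (n.compareTo(BigInteger.ONE) > 0)
        result = result.subtract(result.divide(n));
    return result;
}

Note that this still factors n by trial division, so, as explained above, it remains infeasible for 1024-bit numbers with an unknown factorization.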

Generate random numbers in increments

I need to generate n random numbers between a and b, but any two numbers cannot have a difference of less than c. All variables except n are floats (n is an int).
Solutions are preferred in java, but C/C++ is okay too.
Here is the code I have so far:
static float getRandomNumberInRange(float min, float max) {
    return (float) (min + (Math.random() * (max - min)));
}

static float[] randomNums(float a, float b, float c, int n) {
    float minDistance = c;
    float maxDistance = (b - a) - (n - 1) * c;
    float[] randomNumArray = new float[n];
    float random = getRandomNumberInRange(minDistance, maxDistance);
    randomNumArray[0] = a + random;
    for (int x = 1; x < n; x++) {
        maxDistance = (b - a) - (randomNumArray[x - 1]) - (n - x - 1) * c;
        random = getRandomNumberInRange(minDistance, maxDistance);
        randomNumArray[x] = randomNumArray[x - 1] + random;
    }
    return randomNumArray;
}
If I run the function as such (10 times), I get the following output:
Input: randomNums(-1f, 1f, 0.1f, 10)
[-0.88, 0.85, 1.23, 1.3784, 1.49, 1.59, 1.69, 1.79, 1.89, 1.99]
[-0.73, -0.40, 0.17, 0.98, 1.47, 1.58, 1.69, 1.79, 1.89, 1.99]
[-0.49, 0.29, 0.54, 0.77, 1.09, 1.56, 1.69, 1.79, 1.89, 1.99]
I think a reasonable approach can be the following:
Total "space" is (b - a).
Remove the minimum required space (n-1)*c to obtain the remaining space.
Shoot (n-1) random numbers between 0 and 1 and scale them so that their sum is this just-computed "optional space". Each of them will be a "slice" of space to be used.
The first number is a.
For each other number, add c plus the next "slice" to the previous number. The last number will be b.
If you don't want the first and last to match a and b exactly, then just create n+1 slices instead of n-1 and start with a+slice[0] instead of a.
The main idea is that once you remove the required spacing between the points (totalling (n-1)*c) the problem is just to find n-1 values so that the sum is the prescribed "optional space". To do this with a uniform distribution just shoot n-1 numbers, compute the sum and uniformly scale those numbers so that the sum is instead what you want by multiplying each of them by the constant factor k = wanted_sum / current_sum.
To obtain the final result you just use as spacing between a value and the previous one the sum of the mandatory part c and one of the randomly sampled variable parts.
An example in Python of the code needed for the computation is the following:

space = b - a
slack = space - (n - 1)*c
slice = [random.random() for i in xrange(n-1)] # pick (n-1) random numbers 0..1
k = slack / sum(slice)                         # compute needed scaling
slice = [x*k for x in slice]                   # scale to get slice sizes
result = [a]
for i in xrange(n-1):
    result.append(result[-1] + slice[i] + c)
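Since the question prefers Java, here is a sketch of the same idea (the method name is mine):

static float[] spacedRandomNums(float a, float b, float c, int n) {
    float slack = (b - a) - (n - 1) * c; // space left after the mandatory gaps
    float[] slice = new float[n - 1];
    float sum = 0;
    for (int i = 0; i < n - 1; i++) {
        slice[i] = (float) Math.random();
        sum += slice[i];
    }
    float k = slack / sum; // scale factor so the slices sum to the slack
    float[] result = new float[n];
    result[0] = a;
    for (int i = 1; i < n; i++) {
        result[i] = result[i - 1] + slice[i - 1] * k + c;
    }
    return result; // result[n-1] lands on b, up to rounding
}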
If you have random number X and you want another random number Y which is a minimum of A from X and a maximum of B from X, why not write that in your code?
float nextRandom(float base, float minDist, float maxDist) {
return base + minDist + (((float)Math.random()) * (maxDist - minDist));
}
By trying to keep the base out of the next-number routine, you add a lot of complexity to your algorithm.
Though this does not exactly do what you need and does not incorporate the technique described in this thread, I believe this code will prove useful, as it does what it seems you want.
static float getRandomNumberInRange(float min, float max)
{
    return (float) (min + (Math.random() * (max - min)));
}

static float[] randomNums(float a, float b, float c, int n)
{
    float averageDifference = (b - a) / n;
    float[] randomNumArray = new float[n];
    float random; // was int, but getRandomNumberInRange returns a float
    randomNumArray[0] = a + averageDifference / 2;
    for (int x = 1; x < n; x++)
        randomNumArray[x] = randomNumArray[x - 1] + averageDifference;
    for (int x = 0; x < n; x++)
    {
        random = getRandomNumberInRange(-averageDifference / 2, averageDifference / 2);
        randomNumArray[x] += random;
    }
    return randomNumArray;
}
I need to generate n random numbers between a and b, but any two numbers cannot have a difference of less than c. All variables except n are floats (n is an int).
Solutions are preferred in java, but C/C++ is okay too.
First, what distribution? I'm going to assume a uniform distribution, but with the caveat that "any two numbers cannot have a difference of less than c". What you want is called "rejection sampling". There's a Wikipedia article on the subject, plus a whole lot of other references on the 'net and in books (e.g. http://www.columbia.edu/~ks20/4703-Sigman/4703-07-Notes-ARM.pdf). Pseudocode, using some function random_uniform() that returns a random number drawn from U[0,1], and assuming a 1-based array (many languages use a 0-based array):
function generate_numbers (a, b, c, n, result)
    result[1] = a + (b-a)*random_uniform()
    for index from 2 to n
        rejected = true
        while (rejected)
            result[index] = a + (b-a)*random_uniform()
            rejected = (abs(result[index] - result[j]) < c for any j < index)
        end
    end
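A direct Java rendering of that pseudocode, checking each candidate against all previously accepted values, might look like this:

static float[] generateNumbers(float a, float b, float c, int n) {
    float[] result = new float[n];
    for (int i = 0; i < n; i++) {
        boolean rejected = true;
        while (rejected) {
            result[i] = (float) (a + (b - a) * Math.random());
            rejected = false;
            // reject the candidate if it lies within c of any accepted value
            for (int j = 0; j < i; j++) {
                if (Math.abs(result[i] - result[j]) < c) {
                    rejected = true;
                    break;
                }
            }
        }
    }
    return result;
}

Be aware that if n * c approaches (b - a), almost every candidate is rejected and the loop can run for a very long time.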
Your solution was almost correct, here is the fix:
maxDistance = b - (randomNumArray[x - 1]) - (n - x - 1) * c;
I would do this by just generating n random numbers between a and b. Then I would sort them and get the first order differences, kicking out any numbers that generate a difference less than c, leaving me with m numbers. If m < n, I would just do it again, this time for n - m numbers, add those numbers to my original results, sort again, generate differences...and so on until I have n numbers.
Note, first order differences means x[1] - x[0], x[2] - x[1] and so on.
I don't have time to write this out in C but in R, it's pretty easy:
getRands <- function(n, a, b, c) {
    r <- c()
    while (length(r) < n) {
        r <- sort(c(r, runif(n, a, b)))
        r <- r[-(which(diff(r) <= c) + 1)]
    }
    r
}
Note that if you are too aggressive with c relative to a and b, this kind of solution might take a long time to converge, or not converge at all if n * c > b - a.
Also note, I don't mean for this R code to be a fully formed, production-ready piece of code, just an illustration of the algorithm (for those who can follow R).
How about using a shifting range as you generate numbers to ensure that they don't appear too close?
static float[] randomNums(float min, float max, float separation, int n) {
    float rangePerNumber = (max - min) / n;
    // Check separation and range are consistent.
    assert (rangePerNumber >= separation) : "You have a problem.";
    float[] randomNumArray = new float[n];
    // Set range for first random number
    float lo = min;
    float hi = lo + rangePerNumber;
    for (int i = 0; i < n; ++i) {
        float random = getRandomNumberInRange(lo, hi);
        // Shift range for next random number.
        lo = random + separation;
        hi = lo + rangePerNumber;
        randomNumArray[i] = random;
    }
    return randomNumArray;
}
I know you already accepted an answer, but I like this problem. I hope it's unique, I haven't gone through everyone's answers in detail just yet, and I need to run, so I'll just post this and hope it helps.
Think of it this way: Once you pick your first number, you have a chunk +/- c that you can no longer pick in.
So your first number is
range1=b-a
x=Random()*range1+a
At this point, x is somewhere between a and b (assuming Random() returns in 0 to 1). Now, we mark out the space we can no longer pick in
excludedMin=x-c
excludedMax=x+c
If x is close to either end, then it's easy, we just pick in the remaining space
if (excludedMin <= a)
{
    range2 = b - excludedMax
    y = Random()*range2 + excludedMax
}
Here, x is so close to a, that you won't get y between a and x, so you just pick between x+c and b. Likewise:
else if (excludedMax >= b)
{
    range2 = excludedMin - a
    y = Random()*range2 + a
}
Now if x is somewhere in the middle, we have to do a little magic
else
{
    range2 = b - a - 2*c
    y = Random()*range2 + a
    if (y > excludedMin) y += 2*c
}
What's going on here? Well, we know that the range y can lie in, is 2*c smaller than the whole space, so we pick a number somewhere in that smaller space. Now, if y is less than excludedMin, we know y "is to the left" of x-c, and we're all ok. However, if y>excluded min, we add 2*c (the total excluded space) to it, to ensure that it's greater than x+c, but it'll still be less than b because our range was reduced.
Now, it's easy to expand to n numbers: each time you just reduce the range by the excluded space around any of the other points. You continue until the excluded space equals the original range (b-a).
I know it's bad form to do a second answer, but I just thought of one...use a recursive search of the space:
Assume a global list of points: points
FillRandom(a, b, c)
{
    range = b - a;
    if (range > 0)
    {
        x = Random()*range + a
        points.Append(x)
        FillRandom(a, x - c, c)
        FillRandom(x + c, b, c)
    }
}
I'll let you follow the recursion, but at the end, you'll have a list in points that fills the space with density 1/c

Generating N numbers that sum to 1

Given an array of size n I want to generate random probabilities for each index such that Sigma(a[0]..a[n-1])=1
One possible result might be:
0 1 2 3 4
0.15 0.2 0.18 0.22 0.25
Another perfectly legal result can be:
0 1 2 3 4
0.01 0.01 0.96 0.01 0.01
How can I generate these easily and quickly? Answers in any language are fine, Java preferred.
Get n random numbers, calculate their sum and normalize the sum to 1 by dividing each number with the sum.
The task you are trying to accomplish is tantamount to drawing a random point from the N-dimensional unit simplex.
http://en.wikipedia.org/wiki/Simplex#Random_sampling might help you.
A naive solution might go as follows:
public static double[] getArray(int n)
{
    double a[] = new double[n];
    double s = 0.0d;
    Random random = new Random();
    for (int i = 0; i < n; i++)
    {
        a[i] = 1.0d - random.nextDouble(); // uniform in (0, 1], avoids log(0)
        a[i] = -1 * Math.log(a[i]);        // now exponentially distributed
        s += a[i];
    }
    for (int i = 0; i < n; i++)
    {
        a[i] /= s; // normalize so the entries sum to 1
    }
    return a;
}
To draw a point uniformly from the N-dimensional unit simplex, we must take a vector of exponentially distributed random variables, then normalize it by the sum of those variables. To get an exponentially distributed value, we take a negative log of uniformly distributed value.
This is relatively late, but it shows the amendment to @Kobi's simple and straightforward answer, given in the paper pointed to by @dreeves, which makes the sampling uniform. The method (if I understand it correctly) is to:
Generate n-1 distinct values from the range [1, 2, ..., M-1].
Sort the resulting vector.
Add 0 and M as the first and last elements of the resulting vector.
Generate a new vector by computing x[i] - x[i-1] for i = 1, 2, ..., n. That is, the new vector is made up of the differences between consecutive elements of the old vector.
Divide each element of the new vector by M. You have your uniform distribution!
I am curious to know whether generating distinct random values and normalizing them to 1 by dividing by their sum will also produce a uniform distribution.
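A Java sketch of the five steps above (the value of M and the method name are my own choices):

static double[] uniformSimplex(int n) {
    int M = 1 << 20; // grid resolution; any M much larger than n works
    java.util.Random rnd = new java.util.Random();
    java.util.TreeSet<Integer> cuts = new java.util.TreeSet<Integer>();
    // steps 1-2: n-1 distinct values from [1, M-1], kept sorted by the set
    while (cuts.size() < n - 1) {
        cuts.add(1 + rnd.nextInt(M - 1));
    }
    // step 3: add 0 and M as the first and last elements
    cuts.add(0);
    cuts.add(M);
    // steps 4-5: differences of consecutive elements, divided by M
    double[] result = new double[n];
    Integer prev = null;
    int i = 0;
    for (Integer x : cuts) {
        if (prev != null) {
            result[i++] = (x - prev) / (double) M;
        }
        prev = x;
    }
    return result;
}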
Get n random numbers, calculate their sum and normalize the sum to 1
by dividing each number with the sum.
Expanding on Kobi's answer, here's a Java function that does exactly that.
public static double[] getRandDistArray(int n) {
    double randArray[] = new double[n];
    double sum = 0;
    // Generate n random numbers
    for (int i = 0; i < randArray.length; i++) {
        randArray[i] = Math.random();
        sum += randArray[i];
    }
    // Normalize sum to 1
    for (int i = 0; i < randArray.length; i++) {
        randArray[i] /= sum;
    }
    return randArray;
}
In a test run, getRandDistArray(5) returned the following
[0.1796505603694718, 0.31518724882558813, 0.15226147256596428, 0.30954417535503603, 0.043356542883939767]
If you want to generate values from a normal distribution efficiently, try the Box-Muller transformation.
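For reference, a minimal sketch of that transform, which turns a pair of uniform values into a pair of independent standard normal values (java.util.Random.nextGaussian() offers the same thing built in):

static double[] boxMuller() {
    double u1 = 1.0 - Math.random(); // uniform in (0, 1], avoids log(0)
    double u2 = 1.0 - Math.random();
    double r = Math.sqrt(-2.0 * Math.log(u1));
    return new double[] { r * Math.cos(2 * Math.PI * u2),
                          r * Math.sin(2 * Math.PI * u2) };
}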
public static double[] array(int n) {
    double[] a = new double[n];
    double flag = 0;
    for (int i = 0; i < n; i++) {
        a[i] = Math.random();
        flag += a[i];
    }
    for (int i = 0; i < n; i++) a[i] /= flag;
    return a;
}
Here, a first stores the random numbers, and flag accumulates their sum; in the second for loop each number is divided by flag, so at the end the array holds random numbers that form a probability distribution (summing to 1).
