Probability in Java

Probability in Java - java

I was curious to know, how do I implement probability in Java? For example, if the chances of a variable showing is 1/25, then how would I implement that? Or any other probability? Please point me in the general direction.

You'd use Random to generate a random number, then test it against a literal to match the probability you're trying to achieve.
So given:
boolean val = new Random().nextInt(25)==0;
val will have a 1/25 probability of being true (since nextInt() has an even probability of returning any number starting at 0 and up to, but not including, 25.)
You would of course have to import java.util.Random; as well.
As pointed out below, if you're getting more than one random number it'd be more efficient to reuse the Random object rather than recreating it all the time:
Random rand = new Random();
boolean val = rand.nextInt(25)==0;
..
boolean val2 = rand.nextInt(25)==0;

Generally you use a random number generator. Most of those return a number in the interval [0,1[ so you would then check whether that number is < 0.04 or not.
if( new Random().nextDouble() < 0.04 ) { //you might want to cache the Random instance
//we hit the 1/25 ( 4% ) case.
}
Or
if( Math.random() < 0.04 ) {
//we hit the 1/25 ( 4% ) case.
}
Note that there are multiple random number generators that have different properties, but for simple applications the Random class should be sufficient.
Edit: I changed the condition from <= to < because the upper boundary of the random number is exlusive, i.e. the largest returned value will still be < 1.0. Hence x <= 0.04 would actually be slightly more than a 4% chance, while x < 0.04 would be accurate (or as accurate as floating point math can be).

Since 1.7 it's better to use (in concurrent environment at least):
ThreadLocalRandom.current().nextInt(25) == 0
Javadoc
A random number generator isolated to the current thread. Like the global Random generator used by the Math class, a ThreadLocalRandom is initialized with an internally generated seed that may not otherwise be modified. When applicable, use of ThreadLocalRandom rather than shared Random objects in concurrent programs will typically encounter much less overhead and contention. Use of ThreadLocalRandom is particularly appropriate when multiple tasks (for example, each a ForkJoinTask) use random numbers in parallel in thread pools.
Usages of this class should typically be of the form: ThreadLocalRandom.current().nextX(...) (where X is Int, Long, etc). When all usages are of this form, it is never possible to accidently share a ThreadLocalRandom across multiple threads.
This class also provides additional commonly used bounded random generation methods.

Java has a class called java.util.Random which can generate random numbers. If you want something to happen with probability 1/25, simply generate a random number between 1 and 25 (or 0 and 24 inclusive) and check whether that number is equal to 1.
if(new java.util.Random().nextInt(25)==0){
//Do something.
}

Maybe you can implement this with generating random numbers.
Random rn = new Random();
double d = rn.nextDouble(); // random value in range 0.0 - 1.0
if(d<=0.04){
doSomeThing();
}

Related

How to use MersenneTwisterRNG random number generator

I am trying to implement a random number generator in my Java program. I was using Math.random() but that didn't seem to work very well. Then I tried using SecureRandom, but that took too long for my game. However, I came across this generator, the MersenneTwisterRNG random number generator. It seems to be what I want; fast, but still random.
However, I have not been writing in Java for very long, only 2 months, and I cannot make heads nor tails of the API. If anyone could help explain to me how to use this in my code, it would be appreciated. Or, if you happen to know of a simpler, but similar, random number generator, I would be interested in that as well. The API is here.

How to use the MersenneTwisterRNG API:
import java.util.Random;
import org.uncommons.maths.random.MersenneTwisterRNG;
This lets you access the classes using their short names.
Random rand = new MersenneTwisterRNG();
This creates a new MersenneTwisterRNG. We put it into a variable of type Random so that we can swap it out for another RNG easily if needed.
double x = rand.nextDouble();
This behaves the same as Math.random(), and returns a double-precision floating-point number between 0.0 and 1.0.
int n = rand.nextInt(10);
This generates a random number between 0 (inclusive) and 9 (inclusive), i.e. 0 <= n < 10. This is useful for a lot of integer algorithms.

Do an action with some probability in java

In Java, I am trying to do an action with a probability p. p is a float variable in my code. I came up with this way of doing it:
if( new Random().nextFloat() < p)
do action
I wanted to confirm if this is the correct way of doing it.

There is a TL;DR at the end.
From javadocs for nextFloat() (emphasis by me):
public float nextFloat()
Returns the next pseudorandom, uniformly distributed float value
between 0.0 and 1.0 from this random number generator's sequence.
If you understand what uniform distribution is, knowing this about nextFloat() is going to be enough for you. Yet I am going to explain a little about uniform distribution.
In uniform distribution, U(a,b) each number in the interval [a,b], and also all sub-intervals of the same length within [a,b] are equally probable, i.e. they have equal probability.
In the figure, on the left is the PDF, and on the right the CDF for uniform distribution.
For uniform distribution, the probability of getting a number less than or equal to n, P(x <= n) from the distribution is equal to the number itself (look at the right graph, which is cumulative distribution function for uniform distribution). That is, P(x <= 0.5) = 0.5, P(x <= 0.9) = 0.9. You can learn more about uniform distribution from any good statistics book, or some googling.
Fitting to your situation:
Now, probability of getting a number less than or equal to p generated using nextFloat() is equal to p, as nextFloat() returns uniformly distributed number. So, to make an action happen with a probability equal to p all you have to do is:
if (condition that is true with a probability p) {
do action
}
From what is discussed about nextFloat() and uniform distribution, it turns out to be:
if(randObj.nextFloat() <= p) {
do action
}
Conclusion:
What you did is almost the right way to do what you intended. Just adding the equal sign after < is all that's needed, and it doesn't hurt much to leave out the equal sign either!
P.S.: You don't need to create a new Random object each time in your conditional, you can create one, say randObj before your loop, and then invoke its nextFloat() method whenever you want to generate a random number, as I have done in my code.
Comment by pjs:
Take a look at the comment on the question by pjs, which is very important and well said. I quote:
Do not create a new Random object each time, that's not how PRNGs are
meant to be used! A single Random object provides a sequence of values
with good distributional properties. Multiple Random objects created
in rapid succession are 1) computationally expensive, and 2) may have
highly correlated initial states, thus producing highly correlated
outcomes. Random actually works best when you create a single instance
per program and keep drawing from it, unless you really really know
what you're doing and have specific reasons for using correlation
induction strategies.
TL;DR
What you did is almost the right way to do it. Just adding the equal sign after < (to make it <=) is all that's needed, and it doesn't hurt much to leave out the equal sign either!

Yes. That is correct (from a pure probability perspective). Random().nextFloat() will generate a number between 0.0 and 1.0 exclusive. So as long as your probability is as a float in the range 0.0 and 1.0, this is the correct way of doing it.
You can read more of the exact nextFloat() documentation here.

Select one of three numbers with equal probability?

I would like to select one of three numbers with equal probability (33.3%). Can I use the Random class to achieve this?
What would be the percentage of each number being selected after 100x running? Would it be evenly 33.3% each?

Use the Random.nextInt(n) method to select a number between 0 and 2. Use this to choose one of your three values.
int index = Random.nextInt(3);
int selectedValue = myOptions[index];
The value returned by that method is uniformly distributed. So if you were to repeat this process infinitely, the probability of each number being chosen would approach 1/3.
From the docs:
All n possible int values are produced with (approximately) equal probability

You can try somethin like this for 1/25 probabilty
if(new java.util.Random().nextInt(25)==0){
//Do something.
}

Random rand = new Random();
int number = rand.nextInt(3) + 1;
Note that there is no such thing as true random. In computer science, we use so called pseudo-randomness. For most everyday purposes that's close enough to random that the difference isn't noticable. You say you'll be running it 100 times, then yes. For all intents and purposes it will be 33.3333...%.

You need to decide whether entropy is more important than 'randomnes'. Randomness will not give you a perfect 33.3% but the final outcome will be unpredictable. Other algorithms can give you a perfect 33.3% but the outcome can be predictable, and even the same in some cases each time the process is ran.

it is possible for exact random ness for only a two value range.
....after random statemeants...
if(rand=>1&&rand<=1.4999999999999999999999....)
{
rand=1;
}
else
{
rand=2
}

Random.nextInt(int) is [slightly] biased

Namely, it will never generate more than 16 even numbers in a row with some specific upperBound parameters:
Random random = new Random();
int c = 0;
int max = 17;
int upperBound = 18;
while (c <= max) {
int nextInt = random.nextInt(upperBound);
boolean even = nextInt % 2 == 0;
if (even) {
c++;
} else {
c = 0;
}
}
In this example the code will loop forever, while when upperBound is, for example, 16, it terminates quickly.
What can be the reason of this behavior? There are some notes in the method's javadoc, but I failed to understand them.
UPD1: The code seems to terminate with odd upper bounds, but may stuck with even ones
UPD2:
I modified the code to capture the statistics of c as suggested in the comments:
Random random = new Random();
int c = 0;
long trials = 1 << 58;
int max = 20;
int[] stat = new int[max + 1];
while (trials > 0) {
while (c <= max && trials > 0) {
int nextInt = random.nextInt(18);
boolean even = nextInt % 2 == 0;
if (even) {
c++;
} else {
stat[c] = stat[c] + 1;
c = 0;
}
trials--;
}
}
System.out.println(Arrays.toString(stat));
Now it tries to reach 20 evens in the row - to get better statistics, and the upperBound is still 18.
The results turned out to be more than surprising:
[16776448, 8386560, 4195328, 2104576, 1044736,
518144, 264704, 132096, 68864, 29952, 15104,
12032, 1792, 3072, 256, 512, 0, 256, 0, 0]
At first it decreases as expected by the factor of 2, but note the last line! Here it goes crazy and the captured statistics seem to be completely weird.
Here is a bar plot in log scale:
How c gets the value 17 256 times is yet another mystery

http://docs.oracle.com/javase/6/docs/api/java/util/Random.html:
An instance of this class is used to generate a stream of
pseudorandom numbers. The class uses a 48-bit seed, which is modified
using a linear congruential formula. (See Donald Knuth, The Art of
Computer Programming, Volume 3, Section 3.2.1.)
If two instances of Random are created with the same seed, and the
same sequence of method calls is made for each, they will generate and
return identical sequences of numbers. [...]
It is a pseudo-random number generator. This means that you are not actually rolling a dice but rather use a formula to calculate the next "random" value based on the current random value. To creat the illusion of randomisation a seed is used. The seed is the first value used with the formula to generate the random value.
Apparently javas random implementation (the "formula"), does not generate more than 16 even numbers in a row.
This behaviour is the reason why the seed is usually initialized with the time. Deepending on when you start your program you will get different results.
The benefits of this approach are that you can generate repeatable results. If you have a game generating "random" maps, you can remember the seed to regenerate the same map if you want to play it again, for instance.
For true random numbers some operating systems provide special devices that generate "randomness" from external events like mousemovements or network traffic. However i do not know how to tap into those with java.
From the Java doc for secureRandom:
Many SecureRandom implementations are in the form of a pseudo-random
number generator (PRNG), which means they use a deterministic
algorithm to produce a pseudo-random sequence from a true random seed.
Other implementations may produce true random numbers, and yet others
may use a combination of both techniques.
Note that secureRandom does NOT guarantee true random numbers either.
Why changing the seed does not help
Lets assume random numbers would only have the range 0-7.
Now we use the following formula to generate the next "random" number:
next = (current + 3) % 8
the sequence becomes 0 3 6 1 4 7 2 5.
If you now take the seed 3 all you do is to change the starting point.
In this simple implementation that only uses the previous value, every value may occur only once before the sequence wraps arround and starts again. Otherwise there would be an unreachable part.
E.g. imagine the sequence 0 3 6 1 3 4 7 2 5. The numbers 0,4,7,2 and 5 would never be generated more than once(deepending on the seed they might be generated never), since once the sequence loops 3,6,1,3,6,1,... .
Simplified pseudo random number generators can be thought of a permutation of all numbers in the range and you use the seed as a starting point. If they are more advanced you would have to replace the permutation with a list in which the same numbers might occur multiple times.
More complex generators can have an internal state, allowing the same number to occur several times in the sequence, since the state lets the generator know where to continue.

The implementation of Random uses a simple linear congruential formula. Such formulae have a natural periodicity and all sorts of non-random patterns in the sequence they generate.
What you are seeing is an artefact of one of these patterns ... nothing deliberate. It is not an example of bias. Rather it is an example of auto-correlation.
If you need better (more "random") numbers, then you need to use SecureRandom rather than Random.
And the answer to "why was it implemented that way is" ... performance. A call to Random.nextInt can be completed in tens or hundreds of clock cycles. A call to SecureRandom is likely to be at least 2 orders of magnitude slower, possibly more.

For portability, Java specifies that implementations must use the inferior LCG method for java.util.Random. This method is completely unacceptable for any serious use of random numbers like complex simulations or Monte Carlo methods. Use an add-on library with a better PRNG algorithm, like Marsaglia's MWC or KISS. Mersenne Twister and Lagged Fibonacci Generators are often OK as well.
I'm sure there are Java libraries for these algorithms. I have a C library with Java bindings if that will work for you: ojrandlib.

random number with seed

Reference: link text
i cannot understand the following line , can anybody provide me some example for the below statement?
If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers

Since you asked for an example:
import java.util.Random;
public class RandomTest {
public static void main(String[] s) {
Random rnd1 = new Random(42);
Random rnd2 = new Random(42);
System.out.println(rnd1.nextInt(100)+" - "+rnd2.nextInt(100));
System.out.println(rnd1.nextInt()+" - "+rnd2.nextInt());
System.out.println(rnd1.nextDouble()+" - "+rnd2.nextDouble());
System.out.println(rnd1.nextLong()+" - "+rnd2.nextLong());
}
}
Both Random instances will always have the same output, no matter how often you run it, no matter what platform or what Java version you use:
30 - 30
234785527 - 234785527
0.6832234717598454 - 0.6832234717598454
5694868678511409995 - 5694868678511409995

The random generator is deterministic. Given the same input to Random and the same usage of the methods in Random, the sequence of pseudo-random numbers returned to your program will be the same even in different runs on different machines.
This is why it is pseudo-random - the numbers returned behave statistically like random numbers except they can be reliably predicted. True random numbers are unpredictable.

The Random class basically is a Psuedorandom Number Generator (also known as Deterministic random bit generator) that generates a sequence of numbers that approximates the properties of random numbers. It's not generally random but deterministic as it can be determined by small random states in the generator (such as seed). Because of the deterministic nature, you can generate identical result if you the sequence of methods and seeds are identical on 2 generators.

The numbers are not really random, given the same starting conditions (the seed) and the same sequence of operations, the same sequence of numbers will be generated. This is why it would not be a good iea to use the basic Random class as part of any cryptograhic or security related code since it may be possible for an attacker to figure out which sequnce is being generated and predict future numbers.
For a random number generator that emits non-deterministic values, take a look at SecureRandom.
See Random number generation, Computational methods on wikipedia for more info.

This means that when you create the Random object (e.g. at the start of your program), you will probably want to start with a new seed. Mostly people choose some time related value, such as the number of ticks.
The fact that the number sequences are the same given the same seed is actually very convenient if you want to debug your program: make sure you log the seed value and if something is wrong you can restart the program in the debugger using that same seed value. This means you can replay the scenario exactly. This would be impossible if you would (could) use a true random number generator.

With the same seed value, separate instances of Random will return/generate the same sequence of random numbers; more on this here:
http://www.particle.kth.se/~lindsey/JavaCourse/Book/Part1/Tech/Chapter04/javaRandNums.html
Ruby Example:
class LCG; def initialize(seed=Time.now.to_i, a=2416, b=374441, m=1771075); #x, #a, #b, #m = seed % m, a, b, m; end; def next(); #x = (#a * #x + #b) % #m; end; end
irb(main):004:0> time = Time.now.to_i
=> 1282908389
irb(main):005:0> r = LCG.new(time)
=> #<LCG:0x0000010094f578 #x=650089, #a=2416, #b=374441, #m=1771075>
irb(main):006:0> r.next
=> 45940
irb(main):007:0> r.next
=> 1558831
irb(main):008:0> r.next
=> 1204687
irb(main):009:0> f = LCG.new(time)
=> #<LCG:0x0000010084cb28 #x=650089, #a=2416, #b=374441, #m=1771075>
irb(main):010:0> f.next
=> 45940
irb(main):011:0> f.next
=> 1558831
irb(main):012:0> f.next
=> 1204687
Based on the values a/b/m, the result will be the same for a given seed. This can be used to generate the same "random" number in two places and both sides can depend on getting the same result. This can be useful for encryption; although obviously, this algorithm isn't cryptographically secure.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.