generate secure random bits - java

public static void main(String[] args)
{
Random ranGen = new SecureRandom();
ranGen.setSeed(0);
int randomNumber = ranGen.nextInt(2);
System.out.print(randomNumber);
}
Is the above code a good way to either produce a truly random and secure/unbiased 0 or 1 ?

No, it isn't (a good way to either produce a truly random and secure/unbiased 0 or 1).
new SecureRandom() is OK, but setting the seed directly after isn't. Oracle's implementation of "SHA1PRNG" - which is normally the default - will replace the initial seed by the one given if setSeed() is called before any entropy is retrieved from the random number generator.
If it is not seeded the SecureRandom implementation will be seeded using an entropy source of the operating system. This should be relatively safe. Often the operating system is the only part that has access to real hardware devices so it is better positioned to retrieve entropy than the JVM. You can call setSeed(), but you should only call it after a call to nextBytes(), requesting at least one byte. Afterwards you can add some seed to the pool.
Finally, calling nextInt(2) is OK, but not very efficient. It will request 32 bits, and then toss away the 31 bits it doesn't require. Requesting random bytes and extracting the bits from the array will be more efficient (if efficiency is required, don't over-optimize). Calling nextBoolean() will probably just toss away 7 bits, so it is at least 4 times more efficient and much more concise.

Related

UUID.randomUUID() vs SecureRandom

I am trying to understand the advantages of using UUID.randomUUID() over SecureRandom generator as the former uses securerandom internally.
Well, the source code shows UUID.randomUUID uses SecureRandom.
public static UUID [More ...] randomUUID() {
SecureRandom ng = numberGenerator;
if (ng == null) {
numberGenerator = ng = new SecureRandom();
}
byte[] randomBytes = new byte[16];
ng.nextBytes(randomBytes);
randomBytes[6] &= 0x0f; /* clear version */
randomBytes[6] |= 0x40; /* set to version 4 */
randomBytes[8] &= 0x3f; /* clear variant */
randomBytes[8] |= 0x80; /* set to IETF variant */
return new UUID(randomBytes);
}
As you can see, you can use either, but in a secure UUID you have 6 non-random bits, which can be considered a disadvantage if you are picky.
Random numbers have a random chance of being repeated. The lower the randomness (unless there is some co-ordination), the greater the chance of producing the same number twice.
https://en.wikipedia.org/wiki/Birthday_problem
As you produce more random numbers the chance of the same number being repeated increases as every id must be different to every other id.
SecureRandom allows you to choose how many bit of randomness you want. Make it too small and there is a good chance they will be repeated. You can get duplicate random 32-bit id in a fraction of a second.
UUID sets the standard at 128 bits (or as uoyilmaz points out, 122 bits are random) This is enough for most use cases. However if you want a random String, I would be tempted to use more bits and/or a higher base than 16. Java for example support base 36 and 64 which means you can have shorter ids, or more randomness for the same length ID.
Note: UUID format has multiple - in it's dump though I don't see the value of them, they just make the string longer.
Thanks for all the provided technical answers. I, myself, was also baffled by the difference between the two which led me here. But then, a thought dawned on me: If you only call the function once, then there is no difference as both method generates a number that could not be pre-calculated. But if call the function several times, then they differ here because a statistical normal distribution is a property of a random number generator whereas this is not a property of a UUID. UUID strives for uniqueness and in fact it derives the provided number using your computer's MAC hardware address, the current epoch seconds etc. And eventually, if you for-loop call the UUID values it will not be statistically normally distributed.
The UUID is not a random number: it is a universal unique ID. You can be sure that no one can generate the same hexadecimal string.
A random number is another story: it is not an hexadecimal string and it is not universally unique.
A more efficient and completed generator of UUIDs is provided by this library.

Unique random - srand in java ? (like in c++)

Currently I'm using a
Random rand = new Random();
myNumber = rand.nextInt((100-0)+1)+0;
and it's a pseudo-random, because I always get the same sequence of numbers.
I remember in c++ you could just use the ctime and use srand to make it unique. How do you do it in java ?
If you require a unique random in Java, use SecureRandom. This class provides a cryptographically strong random number generator (RNG), meaning it can be used when creating certificates and this class is slow compared to the other solutions.
If you require a predictable random that you can reset, use a Random, where you call .setSeed(number) to seed it to make the values predictable from that point.
If you want a pseudo-random that is random every time you start the program, use a normal Random. A random instance is seed by default by some hash of the current time.
For the best randomness in every solution, it is important to RE-USE the random instance. If this isn't done, most of the output it gives will be similar over the same timespan, and at the case of a thrown away SecureRandom, it will spend a lot of time recreating the new random.

Consistent random numbers across versions and platforms

I need/want to get random (well, not entirely) numbers to use for password generation.
What I do: Currently I am generating them with SecureRandom.
I am obtaining the object with
SecureRandom sec = SecureRandom.getInstance("SHA1PRNG", "SUN");
and then seeding it like this
sec.setSeed(seed);
Target: A (preferably fast) way to create random numbers, which are cryptographically at least a safe as the SHA1PRNG SecureRandom implementation. These need to be the same on different versions of the JRE and Android.
EDIT: The seed is generated from user input.
Problem: With SecureRandom.getInstance("SHA1PRNG", "SUN"); it fails like this:
java.security.NoSuchProviderException: SUN. Omitting , "SUN" produces random numbers, but those are different than the default (JRE 7) numbers.
Question: How can I achieve my Target?
You don't want it to be predictable: I want, because I need the predictability so that the same preconditions result in the same output. If they are not the same, its impossible hard to do what the user expects from the application.
EDIT: By predictable I mean that, when knowing a single byte (or a hundred) you should not be able to predict the next, but when you know the seed, you should be able to predict the first (and all others). Maybe another word is reproducible.
If anyone knows of a more intuitive way, please tell me!
I ended up isolating the Sha1Prng from the sun sources which guarantees reproducibility on all versions of Java and android. I needed to drop some important methods to ensure compatibility with android, as android does not have access to nio classes...
I recommend using UUID.randomUUID(), then splitting it into longs using getLeastSignificantBits() and getMostSignificantBits()
If you want predictable, they aren't random. That breaks your "Target" requirement of being "safe" and devolves into a simple shared secret between two servers.
You can get something that looks sort of random but is predicatable by using the characteristics of prime integers where you build a set of integers by starting with I (some specific integer) and add the first prime number and then modulo by the 2nd prime number. (In truth the first and second numbers only have to be relatively prime--meaning they have no common prime factors--not counting 1, in case you call that a factor.
If you repeat the process of adding and doing the modulo, you will get a set of numbers that you can repeatably reproduce and they are ordered in the sense that taking any member of the set, adding the first prime and doing the modulo by the 2nd prime, you will always get the same result.
Finally, if the 1st prime number is large relative to the second one, the sequence is not easily predictable by humans and seems sort of random.
For example, 1st prime = 7, 2nd prime = 5 (Note that this shows how it works but is not useful in real life)
Start with 2. Add 7 to get 9. Modulo 5 to get 4.
4 plus 7 = 11. Modulo 5 = 1.
Sequence is 2, 4, 1, 3, 0 and then it repeats.
Now for real life generation of numbers that seem random. The relatively prime numbers are 91193 and 65536. (I chose the 2nd one because it is a power of 2 so all modulo-ed values can fit in 16 bits.)
int first = 91193;
int modByLogicalAnd = 0xFFFF;
int nonRandomNumber = 2345; // Use something else
for (int i = 0; i < 1000 ; ++i) {
nonRandomNumber += first;
nonRandomNumber &= modByLogicalAnd;
// print it here
}
Each iteration generates 2 bytes of sort of random numbers. You can pack several of them into a buffer if you need larger random "strings".
And they are repeatable. Your user can pick the starting point and you can use any prime you want (or, in fact, any number without 2 as a factor).
BTW - Using a power of 2 as the 2nd number makes it more predictable.
Ignoring RNGs that use some physical input (random clock bits, electrical noise, etc) all software RNGs are predicable, given the same starting conditions. They are, after all, (hopefully) deterministic computer programs.
There are some algorithms that intentionally include the physical input (by, eg, sampling the computer clock occasionally) in attempt to prevent predictability, but those are (to my knowledge) the exception.
So any "conventional" RNG, given the same seed and implemented to the same specification, should produce the same sequence of "random" numbers. (This is why a computer RNG is more properly called a "pseudo-random number generator".)
The fact that an RNG can be seeded with a previously-used seed and reproduce a "known" sequence of numbers does not make the RNG any less secure than one where your are somehow prevented from seeding it (though it may be less secure than the fancy algorithms that reseed themselves at intervals). And the ability to do this -- to reproduce the same sequence again and again is not only extraordinarily useful in testing, it has some "real life" applications in encryption and other security applications. (In fact, an encryption algorithm is, in essence, simply a reproducible random number generator.)

Why are initial random numbers similar when using similar seeds?

I discovered something strange with the generation of random numbers using Java's Random class.
Basically, if you create multiple Random objects using close seeds (for example between 1 and 1000) the first value generated by each generator will be almost the same, but the next values looks fine (i didn't search further).
Here are the two first generated doubles with seeds from 0 to 9 :
0 0.730967787376657 0.24053641567148587
1 0.7308781907032909 0.41008081149220166
2 0.7311469360199058 0.9014476240300544
3 0.731057369148862 0.07099203475193139
4 0.7306094602878371 0.9187140138555101
5 0.730519863614471 0.08825840967622589
6 0.7307886238322471 0.5796252073129174
7 0.7306990420600421 0.7491696031336331
8 0.7302511331990172 0.5968915822372118
9 0.7301615514268123 0.7664359929590888
And from 991 to 1000 :
991 0.7142160704801332 0.9453385235522973
992 0.7109015598097105 0.21848118381994108
993 0.7108119780375055 0.38802559454181795
994 0.7110807233541204 0.8793923921785096
995 0.7109911564830766 0.048936787999225295
996 0.7105432327208906 0.896658767102804
997 0.7104536509486856 0.0662031629235198
998 0.7107223962653005 0.5575699754613725
999 0.7106328293942568 0.7271143712820883
1000 0.7101849056320707 0.574836350385667
And here is a figure showing the first value generated with seeds from 0 to 100,000.
First random double generated based on the seed :
I searched for information about this, but I didn't see anything referring to this precise problem. I know that there is many issues with LCGs algorithms, but I didn't know about this one, and I was wondering if this was a known issue.
And also, do you know if this problem only for the first value (or first few values), or if it is more general and using close seeds should be avoided?
Thanks.
You'd be best served by downloading and reading the Random source, as well as some papers on pseudo-random generators, but here are some of the relevant parts of the source. To begin with, there are three constant parameters that control the algorithm:
private final static long multiplier = 0x5DEECE66DL;
private final static long addend = 0xBL;
private final static long mask = (1L << 48) - 1;
The multiplier works out to approximately 2^34 and change, the mask 2^48 - 1, and the addend is pretty close to 0 for this analysis.
When you create a Random with a seed, the constructor calls setSeed:
synchronized public void setSeed(long seed) {
seed = (seed ^ multiplier) & mask;
this.seed.set(seed);
haveNextNextGaussian = false;
}
You're providing a seed pretty close to zero, so initial seed value that gets set is dominated by multiplier when the two are OR'ed together. In all your test cases with seeds close to zero, the seed that is used internally is roughly 2^34; but it's easy to see that even if you provided very large seed numbers, similar user-provided seeds will yield similar internal seeds.
The final piece is the next(int) method, which actually generates a random integer of the requested length based on the current seed, and then updates the seed:
protected int next(int bits) {
long oldseed, nextseed;
AtomicLong seed = this.seed;
do {
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
} while (!seed.compareAndSet(oldseed, nextseed));
return (int)(nextseed >>> (48 - bits));
}
This is called a 'linear congruential' pseudo-random generator, meaning that it generates each successive seed by multiplying the current seed by a constant multiplier and then adding a constant addend (and then masking to take the lower 48 bits, in this case). The quality of the generator is determined by the choice of multiplier and addend, but the ouput from all such generators can be easily predicted based on the current input and has a set period before it repeats itself (hence the recommendation not to use them in sensitive applications).
The reason you're seeing similar initial output from nextDouble given similar seeds is that, because the computation of the next integer only involves a multiplication and addition, the magnitude of the next integer is not much affected by differences in the lower bits. Calculation of the next double involves computing a large integer based on the seed and dividing it by another (constant) large integer, and the magnitude of the result is mostly affected by the magnitude of the integer.
Repeated calculations of the next seed will magnify the differences in the lower bits of the seed because of the repeated multiplication by the constant multiplier, and because the 48-bit mask throws out the highest bits each time, until eventually you see what looks like an even spread.
I wouldn't have called this an "issue".
And also, do you know if this problem only for the first value (or first few values), or if it is more general and using close seeds should be avoided?
Correlation patterns between successive numbers is a common problem with non-crypto PRNGs, and this is just one manifestation. The correlation (strictly auto-correlation) is inherent in the mathematics underlying the algorithm(s). If you want to understand that, you should probably start by reading the relevant part of Knuth's Art of Computer Programming Chapter 3.
If you need non-predictability you should use a (true) random seed for Random ... or let the system pick a "pretty random" one for you; e.g. using the no-args constructor. Or better still, use a real random number source or a crypto-quality PRNG instead of Random.
For the record:
The javadoc (Java 7) does not specify how Random() seeds itself.
The implementation of Random() on Java 7 for Linux, is seeded from the nanosecond clock, XORed with a 'uniquifier' sequence. The 'uniquifier' sequence is LCG which uses different multiplier, and whose state is static. This is intended to avoid auto-correlation of the seeds ...
This is a fairly typical behaviour for pseudo-random seeds - they aren't required to provide completely different random sequences, they only provide a guarantee that you can get the same sequence again if you use the same seed.
The behaviour happens because of the mathematical form of the PRNG - the Java one uses a linear congruential generator, so you are just seeing the results running the seed through one round of the linear congruential generator. This isn't enough to completely mix up all the bit patterns, hence you see similar results for similar seeds.
Your best strategy is probably just to use very different seeds - one option would be to obtain these by hashing the seed values that you are currently using.
By making random seeds (for instance, using some mathematical functions on System.currentTimeMillis() or System.nanoTime() for seed generation) you can get better random result. Also can look at here for more information

Uniform distribution with Random

I know if i use the Random generator from Java, generating numbers with nextInt, the numbers will be uniformly distributed. But what happens if I use 2 instances of Random, generating numbers with the both Random classes. The numbers will be uniformly distributed or not?
The numbers generated by each Random instance will be uniformly distributed, so if you combine the sequences of random numbers generated by both Random instances, they should be uniformly distributed too.
Note that even if the resulting distribution is uniform, you might want to pay attention to the seeds to avoid correlation between the output of the two generators. If you use the default no-arg constructor, the seeds should already be different. From the source code of java.util.Random:
private static volatile long seedUniquifier = 8682522807148012L;
public Random() { this(++seedUniquifier + System.nanoTime()); }
If you are setting the seed explicitly (by using the Random(long seed) constructor, or calling setSeed(long seed)), you'll need to take care of this yourself. One possible approach is to use a random number generator to produce the seeds for all other generators.
Well, if you seed both Random instances with the same value, you will definitely not get quality discrete uniform distribution. Consider the most basic case, which literally prints the exact same number twice (doesn't get much less random than that ...):
public class RngTest2 {
public static void main(String[] args) throws Exception {
long currentTime = System.currentTimeMillis();
Random r1 = new Random(currentTime);
Random r2 = new Random(currentTime);
System.out.println(r1.nextInt());
System.out.println(r2.nextInt());
}
}
But that's just a single iteration. What happens if we start cranking up the sample size?
Here is a scatter plot of a distribution from running two same-seeded RNGs side-by-side to generate 2000 numbers total:
And here is a distribution of running a single RNG to generate 2000 numbers total:
It seems pretty clear which approach produced higher quality discrete uniform distribution over this finite set.
Now almost everyone knows that seeding two RNGs with the same seed is a bad idea if you're looking for high quality randomness. But this case does make you stop and think: we have created a scenario where each RNG is independently emitting fairly high quality randomness, but when their output is combined it is notably lower in quality (less discrete.)

Categories