What is the best and fastest algorithm to generate cryptographically secure PRNG? My requirements is as follows.
In each run, I need to generate few millions of numbers with length of 10.
It should not allow one to predict future iterations by observing some number of iterations.
Numbers should be unique. No duplicates allowed.
Speed is also important since it will generate millions of numbers in each iteration.
I checked Mersenne Twister algorithm but it seems it is not cryptographically secure. They said, it could be predicted by checking 624 iterations.
P.S. It would be better if there is a Java Implementation.
Your unique requirement means that you cannot use any sort of RNG (my earlier comment was wrong), since random numbers will include repeats. Instead, use a cipher and encrypt the numbers: 0, 1, 2, 3, ... which will give guaranteed unique results. Since ciphers are reversible, each cyphertext decrypts back to the original plaintext, so the cyphertexts are a permutation of the plaintexts.
You also want numbers of "length ten". I assume this means ten decimal digits, [1,000,000,000 .. 9,999,999,999]. That means you need a cipher which works in the range [0 .. 8,999,999,999], and just add 1e9 to the output.
That is more complex. Either use the Hasty Pudding cipher, which can be set for any range of numbers you want, or roll your own Feistel cipher with its block size set to the next higher power of 2. If a number is out of range then re-encrypt it until it is in range. The Feistel option will be faster, but less secure. You can make it more secure by increasing the number of rounds at the cost of making it slower.
Use Linear feedback shift register. This algorithm is fast and secure.
Build register based on primitive characteristic polynomial of degree n. This will give the sequence of 2^n - 1 unique numbers. After that the sequence will repeat. Do not seed with zero. Some well-known stream cipher algorithms are based on LFSR, so you can extract and investigate implementation. For example LILI-128 (but its reference implementation in C)
Related
In a project I'm working on, I need to generate 16 character long unique IDs, consisting of 10 numbers plus 26 uppercase letters (only uppercase). They must be guaranteed to be universally unique, with zero chance of a repeat ever.
The IDs are not stored forever. An ID is thrown out of the database after a period of time and a new unique ID must be generated. The IDs can never repeat with the thrown out ones either.
So randomly generating 16 digits and checking against a list of previously generated IDs is not an option because there is no comprehensive list of previous IDs. Also, UUID will not work because the IDs must be 16 digits in length.
Right now I'm using 16-Digit Unique IDs, that are guaranteed to be universally unique every time they're generated (I'm using timestamps to generate them plus unique server ID). However, I need the IDs to be difficult to predict, and using timestamps makes them easy to predict.
So what I need to do is map the 16 digit numeric IDs that I have into the larger range of 10 digits + 26 letters without losing uniqueness. I need some sort of hashing function that maps from a smaller range to a larger range, guaranteeing a one-to-one mapping so that the unique IDs are guaranteed to stay unique after being mapped.
I have searched and so far have not found any hashing or mapping functions that are guaranteed to be collision-free, but one must exist if I'm mapping to a larger space. Any suggestions are appreciated.
Brandon Staggs wrote a good article on Implementing a Partial Serial Number Verification System. The examples are written in Delphi, but could be converted to other languages.
EDIT: This is an updated answer, as I misread the constraints on the final ID.
Here is a possible solution.
Let set:
UID16 = 16-digit unique ID
LUID = 16-symbol UID (using digits+letters)
SECRET = a secret string
HASH = some hash of SECRET+UID16
Now, you can compute:
LUID = BASE36(UID16) + SUBSTR(BASE36(HASH), 0, 5)
BASE36(UID16) will produce a 11-character string (because 16 / log10(36) ~= 10.28)
It is guaranteed to be unique because the original UID16 is fully included in the final ID. If you happen to get a hash collision with two different UID16, you'll still have two distinct LUID.
Yet, it is difficult to predict because the 5 other symbols are based on a non-predictable hash.
NB: you'll only get log2(36^5) ~= 26 bits of entropy on the hash part, which may or may not be enough depending on your security requirements. The less predictable the original UID16, the better.
One general solution to your problem is encryption. Because encryption is reversible it is always a one-to-one mapping from plaintext to cyphertext. If you encrypt the numbers [0, 1, 2, 3, ...] then you are guaranteed that the resulting cyphertexts are also unique, as long as you keep the same key, do not repeat a number or overflow the allowed size. You just need to keep track of the next number to encrypt, incrementing as needed, and check that it never overflows.
The problem then reduces to the size (in bits) of the encryption and how to present it as text. You say: "10 numbers plus 26 uppercase letters (only uppercase)." That is similar to Base32 encoding, which uses the digits 2, 3, 4, 5, 6, 7 and 26 letters. Not exactly what you require, but perhaps close enough and available off the shelf. 16 characters at 5 bits per Base32 character is 80 bits. You could use an 80 bit block cipher and convert the output to Base32. Either roll your own simple Feistel cipher or use Hasty Pudding cipher, which can be set for any block size. Do not roll your own if there is a major security requirement here. Your own Feistel cipher will give you uniqueness and obfuscation, not security. Hasty Pudding gives security as well.
If you really do need all 10 digits and 26 letters, then you are looking at a number in base 36. Work out the required bit size for 36^16 and proceed as before. Convert the cyphertext bits to a number expressed in base 36.
If you write your own cipher then it appears that you do not need the decryption function, which will save a little work.
You want to map from a space consisting of 1016 distinct values to one with 3616 values.
The ratio of the sizes of these two spaces is ~795,866,110.
Use BigDecimal and multiply each input value by the ratio to distribute the input keys equally over the output space. Then base-36 encode the resulting value.
Here's a sample of 16-digit values consisting of 11 digits "timestamp" and 5 digits server ID encoded using the above scheme.
Decimal ID Base-36 Encoding
---------------- ----------------
4156333000101044 -> EYNSC8L1QJD7MJDK
4156333000201044 -> EYNSC8LTY4Y8Y7A0
4156333000301044 -> EYNSC8MM5QJA9V6G
4156333000401044 -> EYNSC8NEDC4BLJ2W
4156333000501044 -> EYNSC8O6KXPCX6ZC
4156333000601044 -> EYNSC8OYSJAE8UVS
4156333000701044 -> EYNSC8PR04VFKIS8
4156333000801044 -> EYNSC8QJ7QGGW6OO
The first 11 digits form the "timestamp" and I calculated the result for a series incremented by 1; the last five digits are an arbitrary "server ID", in this case 01044.
Can you use BigInteger.isProbablePrime() to generate cryptographically secure primes? What certainty is necessary for them to be "secure"?
I do not hold a degree in crypto, so take this with a grain of salt.
You have two major areas of concern here:
Your primes need to be unpredictably random. This means that you need to use a source such as SecureRandom to generate your primes. No matter how sure of your primality, if they are predictable, the entire cryptosystem fails to meet its goal. If you are using the BigInteger(int bitLength, int certainty, Random rnd) constructor, you can pass in your SecureRandom as it subclasses Random.
Your potential primes need to be reasonably certain of being primes (I'm assuming that you are using an algorithm that relies on the hardness of factoring). If you get a probable prime, but an attacker can, with a good probability, factor it within 5 minutes because it had a factor that never got noticed by the primality test you ran, you are somewhat out of luck with your algorithm. Rabin-Miller is generally used, and this answer states that a certainty of 15 is sufficient for 32-bit integers. A value up to 40 is recommended, and anything beyond that is meaningless.
this is the way i generated a secure BigInteger for my cryptographic application.
Here's my code:
BigInteger b = new BigInteger(25, new SecureRandom());
Since you also need it for a crypto application, in my opinion, getting a BigInteger is this way is right.
Note: Remember that SecureRandom objects are preformance-wise costly. So You should not initialize them many times.
After reading comments, it worked out further
Here's a way which assures you more certainity of getting a prime number.
BigInteger b =BigInteger.probablePrime(25, new SecureRandom(););
As #hexafraction says, you need to use SecureRandom() to generate a cryptographic quality random number. The Javadoc says that the generated prime is 2^-100 secure. If you want more security (say to 2^-128 for AES equivalent security) then run more iterations of the Miller-Rabin test on it. Each iteration gives you an extra 2^-2 security, so fourteen iterations would get you to 2^-128.
I am looking for a function (hash function? not sure) that maps a set of (unique) numbers to another set of (unique) numbers. I had a look at perfect hashing -and the fact that it's impossible to do :( - and it seems quite close to what I am after.
In detail, I want the following:
Map any given 12 digit number to another number, 12 digit or more (I don't care about the resulting size but it must fit in a Java long).
Must be able to guarantee that the same 12 digit number maps to the same number every time.
Must be able to guarantee that a different 12 digit number always maps to a different number, i.e. there are no collisions in the result set.
Must be able to guarantee that the function is one way, which (in my mind anyway) means that you cannot compute the number you started from given the result of the function.
Something like minimal hashing function would not work in my case, as each number has to be computed on a different computer, meaning that the function itself has to guarantee these features without checking for collisions in the results and or having any centralized control of the result set.
An example of such a function would just be a simple:
take number, add 1, output number.
The only problem with that is that you can easily get first number just by subtracting one. I want it to be very difficult, or preferably impossible to get the previous number back.
Any thoughts ?
Translating the function from math to java I don't mind doing on my own. Unless you can suggest an already existing java library.
You are looking for Format-preserving encription algorithms.
It looks a cryptography problem for me.
If you want to map a String to other String and must be able to guarantee that is one way and has no collisions, you need encrypt the input String.
For example you can use DES.
If the output must be digits you can interpret the bytes of the output as hexadecimal and then convert to base 10.
What you want is a cipher. A good old fashioned symmetric cipher. Like AES.
Imagine for a second that instead of 12 digits, your numbers were 128 bits long. Let's say you set up an AES cipher with a key of your choosing, and use it to encrypt numbers. What is the outcome?
You will map any given 128-bit number to another number, of exactly 128 bits (AES's block size is 128 bits)
You can guarantee that the same 128-bit number maps to the same number every time (encryption is deterministic)
You can guarantee that two different 128-bit numbers always map to different numbers (encryption is reversible - there are no two plaintexts which encrypt to the same ciphertext, otherwise how could you decrypt them?)
You can guarantee that, for someone who doesn't have the key, the function is one way (encryption is not reversible without they key - it would be less useful if it was)
Now, 128 bits is bigger than you want, so AES is not the right cipher for you. All you need to do is choose a cipher which has a 12-digit block size.
There aren't any conventional ciphers which have a 12-digit block size. But it's actually pretty easy to construct one. You can use the Feistel construction to take a hash function and construct a block cipher. You can either build a binary cipher of a suitable size (40 bits in your case) and then use the "hasty pudding trick" to restrict its domain to 12 digits, or construct a cipher that works directly in decimal (more or less).
I wrote an answer to another question a while ago that explained this in some more detail; i even wrote some code to implement the idea, although looking at it now, i don't know how comprehensible it is. The key classes in it are TinyCipher, which implements a cipher with a block size of up to 32 bits (and could easily be extended up to 64 bits), and TrickCipher, which uses the hasty pudding trick to implement a cipher over an arbitrary-sized set (such as all 12-digit numbers).
Let's assume I have a reliably truly random source of random numbers, but it is very slow. It only give me a few hundreds of numbers every couple of hours.
Since I need way more than that I was thinking to use those few precious TRN I can get as seeds for java.util.Random (or scala.util.Random). I also always will pick a new one to generate the next random number.
So I guess my questions are:
Can the numbers I generate from those Random instance in Java be considered truly random since the seed is truly random?
Is there still a condition that is not met for true randomness?
If I keep on adding levels at what point will randomness be lost?
Or (as I thought when I came up with it) is truly random as long as the stream of seeds is?
I am assuming that nobody has intercepted the stream of seeds, but I do not plan to use those numbers for security purposes.
For a pseudo random generator like java.util.Random, the next generated number in the sequence becomes predictable given only a few numbers from the sequence, so you will loose your "true randomness" very fast. Better use one of the generators provided by java.security.SecureRandom - these are all strong random generators with an VERY long sequence length, which should be pretty hard to be predicted.
Our java Random gives uniformly spread random numbers. That is not true randomness, which may yield five times the same number.
Furthermore for every specific seed the same sequence is generated (intentionally). With 2^64 seeds in general irrelevant. (Note hackers could store the first ten numbers of every sequence; thereby rapidly catching up.)
So if you at large intervals use a truely random number as seed, you will get a uniform distribution during that interval. In effect not very different from not using the true randomizers.
Now combining random sequences might reduce the randomness. Maybe translating the true random number to bytes, and xor-ing every new random number with another byte, might give a wilder variance.
Please do not take my word only - I cannot guarantee the mathematical correctness of the above. A math/algorithmic forum might give more info.
When you take out more bits, than you have put in they are for sure no longer truly random. The break point may even occur earlier if the random number generator is bad. This can be seen by considering the entropy of the sequences. The seed value determines the sequence completely, so there are at most as many sequences as seed values. If they are all distinct, the entropy is the same as that of the seeds (which is essentially the number of seed bits, assuming the seed is truly random).
However, if different seeds lead to the same pseudo random sequence the entropy of the sequences will be lower than that of the seeds. If we cut off the sequences after n bits, the entropy may be even lower.
But why care if you don't use it for security purposes? Are you sure the pseudo random numbers are not good enough for your application?
I need/want to get random (well, not entirely) numbers to use for password generation.
What I do: Currently I am generating them with SecureRandom.
I am obtaining the object with
SecureRandom sec = SecureRandom.getInstance("SHA1PRNG", "SUN");
and then seeding it like this
sec.setSeed(seed);
Target: A (preferably fast) way to create random numbers, which are cryptographically at least a safe as the SHA1PRNG SecureRandom implementation. These need to be the same on different versions of the JRE and Android.
EDIT: The seed is generated from user input.
Problem: With SecureRandom.getInstance("SHA1PRNG", "SUN"); it fails like this:
java.security.NoSuchProviderException: SUN. Omitting , "SUN" produces random numbers, but those are different than the default (JRE 7) numbers.
Question: How can I achieve my Target?
You don't want it to be predictable: I want, because I need the predictability so that the same preconditions result in the same output. If they are not the same, its impossible hard to do what the user expects from the application.
EDIT: By predictable I mean that, when knowing a single byte (or a hundred) you should not be able to predict the next, but when you know the seed, you should be able to predict the first (and all others). Maybe another word is reproducible.
If anyone knows of a more intuitive way, please tell me!
I ended up isolating the Sha1Prng from the sun sources which guarantees reproducibility on all versions of Java and android. I needed to drop some important methods to ensure compatibility with android, as android does not have access to nio classes...
I recommend using UUID.randomUUID(), then splitting it into longs using getLeastSignificantBits() and getMostSignificantBits()
If you want predictable, they aren't random. That breaks your "Target" requirement of being "safe" and devolves into a simple shared secret between two servers.
You can get something that looks sort of random but is predicatable by using the characteristics of prime integers where you build a set of integers by starting with I (some specific integer) and add the first prime number and then modulo by the 2nd prime number. (In truth the first and second numbers only have to be relatively prime--meaning they have no common prime factors--not counting 1, in case you call that a factor.
If you repeat the process of adding and doing the modulo, you will get a set of numbers that you can repeatably reproduce and they are ordered in the sense that taking any member of the set, adding the first prime and doing the modulo by the 2nd prime, you will always get the same result.
Finally, if the 1st prime number is large relative to the second one, the sequence is not easily predictable by humans and seems sort of random.
For example, 1st prime = 7, 2nd prime = 5 (Note that this shows how it works but is not useful in real life)
Start with 2. Add 7 to get 9. Modulo 5 to get 4.
4 plus 7 = 11. Modulo 5 = 1.
Sequence is 2, 4, 1, 3, 0 and then it repeats.
Now for real life generation of numbers that seem random. The relatively prime numbers are 91193 and 65536. (I chose the 2nd one because it is a power of 2 so all modulo-ed values can fit in 16 bits.)
int first = 91193;
int modByLogicalAnd = 0xFFFF;
int nonRandomNumber = 2345; // Use something else
for (int i = 0; i < 1000 ; ++i) {
nonRandomNumber += first;
nonRandomNumber &= modByLogicalAnd;
// print it here
}
Each iteration generates 2 bytes of sort of random numbers. You can pack several of them into a buffer if you need larger random "strings".
And they are repeatable. Your user can pick the starting point and you can use any prime you want (or, in fact, any number without 2 as a factor).
BTW - Using a power of 2 as the 2nd number makes it more predictable.
Ignoring RNGs that use some physical input (random clock bits, electrical noise, etc) all software RNGs are predicable, given the same starting conditions. They are, after all, (hopefully) deterministic computer programs.
There are some algorithms that intentionally include the physical input (by, eg, sampling the computer clock occasionally) in attempt to prevent predictability, but those are (to my knowledge) the exception.
So any "conventional" RNG, given the same seed and implemented to the same specification, should produce the same sequence of "random" numbers. (This is why a computer RNG is more properly called a "pseudo-random number generator".)
The fact that an RNG can be seeded with a previously-used seed and reproduce a "known" sequence of numbers does not make the RNG any less secure than one where your are somehow prevented from seeding it (though it may be less secure than the fancy algorithms that reseed themselves at intervals). And the ability to do this -- to reproduce the same sequence again and again is not only extraordinarily useful in testing, it has some "real life" applications in encryption and other security applications. (In fact, an encryption algorithm is, in essence, simply a reproducible random number generator.)