I'm attempting to generate a random string of length X. I want to ensure that no two sequences are ever identically produced, even if this code is being run on multiple machines and at the same time.
"list" is the list of characters I'm using, "min" and "max" are the range of indexes, and "length" is the desired length of the String.
Currently, I am using System.nanoTime() as the seed for the Random object. While I realize it is improbable for two machines to run at the exact same time down to the nanosecond and produce the same output, I want to make this a foolproof solution.
How can I increase the randomness of this code so that no 2 strings will ever be the same without increasing the length of the string or increasing the number of characters available to be included in the string?
String seq = "";
for (int i = 0; i < length; i++)
{
    Random rnd = new Random();
    rnd.setSeed(System.nanoTime());
    int randomNum = rnd.nextInt((max - min) + 1) + min;
    seq = seq + list[randomNum];
}
return seq;
This is not possible in principle: if you generate a String of length n from k different characters, there are exactly k^n possible Strings. Once you have generated that many Strings, repetitions must occur; in practice they occur much earlier.
When running on a single machine, you might remember generated Strings and only output new ones, but on two machines without synchronization even this will not be possible.
Furthermore, taking system nanos into account will not help, since the same Strings might occur in different positions of the generated sequences.
But if you are asking that the sequence of the generated Strings must differ for two machines, your solution is probably fine, but ...
there might be a correlation between the boot times of the involved machines which can in turn increase the chance of a collision of System.nanoTime().
As the Javadoc for System.nanoTime() says, the accuracy of the returned long might be worse than the precision, i.e. not every possible long value might be returned actually.
BTW, new Random() would have the same effect as your code, since System.nanoTime() is used internally for seeding in this case.
You could use SecureRandom or the built-in UUID system.
The UUID library generates unique random strings for you.
(see https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html)
Using UUIDs:
import java.util.UUID;

public class GetRandString {
    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        String randString = uuid.toString();
        System.out.println("Random string: " + randString);
    }
}
Your second option is SecureRandom (https://docs.oracle.com/javase/8/docs/api/java/security/SecureRandom.html).
Another stackoverflow question answers how to generate a SecureRandom string: How to generate a SecureRandom string of length n in Java?
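As a rough illustration of the SecureRandom route, here is a minimal sketch. The class name, the method name and the alphabet are placeholders of mine, not taken from the linked question. Note that it only makes the output unpredictable; it cannot make repeats impossible for a fixed length and alphabet.

```java
import java.security.SecureRandom;

final class SecureRandomString {
    // Hypothetical alphabet; substitute your own "list" of characters.
    private static final char[] ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789".toCharArray();

    // One shared instance; SecureRandom seeds itself, no nanoTime seeding needed.
    private static final SecureRandom RNG = new SecureRandom();

    static String generate(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            // nextInt(bound) returns a uniformly distributed index in [0, bound)
            sb.append(ALPHABET[RNG.nextInt(ALPHABET.length)]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(16));
    }
}
```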
I would like to implement logic that, based on a provided string, generates two random numbers between 1 and 10.
For example, given a string like Johnsen, it should generate two numbers like 1 and 3, and the next time it is given the same string it should produce the same numbers.
Need help to develop this algorithm or logic.
Java has random number generator support via the java.util.Random class. This class 'works' by having a seed value, then giving you some random data based on this seed value, which then updates the seed value to something new.
This pragmatically means:
2 instances of j.u.Random with the same seed value will produce the same sequence of values if you invoke the same sequence of calls on it to give you random data.
But, seed values are of type long - 64 bits worth of data.
Thus, to do what you want, you need to write an algorithm that turns any String into a long.
Given that long, you simply make an instance of j.u.Random with that long as seed, using the new Random(seedValue) constructor.
So that just leaves: How do I turn a string into a long?
Easy way
The simplest answer is to invoke hashCode() on them. But, note, hashcodes only have 32 bits of info (they are int, not long), so this doesn't cover all possible seed values. This is unlikely to matter unless you're doing this for crypto purposes. If you ARE, then you need to stop what you are doing and do a lot more research, because it is extremely easy to mess up and have working code that seems to test fine, but which is easy to hack. You don't want that. For starters, you'd want SecureRandom instead, but that's just the tip of the iceberg.
Harder way
Hashing algorithms exist that turn arbitrary data into fixed size hash representations. The hashCode algorithm of string [A] only makes 32-bits worth of hash, and [B] is not cryptographically secure: If you task me to make a string that hashes to a provided value I can trivially do so; a cryptographically secure hash has the property that I can't just cook you up a string that hashes to a desired value.
You can search the web for hashing strings or byte arrays (you can turn a string into one with str.getBytes(StandardCharsets.UTF_8)).
You can 'collapse' a byte array containing a hash into a long also quite easily - just take any 8 bytes in that hash and use them to construct a long. "Turn 8 bytes into a long" also has tons of tutorials if you search the web for it.
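For what it's worth, a minimal sketch of the harder way could look like this. It assumes SHA-256 as the hash and takes the first 8 bytes of the digest; the class and method names are made up for the example. Keep in mind that java.util.Random only uses 48 bits of the seed, so this does not make Random itself any stronger.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;

final class StringSeed {
    static long seedFrom(String key) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] hash = sha256.digest(key.getBytes(StandardCharsets.UTF_8));
            // Interpret the first 8 bytes of the 32-byte digest as a big-endian long.
            return ByteBuffer.wrap(hash).getLong();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Random rnd = new Random(seedFrom("Johnsen"));
        System.out.println("First number: " + (rnd.nextInt(10) + 1));
        System.out.println("Second number: " + (rnd.nextInt(10) + 1));
    }
}
```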
I assume the easy way is good enough for this exercise, however.
Thus:
String key = ...;
Random rnd = new Random(key.hashCode());
int number1 = rnd.nextInt(10) + 1;
int number2 = rnd.nextInt(10) + 1;
System.out.println("First number: " + number1);
System.out.println("Second number: " + number2);
You could get the hashcode of the string, then use that to seed a random number generator. Use the RNG to get numbers in the range 1 - 10.
I need to generate a reservation code of 6 alpha numeric characters, that is random and unique in java.
Tried using UUID.randomUUID().toString(). However, the ID is too long and the requirement demands that it be only 6 characters.
What approaches are possible to achieve this?
Just to clarify, (Since this question is getting marked as duplicate).
The other solutions I've found are simply generating random characters, which is not enough in this case. I need to reasonably ensure that a random code is not generated again.
Consider using the hashids library to generate salted hashes of integers (your database ids or other random integers which is probably better).
http://hashids.org/java/
Hashids hashids = new Hashids("this is my salt",6);
String id = hashids.encode(1, 2, 3);
long[] numbers = hashids.decode(id);
You have 36 characters in the alphanumeric character set (0-9 digits + a-z letters). With 6 places you get 36^6 = 2,176,782,336 different codes, which is slightly more than 2^31.
Therefore you can use Unix time to create a unique ID. However, you must ensure that no two IDs are generated within the same second.
If you cannot guarantee that, you end up with a (synchronized) counter within your class. Also, if you want to survive a JVM restart, you should save the current value (e.g. to a database, file, etc. whatever options you have).
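A minimal sketch of the counter variant, assuming the 36-character alphabet above. The class name is hypothetical, and persisting the counter across restarts (database, file) is left out.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CounterCode {
    private static final long LIMIT = 2_176_782_336L; // 36^6 distinct codes
    private final AtomicLong counter;

    // 'start' would normally be the last value you persisted, plus one.
    CounterCode(long start) {
        this.counter = new AtomicLong(start);
    }

    String next() {
        long value = counter.getAndIncrement(); // thread-safe, never hands out a value twice
        if (value < 0 || value >= LIMIT) {
            throw new IllegalStateException("6-character code space exhausted");
        }
        String digits = Long.toString(value, 36); // radix 36 uses 0-9 then a-z
        StringBuilder sb = new StringBuilder(6);
        for (int i = digits.length(); i < 6; i++) {
            sb.append('0'); // left-pad to a fixed width of 6
        }
        return sb.append(digits).toString();
    }
}
```

The codes are unique but obviously sequential, so they are easy to guess; whether that matters depends on what the reservation code protects.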
Despite their name, UUIDs are not guaranteed to be unique; a collision among 128-bit values is simply extremely unlikely. With 6 characters (less than 32 bits) you are very likely to get a collision if you just hash something or generate a random string.
If the uniqueness constraint is necessary, then you need to:
1. Generate a random 6-character string.
2. Check whether you generated that string before by querying your database.
3. If you generated it before, go back to 1.
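The generate-and-check loop can be sketched roughly as follows. The in-memory Set is only a stand-in for the database lookup, and all names are hypothetical. With a real database you would more likely rely on a unique constraint and retry on a constraint violation, so that two servers cannot both insert the same code.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

final class UniqueCodeGenerator {
    private static final char[] ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyz".toCharArray();

    private final SecureRandom rng = new SecureRandom();
    // Stand-in for the database table of codes that were already issued.
    private final Set<String> issued = new HashSet<>();

    synchronized String next() {
        while (true) {
            // Step 1: generate a random 6-character string.
            char[] buf = new char[6];
            for (int i = 0; i < buf.length; i++) {
                buf[i] = ALPHABET[rng.nextInt(ALPHABET.length)];
            }
            String code = new String(buf);
            // Steps 2 and 3: Set.add returns false if the code was seen before,
            // in which case we loop and try again.
            if (issued.add(code)) {
                return code;
            }
        }
    }
}
```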
Another way would be to use a pseudorandom permutation (PRP) of size 32 bit. Block ciphers are modeled as PRP functions, but there aren't many that support 32 bit block sizes. Some are Speck by the NSA and the Hasty Pudding Cipher.
With a PRP you could for example take an already unique value like your database primary key and encrypt it with the block cipher. If the input is not bigger than 32 bit then the output will still be unique.
Then you would run Base62 (or at least Base 41) over the output and remove the padding characters to get a 6 character output.
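To make the idea concrete, here is a toy sketch. It is NOT Speck or the Hasty Pudding Cipher and it is not cryptographically secure: it is a small 4-round Feistel network over 32 bits, with round keys and a round function that I made up for illustration. What it does share with a real block cipher is that it is a permutation, so distinct inputs always give distinct outputs. The output is written as exactly 6 Base62 digits, which is enough because 62^6 > 2^32.

```java
final class PermutedId {
    private static final char[] BASE62 =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".toCharArray();

    // Hypothetical round keys; a real cipher would derive these from a secret key.
    private static final int[] ROUND_KEYS = {0x3A5C, 0x91E7, 0x5D2B, 0xC46F};

    // Arbitrary 16-bit mixing function. A Feistel network is invertible
    // no matter what this function computes.
    private static int round(int half, int key) {
        int x = (half ^ key) * 0x9E3B + 0x7F4A;
        return (x ^ (x >>> 7)) & 0xFFFF;
    }

    // Bijective mapping of a 32-bit value onto another 32-bit value.
    static int permute(int id) {
        int left = id >>> 16;
        int right = id & 0xFFFF;
        for (int key : ROUND_KEYS) {
            int next = left ^ round(right, key);
            left = right;
            right = next;
        }
        return (left << 16) | right;
    }

    // Encodes the permuted value as a fixed-width, 6-character Base62 string.
    static String encode(int id) {
        long value = permute(id) & 0xFFFFFFFFL; // treat the int as unsigned
        char[] out = new char[6];
        for (int i = 5; i >= 0; i--) {
            out[i] = BASE62[(int) (value % 62)];
            value /= 62;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        for (int id = 1; id <= 5; id++) {
            System.out.println(id + " -> " + encode(id));
        }
    }
}
```

Feeding in the database primary key gives codes that look scrambled but can never collide, without having to query for previously issued codes.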
If you take a substring, that value may not be unique.
For more info, please see the following similar question:
Generating 8-character only UUIDs
Let's say your corpus is the collection of alphanumeric characters, a-zA-Z0-9.
char[] corpus = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".toCharArray();
We can use SecureRandom to generate seed bytes; it asks the OS for entropy, and the exact source depends on the OS. The trick here is to keep a uniform distribution: each byte has 256 possible values, but we only need 62, so I propose rejection sampling.
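int generated = 0;
int desired = 6;
char[] result = new char[desired];
while (generated < desired) {
    byte[] ran = SecureRandom.getSeed(desired);
    for (byte b : ran) {
        // Rejection sampling: keep only bytes that are valid corpus indexes (0..61);
        // negative bytes and values >= 62 are discarded.
        if (b >= 0 && b < corpus.length) {
            result[generated] = corpus[b];
            generated += 1;
            if (generated == desired) break;
        }
    }
}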
Improvements could include, smarter wrapping of generated values.
When can we expect a repeat? Let's stick with the corpus of 62 and assume that the distribution is completely random. In that case we have the birthday problem, with N = 62^6 possibilities. We want to find the number of generated codes n at which the chance of a repeat is around 10%.
p(n) = 1 - N! / (N^n (N - n)!)
Using the approximation given on the Wikipedia page:
n ≈ sqrt(-2N ln(0.9))
That gives about 109,000 codes for a 10% chance of a repeat. For a 0.1% chance it would take roughly 10,700 codes.
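If you want to plug in other corpus sizes or probabilities, the approximation is a one-liner. This is just a sketch of the formula above, n ≈ sqrt(-2N ln(1 - p)); the class and method names are mine.

```java
final class BirthdayBound {
    // Approximate number of draws from 'space' equally likely values
    // after which a repeat has occurred with probability p.
    static double drawsForRepeat(double space, double p) {
        return Math.sqrt(-2.0 * space * Math.log(1.0 - p));
    }

    public static void main(String[] args) {
        double space = Math.pow(62, 6); // N = 62^6 possible codes
        System.out.printf("10%% chance of a repeat after about %.0f codes%n",
                drawsForRepeat(space, 0.10));
        System.out.printf("0.1%% chance of a repeat after about %.0f codes%n",
                drawsForRepeat(space, 0.001));
    }
}
```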
You can try taking a substring of your generated UUID.
String uuid = UUID.randomUUID().toString();
// substring(0, 6) yields the first 6 characters, matching the 6-character requirement
System.out.println("uuid = " + uuid.substring(0, 6));
I have a use case which involves generating a number which a user enters into a website to link a transaction to their account.
So I have the following code which generates a random 12 digit number:
public String getRedemptionCode(long utid, long userId) {
    long nano = System.nanoTime();
    long temp = nano + utid + 1232;
    long redemptionCode = temp + userId + 5465;
    if (redemptionCode < 0) {
        redemptionCode = Math.abs(redemptionCode);
    }
    String redemptionCodeFinal = StringUtils.rightPad(String.valueOf(redemptionCode), 12, '1');
    redemptionCodeFinal = redemptionCodeFinal.substring(0, 12);
    return redemptionCodeFinal;
}
This method takes in two params which are generated by a DB.
What I need to understand is:
Is this random? I have a test which ran this method 1 million times and it always seems to be random.
Can I cut this down to 8 characters?
No, it is neither unique nor random.
It is not "random" in the sense of highly entropic / uncorrelated with other values.
The only source of non-determinism is System.nanoTime, so all the entropy comes from a few of the least significant bits of the system clock. Simply adding numbers like 1232 and 5465 does not make the result less correlated with subsequent results.
Is this random? I have a test which ran this method 1 million times and it always seem to be random.
If this code is used in multiple threads on the same machine, or on multiple machines with synced clocks, you will see duplicates more quickly.
Since there is low entropy, you are likely to see duplicates by random chance fairly quickly. Math.se addresses the likelihood depending on how many of these you generate.
Can I cut this down to 8 characters?
Only if you don't lose entropy. Consider two ways of truncating a timestamp:
long time = ...; // Least significant bits have randomness.
String s = "" + time;
// Cut off the right-most, most entropic digits
String bad = s.substring(0, 8);
// Cut off the left-most, least entropic digits
String better = s.substring(s.length() - 8);
Since it is a straightforward calculation from an increasing counter, an attacker who can try multiple times can predict the value produced in a way that they would not be able to had you used a crypto-strong random number generator like java.util.SecureRandom.
Is this random?
You are asking, is your function based on System.nanoTime() a random number generator (RNG)?
An RNG is, by definition, a generator that produces numbers lacking any pattern.
So, are numbers returned from your function without any pattern?
No, they have an easily-observable pattern, because they depend on System.nanoTime() (system clock).
Can I cut this down to 8 characters?
Yes, you can, but it's still not random. Adding or padding won't help either.
Use SecureRandom instead.
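A minimal sketch of what that could look like for a numeric code. The class and method names are hypothetical. This makes the code unpredictable, but it does not by itself make it unique, so you would still check new codes against the ones already stored.

```java
import java.security.SecureRandom;

final class RedemptionCodes {
    private static final SecureRandom RNG = new SecureRandom();

    // Builds a code of the requested number of decimal digits (e.g. 12 or 8).
    static String generate(int digits) {
        StringBuilder sb = new StringBuilder(digits);
        for (int i = 0; i < digits; i++) {
            sb.append((char) ('0' + RNG.nextInt(10))); // uniform digit 0-9
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(12));
        System.out.println(generate(8));
    }
}
```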
Namely, it will never generate more than 16 even numbers in a row with some specific upperBound parameters:
Random random = new Random();
int c = 0;
int max = 17;
int upperBound = 18;
while (c <= max) {
    int nextInt = random.nextInt(upperBound);
    boolean even = nextInt % 2 == 0;
    if (even) {
        c++;
    } else {
        c = 0;
    }
}
In this example the code will loop forever, while when upperBound is, for example, 16, it terminates quickly.
What can be the reason of this behavior? There are some notes in the method's javadoc, but I failed to understand them.
UPD1: The code seems to terminate with odd upper bounds, but may get stuck with even ones.
UPD2:
I modified the code to capture the statistics of c as suggested in the comments:
Random random = new Random();
int c = 0;
// Note: 1 << 58 is an int shift, and Java uses only the low 5 bits of the
// shift distance, so this evaluates to 1 << 26 (67,108,864 trials).
long trials = 1 << 58;
int max = 20;
int[] stat = new int[max + 1];
while (trials > 0) {
    while (c <= max && trials > 0) {
        int nextInt = random.nextInt(18);
        boolean even = nextInt % 2 == 0;
        if (even) {
            c++;
        } else {
            stat[c] = stat[c] + 1;
            c = 0;
        }
        trials--;
    }
}
System.out.println(Arrays.toString(stat));
Now it tries to reach 20 evens in a row, to get better statistics, and the upperBound is still 18.
The results turned out to be more than surprising:
[16776448, 8386560, 4195328, 2104576, 1044736,
518144, 264704, 132096, 68864, 29952, 15104,
12032, 1792, 3072, 256, 512, 0, 256, 0, 0]
At first it decreases by a factor of 2, as expected, but note the last line! Here it goes crazy and the captured statistics seem to be completely weird.
Here is a bar plot in log scale:
How c gets the value 17 256 times is yet another mystery
http://docs.oracle.com/javase/6/docs/api/java/util/Random.html:
An instance of this class is used to generate a stream of
pseudorandom numbers. The class uses a 48-bit seed, which is modified
using a linear congruential formula. (See Donald Knuth, The Art of
Computer Programming, Volume 3, Section 3.2.1.)
If two instances of Random are created with the same seed, and the
same sequence of method calls is made for each, they will generate and
return identical sequences of numbers. [...]
It is a pseudo-random number generator. This means that you are not actually rolling dice but rather using a formula to calculate the next "random" value based on the current random value. To create the illusion of randomisation a seed is used. The seed is the first value used with the formula to generate the random value.
Apparently Java's Random implementation (the "formula") does not generate more than 16 even numbers in a row.
This behaviour is the reason why the seed is usually initialized with the time. Depending on when you start your program you will get different results.
The benefits of this approach are that you can generate repeatable results. If you have a game generating "random" maps, you can remember the seed to regenerate the same map if you want to play it again, for instance.
For true random numbers some operating systems provide special devices that generate "randomness" from external events like mouse movements or network traffic. However, I do not know how to tap into those with Java.
From the Javadoc for SecureRandom:
Many SecureRandom implementations are in the form of a pseudo-random
number generator (PRNG), which means they use a deterministic
algorithm to produce a pseudo-random sequence from a true random seed.
Other implementations may produce true random numbers, and yet others
may use a combination of both techniques.
Note that SecureRandom does NOT guarantee true random numbers either.
Why changing the seed does not help
Let's assume random numbers only have the range 0-7.
Now we use the following formula to generate the next "random" number:
next = (current + 3) % 8
the sequence becomes 0 3 6 1 4 7 2 5.
If you now take the seed 3 all you do is to change the starting point.
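The toy formula above is easy to try out. This is only a sketch of that example generator, not of java.util.Random; the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyGenerator {
    // Returns 'count' values of the toy generator, starting with the seed itself.
    static List<Integer> sequence(int seed, int count) {
        List<Integer> out = new ArrayList<>();
        int current = seed;
        for (int i = 0; i < count; i++) {
            out.add(current);
            current = (current + 3) % 8; // next = (current + 3) % 8
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sequence(0, 8)); // [0, 3, 6, 1, 4, 7, 2, 5]
        System.out.println(sequence(3, 8)); // [3, 6, 1, 4, 7, 2, 5, 0]
    }
}
```

Both runs walk the same cycle; the seed only decides where the walk begins.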
In this simple implementation that only uses the previous value, every value may occur only once before the sequence wraps around and starts again. Otherwise there would be an unreachable part.
E.g. imagine the sequence 0 3 6 1 3 4 7 2 5. The numbers 0, 4, 7, 2 and 5 would never be generated more than once (depending on the seed they might never be generated at all), since once the sequence reaches 3 it loops: 3, 6, 1, 3, 6, 1, ...
Simplified pseudo random number generators can be thought of as a permutation of all numbers in the range, with the seed used as a starting point. If they are more advanced you would have to replace the permutation with a list in which the same numbers might occur multiple times.
More complex generators can have an internal state, allowing the same number to occur several times in the sequence, since the state lets the generator know where to continue.
The implementation of Random uses a simple linear congruential formula. Such formulae have a natural periodicity and all sorts of non-random patterns in the sequence they generate.
What you are seeing is an artefact of one of these patterns ... nothing deliberate. It is not an example of bias. Rather it is an example of auto-correlation.
If you need better (more "random") numbers, then you need to use SecureRandom rather than Random.
And the answer to "why was it implemented that way" is ... performance. A call to Random.nextInt can be completed in tens or hundreds of clock cycles. A call to SecureRandom is likely to be at least 2 orders of magnitude slower, possibly more.
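For reference, the linear congruential step itself is small. The sketch below follows the formula given in the Javadoc of java.util.Random (48-bit state, multiplier 0x5DEECE66D, increment 0xB); the class name is mine, and only nextInt() is reproduced, not the bounded nextInt(n) from the question.

```java
import java.util.Random;

final class JavaLcg {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long INCREMENT = 0xBL;
    private static final long MASK = (1L << 48) - 1; // keep 48 bits of state

    private long state;

    JavaLcg(long seed) {
        // Same initial scrambling of the seed as documented for Random.setSeed.
        state = (seed ^ MULTIPLIER) & MASK;
    }

    // One LCG step; returns the top 'bits' bits of the new 48-bit state.
    int next(int bits) {
        state = (state * MULTIPLIER + INCREMENT) & MASK;
        return (int) (state >>> (48 - bits));
    }

    int nextInt() {
        return next(32);
    }

    public static void main(String[] args) {
        JavaLcg mine = new JavaLcg(42);
        Random real = new Random(42);
        for (int i = 0; i < 5; i++) {
            System.out.println(mine.nextInt() + " " + real.nextInt());
        }
    }
}
```

Each output is a fixed function of the previous 48-bit state, which is where the periodicity and the correlations between successive values come from.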
For portability, Java specifies that implementations must use the inferior LCG method for java.util.Random. This method is completely unacceptable for any serious use of random numbers like complex simulations or Monte Carlo methods. Use an add-on library with a better PRNG algorithm, like Marsaglia's MWC or KISS. Mersenne Twister and Lagged Fibonacci Generators are often OK as well.
I'm sure there are Java libraries for these algorithms. I have a C library with Java bindings if that will work for you: ojrandlib.
I cannot understand the following line. Can anybody provide an example for the statement below?
If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers
Since you asked for an example:
import java.util.Random;

public class RandomTest {
    public static void main(String[] s) {
        Random rnd1 = new Random(42);
        Random rnd2 = new Random(42);
        System.out.println(rnd1.nextInt(100)+" - "+rnd2.nextInt(100));
        System.out.println(rnd1.nextInt()+" - "+rnd2.nextInt());
        System.out.println(rnd1.nextDouble()+" - "+rnd2.nextDouble());
        System.out.println(rnd1.nextLong()+" - "+rnd2.nextLong());
    }
}
Both Random instances will always have the same output, no matter how often you run it, no matter what platform or what Java version you use:
30 - 30
234785527 - 234785527
0.6832234717598454 - 0.6832234717598454
5694868678511409995 - 5694868678511409995
The random generator is deterministic. Given the same input to Random and the same usage of the methods in Random, the sequence of pseudo-random numbers returned to your program will be the same even in different runs on different machines.
This is why it is pseudo-random - the numbers returned behave statistically like random numbers except they can be reliably predicted. True random numbers are unpredictable.
The Random class is basically a pseudorandom number generator (also known as a deterministic random bit generator) that generates a sequence of numbers approximating the properties of random numbers. It is not truly random but deterministic, because the sequence is completely determined by a small initial state in the generator (such as the seed). Because of this deterministic nature, you get identical results if the seeds and the sequence of method calls are identical on 2 generators.
The numbers are not really random: given the same starting conditions (the seed) and the same sequence of operations, the same sequence of numbers will be generated. This is why it would not be a good idea to use the basic Random class as part of any cryptographic or security-related code, since it may be possible for an attacker to figure out which sequence is being generated and predict future numbers.
For a random number generator that emits non-deterministic values, take a look at SecureRandom.
See Random number generation, Computational methods on wikipedia for more info.
This means that when you create the Random object (e.g. at the start of your program), you will probably want to start with a new seed. Mostly people choose some time related value, such as the number of ticks.
The fact that the number sequences are the same given the same seed is actually very convenient if you want to debug your program: make sure you log the seed value and if something is wrong you can restart the program in the debugger using that same seed value. This means you can replay the scenario exactly. This would be impossible if you would (could) use a true random number generator.
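A small sketch of that debugging pattern: log the seed on every run, and accept it as an argument to replay a run. The class name and the dice-rolling "scenario" are invented for the example.

```java
import java.util.Arrays;
import java.util.Random;

final class ReplayableRun {
    // The "scenario": everything random in it is driven by one seeded Random.
    static int[] simulate(long seed, int steps) {
        Random rnd = new Random(seed);
        int[] rolls = new int[steps];
        for (int i = 0; i < steps; i++) {
            rolls[i] = rnd.nextInt(6) + 1; // a die roll, 1..6
        }
        return rolls;
    }

    public static void main(String[] args) {
        // Replay with the seed given on the command line, otherwise pick a fresh one.
        long seed = args.length > 0 ? Long.parseLong(args[0]) : System.nanoTime();
        System.out.println("seed = " + seed); // log it so this run can be reproduced
        System.out.println(Arrays.toString(simulate(seed, 10)));
    }
}
```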
With the same seed value, separate instances of Random will return/generate the same sequence of random numbers; more on this here:
http://www.particle.kth.se/~lindsey/JavaCourse/Book/Part1/Tech/Chapter04/javaRandNums.html
Ruby Example:
class LCG
  def initialize(seed = Time.now.to_i, a = 2416, b = 374441, m = 1771075)
    @x, @a, @b, @m = seed % m, a, b, m
  end

  def next
    @x = (@a * @x + @b) % @m
  end
end
irb(main):004:0> time = Time.now.to_i
=> 1282908389
irb(main):005:0> r = LCG.new(time)
=> #<LCG:0x0000010094f578 @x=650089, @a=2416, @b=374441, @m=1771075>
irb(main):006:0> r.next
=> 45940
irb(main):007:0> r.next
=> 1558831
irb(main):008:0> r.next
=> 1204687
irb(main):009:0> f = LCG.new(time)
=> #<LCG:0x0000010084cb28 @x=650089, @a=2416, @b=374441, @m=1771075>
irb(main):010:0> f.next
=> 45940
irb(main):011:0> f.next
=> 1558831
irb(main):012:0> f.next
=> 1204687
Based on the values a/b/m, the result will be the same for a given seed. This can be used to generate the same "random" number in two places and both sides can depend on getting the same result. This can be useful for encryption; although obviously, this algorithm isn't cryptographically secure.
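Since the rest of this thread is about Java, here is a rough Java port of the same toy generator, using the same a, b and m values as the Ruby example. The class name is mine, and like the original it is for illustration only.

```java
final class SimpleLcg {
    private final long a;
    private final long b;
    private final long m;
    private long x;

    SimpleLcg(long seed) {
        this(seed, 2416, 374441, 1771075);
    }

    SimpleLcg(long seed, long a, long b, long m) {
        this.a = a;
        this.b = b;
        this.m = m;
        // floorMod mirrors Ruby's % for negative seeds as well.
        this.x = Math.floorMod(seed, m);
    }

    long next() {
        x = (a * x + b) % m;
        return x;
    }

    public static void main(String[] args) {
        SimpleLcg r = new SimpleLcg(1282908389L);
        System.out.println(r.next()); // 45940
        System.out.println(r.next()); // 1558831
        System.out.println(r.next()); // 1204687
    }
}
```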