Deterministic shuffle in Objective C - java

This code in Java is the implementation of Knuth's shuffle, but a deterministic one, controllable by the seed to the random number generator.
public String shuffleString(String data, long shuffleSeed) {
if(shuffleSeed!=0) {
Random rnd = new Random(shuffleSeed);
StringBuilder sb = new StringBuilder(data);
int n = data.length();
while(n>1) {
int k = rnd.nextInt(n--);
char t = sb.charAt(n);
sb.setCharAt(n, sb.charAt(k));
sb.setCharAt(k, t);
}
return sb.toString();
}
else {
return data;
}
}
How can I implement a deterministic shuffle in Objective C that outputs the same shuffle order given the same seed? I am using srandom(_shuffleSeed) and random()%(n--), knowing that arc4random is better but cannot be seeded.
- (NSString*) shuffleString:(NSString*) data withShuffleSeed:(int) shuffleSeed {
if(shuffleSeed!=0) {
srandom(_shuffleSeed);
NSMutableString *result = [[NSMutableString alloc] initWithString:data];
unsigned long n = data.length;
while(n>1) {
unsigned long k = random()%(n--);
unichar t = [result characterAtIndex:n];
NSRange r1 = {n,1};
[result replaceCharactersInRange:r1 withString:[NSString stringWithFormat:@"%c", [result characterAtIndex:k]]];
NSRange r2 = {k,1};
[result replaceCharactersInRange:r2 withString:[NSString stringWithFormat:@"%c", t]];
}
return result;
}
else {
return data;
}
}
Currently, the two shuffle methods do not generate the same result for the same input parameters. I am sure I am missing something!

There are many pseudo-random number generating algorithms that use a seed. You can't assume that the one in the Java standard library uses exactly the same algorithm as srandom/random in Objective C.
The Java random generator uses:
The class uses a 48-bit seed, which is modified using a linear
congruential formula. (See Donald Knuth, The Art of Computer
Programming, Volume 3, Section 3.2.1.)
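The Javadoc goes further than this summary: it spells out the algorithms behind next() and nextInt(int) and requires every Java implementation to use them, so the sequence for a given seed does not change between versions.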
Your options are:
Take the Java source and convert it to Objective-C (or hope that somebody else has done this before). Note that the Java source is licensed under the GPL, or a restrictive Oracle license. If you take the version under the GPL license, that has an impact on the license that you can use for your own code.
Search for the source of the random generator in Objective-C and convert that to Java. (Which may also have license restrictions, and the source may not be available). Or maybe the algorithm is more properly specified so you can implement it in Java solely from the documentation.
Find another random generator that has a Java and an Objective-C implementation that give identical results (or write one).
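For the first and third options, the generator documented for java.util.Random is small enough to re-implement by hand. The following is a minimal sketch of it, written in Java so it can be checked against the original. The names PortableLcg and shuffle are made up for illustration; the constants and the nextInt steps follow the Javadoc. An Objective-C port would need 64-bit integers with the same mask, and 32-bit wrap-around in the rejection test.

```java
// Re-implementation of the generator documented for java.util.Random.
// PortableLcg is a hypothetical name; only the constants come from the Javadoc.
final class PortableLcg {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long INCREMENT = 0xBL;
    private static final long MASK = (1L << 48) - 1; // keep 48 bits of state

    private long state;

    PortableLcg(long seed) {
        // The seed is scrambled once, as Random(long) does.
        state = (seed ^ MULTIPLIER) & MASK;
    }

    // Advances the state and returns its top `bits` bits.
    private int next(int bits) {
        state = (state * MULTIPLIER + INCREMENT) & MASK;
        return (int) (state >>> (48 - bits));
    }

    // Uniform value in [0, bound), following the documented nextInt(int) steps.
    int nextInt(int bound) {
        if (bound <= 0) {
            throw new IllegalArgumentException("bound must be positive");
        }
        if ((bound & -bound) == bound) { // bound is a power of two
            return (int) ((bound * (long) next(31)) >> 31);
        }
        int bits;
        int val;
        do {
            bits = next(31);
            val = bits % bound;
            // Relies on 32-bit wrap-around to reject the biased tail.
        } while (bits - val + (bound - 1) < 0);
        return val;
    }

    // Same Knuth shuffle as in the question, driven by the portable generator.
    static String shuffle(String data, long seed) {
        if (seed == 0) {
            return data;
        }
        PortableLcg rnd = new PortableLcg(seed);
        StringBuilder sb = new StringBuilder(data);
        int n = data.length();
        while (n > 1) {
            int k = rnd.nextInt(n--);
            char t = sb.charAt(n);
            sb.setCharAt(n, sb.charAt(k));
            sb.setCharAt(k, t);
        }
        return sb.toString();
    }
}
```

Because every step is plain integer arithmetic, the same class can be transcribed line by line into Objective-C and compared against a few known seeds.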

Related

How can I maintain probability across multiple executions in Java

Firstly I am not the greatest with Math, so please excuse any ignorance relating to that. I am trying to maintain probability based randomness across multiple executions but I am failing. I have this input in a JSONObject
{
"option1": 25,
"option2":25,
"option3" :10,
"option4" :40
}
This is my function that selects a value from the above JSONObject based on the probability assigned:
public static String selectRandomoptions(JSONObject options) {
String selectedOption = null;
if (options != null) {
int maxChance = 0;
for (String option : options.keySet()) {
maxChance += options.getInt(option);
}
if (maxChance < 100) {
maxChance = 100;
}
Random r = new Random();
Integer randomValue = r.nextInt(maxChance);
int chance = 0;
for (String option : options.keySet()) {
chance += options.getInt(option);
if (chance > randomValue) {
selectedOption = option.toLowerCase();
break;
}
}
}
return selectedOption;
}
The function behaves within a reasonable error margin if I call it many times in a single execution (tested with 100+ calls). The problem is that I run it every hour to generate sample data in an event-driven app to verify our analytics process/data, and we need the output to be somewhat predictable, at least within a reasonable margin.
Has anyone any idea how I might approach this? I would rather not have to persist anything but I am not opposed to it if it makes sense or reduces complexity/time.
The values returned by Random.nextInt() are uniformly distributed, so that shouldn't be a problem.
If you would like to make random results repeatable, then you may want to use Random with a seed.
Rather than creating a new Random() object each time you want a new random number, create the Random object once per run and call nextInt() on that same object every time you need a value.
Looking at the documentation of Random() constructor,
This constructor sets the seed of the random number generator to a
value very likely to be distinct from any other invocation of this
constructor.
It only guarantees a seed that is very likely to differ between invocations, which is a somewhat weaker contract than the one you get from nextInt().
If you want to get the same sequence of numbers on each run, use the Random(long seed) or the setSeed(long seed) method of the random object. Both these methods set the seed of the generator. If you used the same seed for each invocation it's guaranteed that you will get the same sequence of numbers from the generator.
Random.setSeed(long).
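The advice above can be sketched as follows. This is only an illustration under some assumptions: the class name WeightedPicker is hypothetical, the weights are passed as a plain Map instead of a JSONObject to keep the example self-contained, and the seed would have to come from wherever the hourly job keeps its configuration. An insertion-ordered map is used because a JSONObject is an unordered collection, and repeatable results need a fixed iteration order as well as a fixed seed.

```java
import java.util.Map;
import java.util.Random;

// Hypothetical helper: one seeded Random reused for every pick.
final class WeightedPicker {
    private final Random random;

    WeightedPicker(long seed) {
        this.random = new Random(seed);
    }

    // Picks a key with probability proportional to its weight.
    // The map must iterate in a stable order (LinkedHashMap or TreeMap).
    String pick(Map<String, Integer> weights) {
        int total = 0;
        for (int w : weights.values()) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }
        int roll = random.nextInt(total); // 0 .. total-1
        int cumulative = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            cumulative += e.getValue();
            if (roll < cumulative) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("unreachable when weights are non-negative");
    }
}
```

Two pickers built with the same seed and given the same map return the same sequence of options, which is what makes an hourly run comparable with the previous one.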

Fast real valued random generator in java

java.util.Random.nextDouble() is slow for me and I need something really fast.
I searched Google and found only integer-based fast random generators. Is there anything for real numbers from the interval [0, 1)?
If you need something fast and have access to Java 8, I can recommend java.util.SplittableRandom. It is faster (roughly twice as fast) and has better statistical properties.
If you need an even faster or better algorithm, I can recommend one of these specialized XorShift variants:
XorShift128PlusRandom (faster & better)
XorShift1024StarPhiRandom (similar speed, even longer period)
Information on these algorithms and their quality can be found in this big PRNG comparison.
I made an independent performance comparison; you can find the detailed results and the code here: github.com/tobijdc/PRNG-Performance
Furthermore, Apache Commons RNG has a performance test of all their implemented algorithms.
TLDR
Never use java.util.Random, use java.util.SplittableRandom.
If you need a faster or better PRNG, use a XorShift variant.
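For reference, a minimal sketch of drawing doubles in [0, 1) from java.util.SplittableRandom. The class and method names here are made up for illustration; only SplittableRandom(long) and nextDouble() come from the JDK.

```java
import java.util.SplittableRandom;

final class SplittableRandomDemo {
    // Fills an array with doubles in [0, 1) from a seeded generator.
    static double[] sample(long seed, int count) {
        SplittableRandom rng = new SplittableRandom(seed);
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextDouble();
        }
        return out;
    }
}
```

A SplittableRandom instance is not meant to be shared between threads; the intended pattern is to call split() and hand each thread its own generator.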
You could modify an integer based RNG to output doubles in the interval [0,1) in the following way:
double randDouble = randInt()/(RAND_INT_MAX + 1.0)
However, if randInt() generates a 32-bit integer, this won't fill all the bits of the double, because a double has 53 mantissa bits. You could generate two random integers to fill all mantissa bits. Or you could take a look at the source code of the Random.nextDouble() implementation. It almost surely uses an integer RNG and simply converts the output to a double.
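A small sketch of both conversions, assuming the integer generator is supplied by the caller. BitsToDouble and its method names are hypothetical; the constant 0x1.0p-53 is simply 2^-53 written as a hexadecimal floating-point literal.

```java
final class BitsToDouble {
    // Uses the top 53 bits of a 64-bit value, so every result is a
    // multiple of 2^-53 in [0, 1).
    static double fromLong(long bits) {
        return (bits >>> 11) * 0x1.0p-53;
    }

    // Combines two 32-bit values into 53 bits:
    // 26 bits from the first and 27 bits from the second.
    static double fromTwoInts(int hi, int lo) {
        long top = hi >>> 6;    // upper 26 bits of hi
        long bottom = lo >>> 5; // upper 27 bits of lo
        return ((top << 27) + bottom) * 0x1.0p-53;
    }
}
```

Taking the high bits rather than the low ones matters for linear congruential generators, whose low-order bits are the weakest.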
As for performance, the best-performing random number generators are linear congruential generators. Of these, I recommend using the Numerical Recipes generator. You can see more information about LCGs from Wikipedia: http://en.wikipedia.org/wiki/Linear_congruential_generator
However, if you want good randomness and performance is not that important, I think Mersenne Twister is the best choice. It also has a Wikipedia page: http://en.wikipedia.org/wiki/Mersenne_Twister
There is a recent random number generator called PCG, explained in http://www.pcg-random.org/. This is essentially a post-processing step for LCG that improves the randomness of the LCG output. Note that PCG is slower than LCG because it is simply a post-processing step for LCG. Thus, if performance is very important and randomness quality not that important, you want to use LCG instead of PCG.
Note that none of the generators I mentioned are cryptographically secure. If you need to use the values for cryptographic applications, you should be using a cryptographically secure algorithm. However, I don't really believe that doubles would be used for cryptography.
Note that all these solutions miss a fundamental fact (that I wasn't aware of up to a few weeks ago): passing from 64 bits to a double using a multiplication is a major loss of time. The implementation of xorshift128+ and xorshift1024+ in the DSI utilities (http://dsiutils.di.unimi.it/) uses direct bit manipulation and the results are impressive.
See the benchmarks for nextDouble() at
http://dsiutils.di.unimi.it/docs/it/unimi/dsi/util/package-summary.html#package.description
and the quality reported at
http://prng.di.unimi.it/
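One common way to go from 64 random bits to a double without a multiplication looks like the sketch below. This is an illustration of the general bit-manipulation idea, not code copied from the DSI utilities, and the class name is hypothetical. It keeps 52 random bits rather than 53, and whether it is actually faster depends on the JVM and hardware, so it is worth benchmarking before relying on it.

```java
final class BitTrickDouble {
    // Places the top 52 bits into the mantissa of a double whose exponent
    // encodes [1, 2), then subtracts 1 to land in [0, 1).
    static double fromLong(long bits) {
        return Double.longBitsToDouble(0x3FFL << 52 | bits >>> 12) - 1.0;
    }
}
```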
Imho you should just accept juhist's answer - here's why.
nextDouble is slow because it makes two calls to next() - it's written right there in the documentation.
So your best options are:
use a fast 64 bit generator, convert that to double (MT, PCG, xorshift*, ISAAC64, ...)
generate doubles directly
Here's an overly long benchmark with java's Random, an LCG (as bad as java.util.Random), and Marsaglia's universal generator (the version generating doubles).
import java.util.*;
public class d01 {
private static long sec(double x)
{
return (long) (x * (1000L*1000*1000));
}
// ns/op: nanoseconds to generate a double
// loop until it takes a second.
public static double ns_op(Random r)
{
long nanos = -1;
int n;
for(n = 1; n < 0x12345678; n *= 2) {
long t0 = System.nanoTime();
for(int i = 0; i < n; i++)
r.nextDouble();
nanos = System.nanoTime() - t0;
if(nanos >= sec(1))
break;
if(nanos < sec(0.1))
n *= 4;
}
return nanos / (double)n;
}
public static void bench(Random r)
{
System.out.println(ns_op(r) + " " + r.toString());
}
public static void main(String[] args)
{
for(int i = 0; i < 3; i++) {
bench(new Random());
bench(new LCG64(new Random().nextLong()));
bench(new UNI_double(new Random().nextLong()));
}
}
}
// straight from wikipedia
class LCG64 extends java.util.Random {
private long x;
public LCG64(long seed) {
this.x = seed;
}
@Override
public long nextLong() {
x = x * 6364136223846793005L + 1442695040888963407L;
return x;
}
@Override
public double nextDouble(){
return (nextLong() >>> 11) * (1.0/9007199254740992.0);
}
@Override
protected int next(int nbits)
{
throw new RuntimeException("TODO");
}
}
class UNI_double extends java.util.Random {
// Marsaglia's UNIversal random generator extended to double precision
// G. Marsaglia, W.W. Tsang / Statistics & Probability Letters 66 (2004) 183 – 187
private final double[] U = new double[98];
static final double r=9007199254740881.0/9007199254740992.;
static final double d=362436069876.0/9007199254740992.0;
private double c=0.;
private int i=97,j=33;
@Override
public double nextDouble(){
double x;
x=U[i]- U[j];
if(x<0.0)
x=x+1.0;
U[i]=x;
if(--i==0) i=97;
if(--j==0) j=97;
c=c-d;
if(c<0.0)
c=c+r;
x=x-c;
if(x<0.)
return x+1.;
return x;
}
//A two-seed function for filling the static array U[98] one bit at a time
private
void fillU(int seed1, int seed2){
double s,t;
int x,y,i,j;
x=seed1;
y=seed2;
for (i=1; i<98; i++){
s= 0.0;
t=0.5;
for (j=1; j<54; j++){
x=(6969*x) % 65543;
// typo in the paper:
//y=(8888*x) % 65579;
//used for the demo in the last page of the paper.
y=(8888*y) % 65579;
if(((x^y)& 32)>0)
s=s+t;
t=.5*t;
}
if(x == 0)
throw new IllegalArgumentException("x");
if(y == 0)
throw new IllegalArgumentException("y");
U[i]=s;
}
}
// Marsaglia's test code is useless because of a typo in fillU():
// x=(6969*x)%65543;
// y=(8888*x)% 65579;
public UNI_double(long seed)
{
Random r = new Random(seed);
for(;;) {
try {
fillU(r.nextInt(), r.nextInt());
break;
} catch(Exception e) {
// loop again
}
}
}
@Override
protected int next(int nbits)
{
throw new RuntimeException("TODO");
}
}
You could create an array of random doubles when you initialize your program and then just cycle through it. This is much faster, but the random values repeat themselves.
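A minimal sketch of that idea, assuming a power-of-two table so the wrap-around is a single bit mask. PrecomputedDoubles and the sizeLog2 parameter are hypothetical names; the table is filled from a seeded SplittableRandom purely for convenience.

```java
import java.util.SplittableRandom;

final class PrecomputedDoubles {
    private final double[] table;
    private final int mask;
    private int index;

    // sizeLog2 is the base-2 logarithm of the table size, e.g. 16 for 65536 entries.
    PrecomputedDoubles(long seed, int sizeLog2) {
        if (sizeLog2 < 0 || sizeLog2 > 30) {
            throw new IllegalArgumentException("sizeLog2 must be between 0 and 30");
        }
        int size = 1 << sizeLog2;
        table = new double[size];
        mask = size - 1;
        SplittableRandom rng = new SplittableRandom(seed);
        for (int i = 0; i < size; i++) {
            table[i] = rng.nextDouble();
        }
    }

    // Returns the next stored value; the sequence repeats after table.length calls.
    double next() {
        double value = table[index];
        index = (index + 1) & mask;
        return value;
    }
}
```

The period equals the table size, so this only suits uses where a short, repeating sequence is acceptable.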

Efficient non-repeating password generator in Java

I have gone to this site many times and found answers to my questions, but it's finally time for me to post one of my own! The objective of a particular class in my software is to generate random passwords of fixed length, composed of 'low' ASCII characters. The main catch is that I never want to generate the same password twice; uniqueness must be guaranteed. Initially I used a HashMap to store each password I had generated so far and checked it each time I created a new one before returning. However, Java HashMap objects are limited in size, and eventually the Map would become too saturated to maintain acceptable retrieval time. The following is my latest crack at the problem:
package gen;
import java.util.Set;
import java.util.Random;
import java.util.HashSet;
public class Generator {
Random r;
int length;
Set<String> seen;
public Generator(int l){
seen = new HashSet<String>();
length = l;
r = new Random();
r.setSeed(System.currentTimeMillis());
}
public String generate(){
String retval = "";
int i = 0;
while(i<length){
int rand = r.nextInt(93)+33;
if(rand!=96){
retval+= (char)rand;
i++;
}
}
return retval;
}
public String generateNoRepeat(){
String retval;
int i;
do{
retval ="";
i = 0;
while(i<length){
int rand = r.nextInt(93)+33;
if(rand!=96){
retval+= (char)rand;
i++;
}
}
}while(!seen.add(retval));
return retval;
}
}
Edit: Thanks so much for the Set suggestion. It makes my code so much cleaner now too!
I may decide to just use the dumb generator method to fill up a BlockingQueue and just multithread it to death...
Further clarification: This is not meant to generate secure passwords. It must simply guarantee that it will eventually generate all possible passwords and only once for a given length and character set.
Note:
I have taken everyone's insight and have come to the conclusion that sequentially generating the possible passwords and storing them to the disk is probably my best option. Either that or simply allow duplicate passwords and supplement the inefficiency with multiple Generator threads.
Why not just encrypt sequential numbers?
Let n be the first number in your sequence (don't start with zero). Let e be some encryption algorithm (e.g. RSA).
Then your passwords are e(n), e(n+1), e(n+2), ...
But I strongly agree with Greg Hewgill and Ted Hopp: avoiding duplicates is more trouble than it is worth.
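The same idea can be sketched without a cipher: map a counter through any one-to-one function on the password space. The example below swaps in a simple affine permutation, (a * n + b) mod N with a coprime to N, in place of encryption. It is predictable, which matches the stated goal of non-secure passwords. The class name and constructor parameters are hypothetical; the character set mirrors the question's range 33..125 without the backtick.

```java
import java.math.BigInteger;

final class SequentialPasswords {
    private final char[] alphabet;
    private final int length;
    private final BigInteger base;
    private final BigInteger space;      // alphabet size raised to `length`
    private final BigInteger multiplier; // must be coprime with `space`
    private final BigInteger offset;
    private BigInteger counter = BigInteger.ZERO;

    SequentialPasswords(int length, long multiplier, long offset) {
        StringBuilder sb = new StringBuilder();
        for (char c = 33; c <= 125; c++) {
            if (c != 96) {
                sb.append(c);
            }
        }
        this.alphabet = sb.toString().toCharArray(); // 92 characters
        this.length = length;
        this.base = BigInteger.valueOf(alphabet.length);
        this.space = base.pow(length);
        this.multiplier = BigInteger.valueOf(multiplier);
        this.offset = BigInteger.valueOf(offset);
        if (!this.multiplier.gcd(space).equals(BigInteger.ONE)) {
            throw new IllegalArgumentException("multiplier must be coprime with the space size");
        }
    }

    // Returns a password not returned before; fails once every password was produced.
    String next() {
        if (counter.compareTo(space) >= 0) {
            throw new IllegalStateException("all passwords have been generated");
        }
        BigInteger v = counter.multiply(multiplier).add(offset).mod(space);
        counter = counter.add(BigInteger.ONE);
        char[] out = new char[length];
        for (int i = length - 1; i >= 0; i--) { // write v in base 92
            BigInteger[] qr = v.divideAndRemainder(base);
            out[i] = alphabet[qr[1].intValue()];
            v = qr[0];
        }
        return new String(out);
    }
}
```

Only the counter has to be persisted between runs, so no set of previously seen passwords is needed.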

How to create user friendly unique IDs, UUIDs or other unique identifiers in Java

I usually use the UUID class to generate unique IDs. This works fine if these IDs are used by technical systems only, they don't care how long they are:
System.out.println(UUID.randomUUID().toString());
> 67849f28-c0af-46c7-8421-94f0642e5d4d
Is there a nice way to create user friendly unique IDs (like those from tinyurl) which are a bit shorter than the UUIDs? Use case: you want to send out IDs via mail to your customers, who in turn visit your site and enter that number into a form, like a voucher ID.
I assume that UUIDs get generated equally through the whole range of the 128 Bit range of the UUID. So would it be safe to use just the lower 64 Bits for instance?
System.out.println(UUID.randomUUID().getLeastSignificantBits());
Any feedback is welcome.
I assume that UUIDs get generated
equally through the whole range of the
128 Bit range of the UUID.
First off, your assumption may be incorrect, depending on the UUID type (1, 2, 3, or 4). From the Java UUID docs:
There exist different variants of
these global identifiers. The methods
of this class are for manipulating the
Leach-Salz variant, although the
constructors allow the creation of any
variant of UUID (described below).
The layout of a variant 2 (Leach-Salz)
UUID is as follows: The most
significant long consists of the
following unsigned fields:
0xFFFFFFFF00000000 time_low
0x00000000FFFF0000 time_mid
0x000000000000F000 version
0x0000000000000FFF time_hi
The least significant long consists of
the following unsigned fields:
0xC000000000000000 variant
0x3FFF000000000000 clock_seq
0x0000FFFFFFFFFFFF node
The variant field contains a value
which identifies the layout of the
UUID. The bit layout described above
is valid only for a UUID with a
variant value of 2, which indicates
the Leach-Salz variant.
The version field holds a value that
describes the type of this UUID. There
are four different basic types of
UUIDs: time-based, DCE security,
name-based, and randomly generated
UUIDs. These types have a version
value of 1, 2, 3 and 4, respectively.
The best way to do what you're doing is to generate a random string with code that looks something like this (source):
public class RandomString {
public static String randomstring(int lo, int hi){
int n = rand(lo, hi);
byte b[] = new byte[n];
for (int i = 0; i < n; i++)
b[i] = (byte)rand('a', 'z');
return new String(b, 0);
}
private static int rand(int lo, int hi){
java.util.Random rn = new java.util.Random();
int n = hi - lo + 1;
int i = rn.nextInt(n);
if (i < 0)
i = -i;
return lo + i;
}
public static String randomstring(){
return randomstring(5, 25);
}
/**
* @param args
*/
public static void main(String[] args) {
System.out.println(randomstring());
}
}
If you're incredibly worried about collisions or something, I suggest you base64 encode your UUID which should cut down on its size.
Moral of the story: don't rely on individual parts of UUIDs as they are holistically designed. If you do need to rely on individual parts of a UUID, make sure you familiarize yourself with the particular UUID type and implementation.
Here is another approach for generating user friendly IDs:
http://thedailywtf.com/Articles/The-Automated-Curse-Generator.aspx
(But you should go for the bad-word-filter)
Any UUID/Guid is just 16 bytes of data. These 16 bytes can be easily encoded using BASE64 (or BASE64url) and then stripped of the "=" padding characters at the end of the string.
This gives a nice, short string which still holds the same data as the UUID/Guid. In other words, it is possible to recreate the UUID/Guid from that data if such becomes necessary.
Here's a way to generate a URL-friendly 22-character UUID
public static String generateShortUuid() {
UUID uuid = UUID.randomUUID();
long lsb = uuid.getLeastSignificantBits();
long msb = uuid.getMostSignificantBits();
byte[] uuidBytes = ByteBuffer.allocate(16).putLong(msb).putLong(lsb).array();
// Strip down the '==' at the end and make it url friendly
return Base64.encode(uuidBytes)
.substring(0, 22)
.replace("/", "_")
.replace("+", "-");
}
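The snippet above does not say which Base64 class it relies on. As a sketch under the assumption that Java 8 or later is available, java.util.Base64 offers a URL-safe encoder that can omit padding, which removes the need for the manual substring and replace calls. The class name ShortUuid is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

final class ShortUuid {
    // Encodes the 16 UUID bytes as 22 URL-safe characters without '=' padding.
    static String encode(UUID uuid) {
        byte[] bytes = ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // Rebuilds the original UUID from the 22-character form.
    static UUID decode(String text) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getUrlDecoder().decode(text));
        long msb = buf.getLong();
        long lsb = buf.getLong();
        return new UUID(msb, lsb);
    }
}
```

Since the encoding is reversible, the short form can be shown to users while the full UUID is still what gets stored.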
For your use-case, it would be better to track a running count of registered user, and for each value, generate a string-token like this:
public static String longToReverseBase62(long value /* must be positive! */) {
final char[] LETTERS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();
StringBuilder result = new StringBuilder(9);
do {
result.append(LETTERS[(int)(value % 62)]);
value /= 62l;
}
while (value != 0);
return result.toString();
}
For security reasons, it would be better if you make the values non-sequential, so each time a user registers, you can increment the value let's say by 1024 (This would be good to generate uuids for 2^64 / 2^10 = 2^54 users which is quite certainly more than you'd ever need :)
At the time of this writing, this question's title is:
How to create user friendly unique IDs, UUIDs or other unique identifiers in Java
The question of generating a user-friendly ID is a subjective one. If you have a unique value, there are many ways to format it into a "user-friendly" one, and they all come down to mapping unique values one-to-one with "user-friendly" IDs — if the input value was unique, the "user-friendly" ID will likewise be unique.
In addition, it's not possible in general to create a random value that's also unique, at least if each random value is generated independently of any other. In addition, there are many things you should ask yourself if you want to generate unique identifiers (which come from my section on unique random identifiers):
Can the application easily check identifiers for uniqueness within the desired scope and range (e.g., check whether a file or database record with that identifier already exists)?
Can the application tolerate the risk of generating the same identifier for different resources?
Do identifiers have to be hard to guess, be simply "random-looking", or be neither?
Do identifiers have to be typed in or otherwise relayed by end users?
Is the resource an identifier identifies available to anyone who knows that identifier (even without being logged in or authorized in some way)?
Do identifiers have to be memorable?
In your case, you have several conflicting goals: You want identifiers that are unique, random, and easy to type by end users. But other things you should think about are:
Are other users allowed to access the resource identified by the ID, whenever they know the ID? If not, then additional access control or a longer key length will be necessary.
Can your application tolerate the risk of duplicate keys? If so, then the keys can be completely randomly generated (such as by a cryptographic RNG such as java.security.SecureRandom in Java). If not, then your goal will be harder to achieve, especially for keys intended for security purposes.
Also, if you want IDs that have to be typed in by end users, you should consider choosing a character set carefully or allowing typing mistakes to be detected.
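As an illustration of the last point, here is a sketch of an ID format meant for typing: a 32-character alphabet without look-alike letters, plus one trailing check character. Everything in it is an assumption made for the example (the class name TypeableId, the alphabet, the checksum); it is not taken from any standard. The checksum is a weighted sum modulo 32 with odd weights, which is enough to catch any single mistyped character but not every transposition. Uniqueness still has to be checked against the IDs already issued.

```java
import java.security.SecureRandom;

final class TypeableId {
    // Crockford-style alphabet: no I, L, O or U, to avoid look-alike characters.
    private static final String ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Weighted sum modulo 32; odd weights make any single wrong character change the sum.
    private static char checkChar(CharSequence payload) {
        int sum = 0;
        for (int i = 0; i < payload.length(); i++) {
            int value = ALPHABET.indexOf(payload.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("invalid character");
            }
            sum += value * (2 * i + 1);
        }
        return ALPHABET.charAt(sum & 31);
    }

    // Creates payloadLength random characters followed by one check character.
    static String create(int payloadLength) {
        StringBuilder sb = new StringBuilder(payloadLength + 1);
        for (int i = 0; i < payloadLength; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        sb.append(checkChar(sb));
        return sb.toString();
    }

    // Returns true when the trailing check character matches the payload.
    static boolean isValid(String id) {
        if (id == null || id.length() < 2) {
            return false;
        }
        String payload = id.substring(0, id.length() - 1);
        try {
            return checkChar(payload) == id.charAt(id.length() - 1);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

A form handler could call isValid before hitting the database, so most typos are rejected without a lookup.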
Only for you :) :
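private final static char[] idchars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".toCharArray();
// One shared generator: seeding a new Random with currentTimeMillis() on every
// call would return the same ID for calls made within the same millisecond.
private final static java.security.SecureRandom idRandom = new java.security.SecureRandom();
private static String createId(int len) {
char[] id = new char[len];
for (int i = 0; i < len; i++) {
id[i] = idchars[idRandom.nextInt(idchars.length)];
}
return new String(id);
}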
How about this one? This code returns at most 13 characters (digits and lowercase letters).
import java.nio.ByteBuffer;
import java.util.UUID;
/**
* Generate short UUID (13 characters)
*
* @return short UUID
*/
public static String shortUUID() {
UUID uuid = UUID.randomUUID();
long l = ByteBuffer.wrap(uuid.toString().getBytes()).getLong();
return Long.toString(l, Character.MAX_RADIX);
}

.NET 3.5: how to calculate CRC32 for data using the .NET 3.5 API?

How do I calculate CRC32 for data using the .NET 3.5 API? I have done the same in Java with the following code.
public static String getMAC (byte [] value) {
java.util.zip.CRC32 crc32 = new java.util.zip.CRC32 ();
crc32.update(value);
long newCRC = crc32.getValue();
String crcString = Long.toHexString(newCRC);
try {
crcString = ISOUtil.padleft(Long.toHexString(newCRC), 8, '0');
}
catch (Exception e){
e.printStackTrace();
}
if (ISOConstantsLibrary.DEBUG) System.out.println("for hex string: " + ISOUtil.hexString(value) + "\nmac" + crcString);
return crcString;
}
The .NET FCL doesn't provide a class like java.util.zip.CRC32, so you have to roll your own. It's not tough, and there are some decent examples out there:
Example 1
Example 2
There is no standard CRC32 implementation in the .NET library. You can use other implementations like this. Also, if there is no specific requirement to use CRC32 (e.g. you just need a checksum for checking integrity), I recommend using Crc32C. That algorithm is hardware-accelerated on modern CPUs; as a result it is usually about 10x faster than a native implementation or 20x-60x faster than any managed implementation. Note that Crc32C produces different values from CRC32, so its output cannot be compared with that of java.util.zip.CRC32.
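For anyone rolling their own, the algorithm behind java.util.zip.CRC32 is the usual table-driven CRC-32 with the reflected polynomial 0xEDB88320. The sketch below is written in Java so it can be checked against the built-in class; Crc32Sketch is a hypothetical name. A C# port would follow the same steps, most naturally with uint for the table and the running value.

```java
final class Crc32Sketch {
    private static final int[] TABLE = new int[256];

    static {
        // Precompute the CRC of every possible byte value.
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    // Returns the CRC-32 of the data as an unsigned 32-bit value in a long.
    static long crc32(byte[] data) {
        int crc = 0xFFFFFFFF; // initial value: all bits set
        for (byte b : data) {
            crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
        }
        return (crc ^ 0xFFFFFFFF) & 0xFFFFFFFFL; // final inversion
    }
}
```

Formatting the result as eight zero-padded hex digits then gives the same string as the Java code in the question.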
