Port of Random generator from C to Java? - java

George Marsaglia has written an excellent random number generator that is extremely fast, simple, and has a much longer period than the Mersenne Twister. Here is the code with a description:
good C random number generator
I wanted to port the CMWC4096 code to Java, but it uses several unsigned datatypes so I am not sure how to do this properly. Here is the full C code:
/* choose random initial c<809430660 and */
/* 4096 random 32-bit integers for Q[] */
static unsigned long Q[4096],c=362436;
unsigned long CMWC4096(void) {
unsigned long long t, a=18782LL;
static unsigned long i=4095;
unsigned long x,r=0xfffffffe;
i = (i+1) & 4095;
t = a*Q[i] + c;
c = (t>>32);
x = t + c;
if (x < c) {
x++;
c++;
}
return (Q[i] = r - x);
}
Can anyone port this to Java? How does this work when you only have signed numbers available?
EDIT: Thanks everybody for the quick answers! For the first 100 million numbers this java code seems to produce the same result as the C code. It is 3 times faster than Java's java.util.Random.
import java.util.Random;
public class ComplimentaryMultiplyWithCarryRandom {
/**
* Choose 4096 random 32-bit integers
*/
private long[] Q;
/**
* choose random initial c<809430660
*/
private long c = 362436;
private int i;
public ComplimentaryMultiplyWithCarryRandom() {
Random r = new Random(1);
Q = new long[4096];
// TODO initialize with real random 32bit values
for (int i = 0; i < 4096; ++i) {
long v = r.nextInt();
v -= Integer.MIN_VALUE;
Q[i] = v;
}
i = 4095;
}
int next() {
i = (i + 1) & 4095;
long t = 18782 * Q[i] + c;
c = t >>> 32;
long x = (t + c) & 0xffffffffL;
if (x < c) {
++x;
++c;
}
long v = (0xfffffffeL - x) & 0xffffffffL; // mask: x can be 0xffffffff, and the C code wraps to 32 bits here
Q[i] = v;
return (int) v;
}
}

Most of the time there is no need to use larger numeric types for simulating unsigned types in Java.
For addition, subtraction, multiplication, shift left, the logical operations, equality
and casting to a smaller numeric type
it doesn't matter whether the operands are signed or unsigned,
the result will be the same regardless, viewed as a bit pattern.
For shifting to the right use >> for signed, >>> for unsigned.
For signed casting to a larger type just do it.
For unsigned casting from a smaller type to a long use & with a mask of type long for the smaller type.
E.g., short to long: s & 0xffffL.
For unsigned casting from a smaller type to an int use & with a mask of type int.
E.g., byte to int: b & 0xff.
Otherwise do like in the int case and apply a cast on top.
E.g., byte to short: (short) (b & 0xff).
For the comparison operators < etc. and division the easiest is to cast to a larger type and do the operation there.
But there also exist other options, e.g. do comparisons after adding an appropriate offset.
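The rules above can be sketched in a few helper methods. This is a minimal illustration, not a library; the class and method names are made up for the example. Since Java 8 the standard library also offers Integer.toUnsignedLong, Integer.compareUnsigned and Integer.divideUnsigned, which do the same jobs.

```java
final class UnsignedTricks {
    // Unsigned widening: mask with a long literal so the high 32 bits are zero.
    static long toUnsignedLong(int x) {
        return x & 0xffffffffL;
    }

    // Unsigned comparison by casting both operands to a larger type.
    static boolean lessThanUnsigned(int a, int b) {
        return (a & 0xffffffffL) < (b & 0xffffffffL);
    }

    // Unsigned comparison by adding an offset: adding Integer.MIN_VALUE flips
    // the sign bit, which maps unsigned order onto signed order.
    static boolean lessThanUnsignedOffset(int a, int b) {
        return (a + Integer.MIN_VALUE) < (b + Integer.MIN_VALUE);
    }

    // Unsigned division done in the larger type, then narrowed again.
    static int divideUnsigned(int a, int b) {
        return (int) ((a & 0xffffffffL) / (b & 0xffffffffL));
    }
}
```

For example, the bit pattern of -1 is 0xffffffff, so as an unsigned value it is larger than 1, and dividing the pattern of -2 (0xfffffffe) by 2 gives 0x7fffffff.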

Can anyone port this to Java? How does
this work when you only have signed
numbers available?
No Stress! a=18782 so the largest t could ever be is not large enough to cause signed vs. unsigned problems. You would have to "upgrade" the result of using Q to a value equal to a 32-bit unsigned number before using it anywhere. e.g. if Q is an int (32-bit signed) then you'd have to do this before using it in the t=a*Q[i]+c statement, e.g.
t=a*(((long)Q[i])&0xffffffffL)+c
where this (((long)Q[i])&0xffffffffL) business promotes Q[i] to a 64-bit number and ensures its high 32 bits are 0's. (edit: NOTE: you need 0xffffffffL here. The literal 0xffffffff is an int with the value -1; when it is widened to long for the & it becomes 0xffffffffffffffffL, so the mask does nothing and you get a negative number if Q[i]'s high bit is 1.)
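A small sketch of the difference between the two mask literals; the class and method names are hypothetical and only serve the illustration.

```java
final class MaskLiteralDemo {
    // 0xffffffff is an int literal (-1). It is sign-extended to -1L before
    // the AND, so every bit of q survives, including the sign extension.
    static long withIntMask(int q) {
        return ((long) q) & 0xffffffff;
    }

    // 0xffffffffL is a long literal with only the low 32 bits set, so the
    // AND clears the sign extension and leaves the unsigned 32-bit value.
    static long withLongMask(int q) {
        return ((long) q) & 0xffffffffL;
    }
}
```

With q = -2 (bit pattern 0xfffffffe), withIntMask returns -2 while withLongMask returns 4294967294.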
You should be able to verify this by running the algorithms in C++ and Java to compare the outputs.
edit: here's a shot at it. I tried running it in C++ and Java for N=100000; they both match. Apologies if I used bad Java idioms, I'm still fairly new to Java.
C++:
// marsaglia2003.cpp
#include <stdio.h>
#include <stdlib.h> // for atoi
class m2003
{
enum {c0=362436, sz=4096, mask=4095};
unsigned long Q[sz];
unsigned long c;
short i;
public:
m2003()
{
// a real program would seed this with a good random seed
// i'm just putting in something that makes the output interesting
for (int j = 0; j < sz; ++j)
Q[j] = j + (j << 16);
i = 4095;
c = c0;
}
unsigned long next()
{
unsigned long long t, a=18782LL;
unsigned long x;
unsigned long r=0xfffffffe;
i = (i+1)&mask;
t=a*Q[i]+c;
c=(unsigned long)(t>>32);
x=(unsigned long)t + c;
if (x<c)
{
x++;
c++;
}
return (Q[i]=r-x);
}
};
int main(int argc, char *argv[])
{
m2003 generator;
int n = 100;
if (argc > 1)
n = atoi(argv[1]);
for (int i = 0; i < n; ++i)
{
printf("%08x\n", generator.next());
}
return 0;
}
java: (slower than compiled C++ but it matches for N=100000)
// Marsaglia2003.java
import java.util.*;
class Marsaglia2003
{
final static private int sz=4096;
final static private int mask=4095;
final private int[] Q = new int[sz];
private int c=362436;
private int i=sz-1;
public Marsaglia2003()
{
// a real program would seed this with a good random seed
// i'm just putting in something that makes the output interesting
for (int j = 0; j < sz; ++j)
Q[j] = j + (j << 16);
}
public int next()
// note: returns a SIGNED 32-bit number.
// if you want to use as unsigned, cast to a (long),
// then AND it with 0xffffffffL
{
long t, a=18782;
int x;
int r=0xfffffffe;
i = (i+1)&mask;
long Qi = ((long)Q[i]) & 0xffffffffL; // treat as unsigned 32-bit
t=a*Qi+c;
c=(int)(t>>32);
// because "a" is relatively small this result is also small
x=((int)t) + c;
if (x<c && x>=0) // tweak to treat x as unsigned
{
x++;
c++;
}
return (Q[i]=r-x);
}
public static void main(String args[])
{
Marsaglia2003 m2003 = new Marsaglia2003();
int n = 100;
if (args.length > 0)
n = Integer.parseInt(args[0]);
for (int i = 0; i < n; ++i)
{
System.out.printf("%08x\n", m2003.next());
}
}
};

If you are implementing an RNG in Java, it is best to subclass the java.util.Random class and override the protected next(int) method (your RNG is then a drop-in replacement for java.util.Random). The next(int) method is concerned with randomly generated bits, not what values those bits might represent. The other (public) methods of java.util.Random use these bits to construct random values of different types.
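A minimal sketch of that approach, assuming the CMWC4096 recurrence from the question. The class name Cmwc4096Random and the seeding scheme (filling Q from a java.util.Random with the same seed) are assumptions made for the example, not something from Marsaglia's description. The sketch is not thread-safe, and calling setSeed after construction does not reseed the CMWC state.

```java
import java.util.Random;

class Cmwc4096Random extends Random {
    private static final long A = 18782L;
    private final int[] q = new int[4096]; // each int holds an unsigned 32-bit pattern
    private int c = 362436;                // carry, stays small and non-negative
    private int index = 4095;

    Cmwc4096Random(long seed) {
        super(seed);
        // Assumption: fill Q from a separate java.util.Random instance.
        Random seeder = new Random(seed);
        for (int j = 0; j < q.length; j++) {
            q[j] = seeder.nextInt();
        }
    }

    @Override
    protected int next(int bits) {
        index = (index + 1) & 4095;
        // Read Q[index] as unsigned; t stays below 2^47, so a signed long is enough.
        long t = A * (q[index] & 0xffffffffL) + c;
        c = (int) (t >>> 32);
        int x = (int) t + c;                   // wraps modulo 2^32, like the C code
        if (Integer.compareUnsigned(x, c) < 0) {
            x++;
            c++;
        }
        int result = 0xfffffffe - x;
        q[index] = result;
        return result >>> (32 - bits);         // keep the top `bits` bits
    }
}
```

With this in place, inherited methods such as nextInt(int), nextLong() and nextDouble() draw their bits from the CMWC recurrence.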

To get around Java's lack of unsigned types you usually store numbers in a bigger variable type (so shorts get upgraded to ints, ints to long). Since you're using long variables here, you're going to have to step up to BigInteger, which will probably wreck any speed gains that you're getting out of the algorithm.

You can use signed numbers provided the values don't overflow...for example long in java is a 64 bit signed integer. However the intent in this algorithm seems to be to use a 64 bit unsigned value, and if so I think you would be out of luck with the basic types.
You could use the multiprecision integers provided in the java class libraries (BigInteger). Or you could implement your own 64 bit unsigned type as an Object containing two java longs to represent the least significant and most significant words (but you'd have to implement the basic arithmetic operations yourself in the class).

Note: In your C code, I inferred that long is 32 bits wide, and long long is 64 bits wide.
Here is my way of porting that code to Java with the minimum number of changes:
/* choose random initial 0<=c<809430660 and */
/* 4096 random 32-bit integers for Q[] */
int[] Q = new int[4096];
int c = 362436;
int i = 4095;
int CMWC4096() {
long a = 18782;
int r = 0xfffffffe;
i = (i + 1) & 4095;
long t = a * (Q[i] & 0xffffffffL) + c; // mask so Q[i] is read as an unsigned 32-bit value
c = (int)(t >>> 32);
int x = (int)(t + c);
if (0 <= x && x < c) {
x++;
c++;
}
return (Q[i] = r - x);
}

Related

Decoding an Int32 in Java for the Mantissa (Significant Bits) using bitwise logic

I know this sounds like a question that has been asked before, but in my case I am being asked to do some bitwise logic on an int32 value. I am not converting the value to a floating-point number; I am interpreting its binary representation as a float and extracting the significant bits (also known as the mantissa). I am using the IEEE 754 standard for the interpretation.
The instructions are:
"Use a bitmask to isolate the significant digits, then use a loop to iterate over each digit, shifting as necessary to get them in the LSB position, and multiplying by the appropriate power-of-2. The result will be a float without an exponent.
Don't forget to include the "hidden" bit!
Consider a loop that counts down -- craft it carefully and it will let you start at the LSB and work your way to the MSB."
From my understanding of this, I crafted this abomination
public static float decodeSignificantDigits(int value) {
int mask = 0x007FFFFF;
int significantBits = value & mask;
float significantDigits = 0;
for (int i = -23; i <= 0; i++) {
if (i != 0) {
significantDigits += (significantBits >> 1) * Math.pow(2, i);
} else {
significantDigits += 1 * Math.pow(2, i);
}
}
System.out.println(significantDigits);
return significantDigits;
}
I think I am on the right track here but I just can not visualize how to get this to work correctly.
The main idea is to get the following tests to pass:
@Test
void testWhenAllOnes() {
int bits = 0b00000000011111111111111111111111;
int bitsWithExpoentZero = 0b00111111111111111111111111111111;
assertEquals(Float.intBitsToFloat(bitsWithExpoentZero), FloatDecoder.decodeSignificantDigits(bits), TOL);
}
@Test
void testWhenMsbOfSignificantDigitsIsOneRestZeroes() {
int bits = 0b00000000010000000000000000000000;
int bitsWithExpoentZero = 0b00111111110000000000000000000000;
assertEquals(Float.intBitsToFloat(bitsWithExpoentZero), FloatDecoder.decodeSignificantDigits(bits), TOL);
}
@Test
void testWhenLsbOfSignificantDigitsIsOneRestZeroes() {
int bits = 0b00000000000000000000000000000001;
int bitsWithExpoentZero = 0b00111111100000000000000000000001;
assertEquals(Float.intBitsToFloat(bitsWithExpoentZero), FloatDecoder.decodeSignificantDigits(bits), TOL);
}
I was also informed not to use Integer.toBinaryString() for my solution.
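One possible reading of the instructions, as a hedged sketch: mask off the low 23 bits, walk them from the LSB to the MSB with a loop that counts the exponent down from 23 to 1, and add the hidden bit at the end. The class name FloatDecoderSketch is made up here, and the sketch assumes a normalized value (hidden bit equal to 1), which is what the tests above expect.

```java
final class FloatDecoderSketch {
    static float decodeSignificantDigits(int value) {
        int significantBits = value & 0x007FFFFF; // low 23 bits hold the fraction
        float fraction = 0.0f;
        // i counts down from 23 to 1: bit (23 - i) has the weight 2^-i, so the
        // loop starts at the LSB (weight 2^-23) and ends at the MSB (weight 2^-1).
        for (int i = 23; i >= 1; i--) {
            int bit = (significantBits >>> (23 - i)) & 1;
            fraction += bit * (float) Math.pow(2, -i);
        }
        return 1.0f + fraction; // the hidden bit contributes 2^0
    }
}
```

Every partial sum is a multiple of 2^-23 below 2, so it fits in a float's 24-bit significand without rounding.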

Why does the factorial past 20 not work in java when using a long?

I am creating a program that will find factorials using the long primitive type, but the factorial of 21 does not work. The answer it gives me is -4249290049419214848 when the answer should be 5109094217000000000, and the max value for long is 9223372036854775807. I do not know why it will not give me 5109094217000000000 when that number is smaller than 9223372036854775807.
Here is my code
long j = 1;
for(int i = 1; i <= 21; i++){
j *= i;
}
System.out.println(j);
21! is not 5,109,094,217,000,000,000.
It is 51,090,942,171,709,440,000.
That is bigger than Long.MAX_VALUE, 9,223,372,036,854,775,807. Hence it overflows.
This may help:
long: The long data type is a 64-bit two's complement integer. The signed long has a minimum value of -2^63 and a maximum value of 2^63-1. In Java SE 8 and later, you can use the long data type to represent an unsigned 64-bit long, which has a minimum value of 0 and a maximum value of 2^64-1. Use this data type when you need a range of values wider than those provided by int. The Long class also contains methods like compareUnsigned, divideUnsigned etc to support arithmetic operations for unsigned long.
From: https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
An idea from: http://www.javawithus.com/programs/factorial-using-big-integer
Factorials of 21 or more overflow a long, so it's necessary to use something else. BigInteger is ideal.
This is more or less the implementation:
import java.math.BigInteger;
import java.util.Scanner;
public class Factorial2 {
public static void main(String[] args) {
Scanner s = new Scanner(System.in);
System.out.print("Enter a number: ");
int n = s.nextInt();
String fact = factorial(n);
System.out.println("Factorial is " + fact);
}
public static String factorial(int n) {
BigInteger fact = BigInteger.ONE;
for (int i = 1; i <= n; i++) {
fact = fact.multiply(BigInteger.valueOf(i));
}
return fact.toString();
}
}
Long variable overflows.. 21! is 51090942171709440000 that is bigger than Long.MAX_VALUE
long product = 1;
for(int i = 1; i <= 21; i++){
product = product * i;
System.out.println(product);
if(product < 0 ) {
System.out.println("Overflow");
}
}
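Checking for a negative product catches the overflow at 21!, but a wrapped product is not always negative. A sketch of a stricter check, assuming Java 8 or later, uses Math.multiplyExact, which throws an ArithmeticException when the result does not fit in a long. The class name CheckedFactorial is made up for the example.

```java
final class CheckedFactorial {
    // Returns n! or throws ArithmeticException once the product overflows a long.
    static long factorial(int n) {
        long product = 1;
        for (int i = 2; i <= n; i++) {
            product = Math.multiplyExact(product, (long) i);
        }
        return product;
    }
}
```

factorial(20) returns 2432902008176640000, and factorial(21) throws instead of silently wrapping.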

Java results differ for (int)Math.pow(2,x) and 1<<x

Why do the following two operations yield different results in Java for x = 31 or 32 but the same results for x=3?
int x=3;
int b = (int) Math.pow(2,x);
int c = 1<<x;
Results:
x=32: b=2147483647; c=1;
x=31: b=2147483647; c=-2147483648;
x=3: b=8 ; c=8
There are multiple issues at play:
An int can only store values between -2147483648 and 2147483647.
1 << x only uses the lowest five bits of x. Thus, 1 << 32 is by definition the same as 1 << 0.
Shift operations are performed on the two's-complement integer representation of the value of the left operand; this explains why 1 << 31 is negative.
Math.pow(2, 32) returns a double.
(int)(d), where d is a double greater than 2147483647 returns 2147483647 ("the largest representable value of type int").
What this interview question does is show that (int)Math.pow(2, x) and 1 << x are not equivalent for values of x outside the 0...30 range.
P.S. It is perhaps interesting to note that using long in place of int (and 1L in place of 1) would give yet another set of results different from the other two. This holds even if the final results are converted to int.
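The three variants discussed above can be compared side by side. This is a small sketch; the class and method names are made up for the illustration.

```java
final class ShiftVersusPow {
    // Narrowing a double that is too large for int yields Integer.MAX_VALUE.
    static int viaPow(int x) {
        return (int) Math.pow(2, x);
    }

    // An int shift uses only the low five bits of x.
    static int viaIntShift(int x) {
        return 1 << x;
    }

    // A long shift uses the low six bits of x; the cast keeps the low 32 bits.
    static int viaLongShift(int x) {
        return (int) (1L << x);
    }

    public static void main(String[] args) {
        for (int x : new int[] {3, 31, 32}) {
            System.out.println("x=" + x + ": pow=" + viaPow(x)
                    + " intShift=" + viaIntShift(x)
                    + " longShift=" + viaLongShift(x));
        }
    }
}
```

For x = 32 the three methods give 2147483647, 1 and 0 respectively, which is the "yet another set of results" mentioned in the P.S.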
According to the documentation, Math.pow promotes both of its arguments to double and returns a double. Casting that double to int does not keep a subset of its bits: the narrowing conversion rounds toward zero and, when the value is too large for an int, yields Integer.MAX_VALUE. That is why (int) Math.pow(2,x) gives 2147483647 for both x = 31 and x = 32. The bit shift works on ints throughout, so the result overflows into the sign bit (x = 31) or the shift distance wraps around (x = 32).
Consider the limits of the type int. How large a number can it hold?
Here's a micro-benchmark for the case of a long. On my laptop (2.8GHz), using shift instead of Math.pow is over 7x faster.
int limit = 50_000_000;
@Test
public void testPower() {
Random r = new Random(7);
long t = System.currentTimeMillis();
for (int i = 0; i < limit; i++) {
int p = r.nextInt(63);
long l = (long)Math.pow(2,p);
}
long t1 = System.currentTimeMillis();
System.out.println((t1-t)/1000.0); // 3.758 s
}
@Test
public void testShift() {
Random r = new Random(7);
long t = System.currentTimeMillis();
for (int i = 0; i < limit; i++) {
int p = r.nextInt(63);
long l = 1L << p;
}
long t1 = System.currentTimeMillis();
System.out.println((t1-t)/1000.0); // 0.523 s
}
An int is 32 bits wide and always signed in Java, so the highest bit is the sign bit. When you shift 1 left by 31 bits, only the sign bit is set, which in two's complement is -(2^31). When you shift left by 32 bits, Java uses only the low five bits of the shift distance, so the shift is by 0 and the result is 1. If you were to do this shifting with longs instead of ints, you would get the answers you expect (that is, until you shift by 63 or more bits).

how to implement hash function `h(k) = (A·k mod 2^w) >> (w – r)` in Java

IMPORTANT NOTICE:
This is not a discussion thread for people to give me their opinion about hashing. I just need to know how to make the given function work in java -- an example would be best.
PROBLEM:
Trying to hone my understanding of hash functions for a pending interview, I watched two free lectures by MIT computer science professors (http://videolectures.net/mit6046jf05_leiserson_lec08/). After the lectures, I am trying to implement the following hash function in Java.
h(k) = (A·k mod 2^w) >> (w – r)
WHERE
r: m, the size of the array, is a power of 2 such that m=2^r
w: the computer has w-bit words, such as 32-bit or 64-bit computer
k: the value I am to find a key for
A: a random odd number (prime would be great) between 2^(w-1) and 2^w
I thought this would be easy to implement in java. But when I do 2^w where w=32, I get inaccurate results in Java. In real life 2^32 = 4294967296 but not in java, which truncates the result to 2^31 - 1 or 2147483647.
Does anyone know how to fix this problem so to implement the function in Java?
EDIT:
I see a lot of the replies focus on 32. What if my computer is 64 bit? I am stuck with setting w = 32 because I am using Java?
Some of the terms are redundant because Java assumes this behaviour anyway.
A·k mod 2^w
In Java, integer multiplication overflows and thus does a mod 2^w (with a sign). The fact that it has a sign doesn't matter if you are then shifting by at least one bit.
Shift of (w - r) is the same as a shift of -r in Java (the w is implied by the type)
private static final int K_PRIME = (int) 2999999929L;
public static int hash(int a, int r) {
// return (a * K_PRIME % (2^32)) >>> (32 - r);
return (a * K_PRIME) >>> -r;
}
for 64-bit
private static final long K_PRIME = new BigInteger("9876534021204356789").longValue();
public static long hash(long a, int r) {
// return (a * K_PRIME % (2^64)) >>> (64 - r);
return (a * K_PRIME) >>> -r;
}
I have written this example to show you can do the same thing in BigInteger and why you wouldn't. ;)
public static final BigInteger BI_K_PRIME = new BigInteger("9876534021204356789");
private static long K_PRIME = BI_K_PRIME.longValue();
public static long hash(long a, int r) {
// return (a * K_PRIME % (2^64)) >>> (64 - r);
return (a * K_PRIME) >>> -r;
}
public static long biHash(long a, int r) {
return BigInteger.valueOf(a).multiply(BI_K_PRIME).mod(BigInteger.valueOf(2).pow(64)).shiftRight(64 - r).longValue();
}
public static void main(String... args) {
Random rand = new Random();
for (int i = 0; i < 10000; i++) {
long a = rand.nextLong();
for (int r = 1; r < 64; r++) {
long h1 = hash(a, r);
long h2 = biHash(a, r);
if (h1 != h2)
throw new AssertionError("Expected " + h2 + " but got " + h1);
}
}
int runs = 1000000;
long start1 = System.nanoTime();
for (int i = 0; i < runs; i++)
hash(i, i & 63);
long time1 = System.nanoTime() - start1;
long start2 = System.nanoTime();
for (int i = 0; i < runs; i++)
biHash(i, i & 63);
long time2 = System.nanoTime() - start2;
System.out.printf("hash with long took an average of %,d ns, " +
"hash with BigInteger took an average of %,d ns%n",
time1 / runs, time2 / runs);
}
prints
hash with long took an average of 3 ns, \
hash with BigInteger took an average of 905 ns
Neither int nor long would be sufficiently large enough to hold all of the values you'd need in 2^(w-1). You would be best served with BigInteger.
Let's look what number % 2^32 actually does: It gets the remainder of the division by 2^32. If you have a range from 0 to 2^32, the computer will automatically do the modulo for you, because it throws away everything above 2^32.
Let's take 8 instead of 32, and switch to binary number system:
1000 1000 % 1 0000 0000 = 1000 1000
1 1000 1000 % 1 0000 0000 = 1000 1000
So what you should do is to limit the number to the range of the computer. If you would use e.g. c++, it would be as simple as declaring the value as unsigned int. The first 1 of the second example above would simply be truncated because it does not fit into the variable.
In java, you don't have unsigned integers. If you calculate A * k, and that results in an overflow, you may get a signed value. But as the only thing you have to do next is to do a right shift, this should not matter.
So my suggestion is to simply drop the modulo calculation. Try it, I'm not quite sure whether it works.
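A sketch that checks this suggestion for w = 32, by comparing the version without the modulo against an explicit computation in long arithmetic. The multiplier A below is just one arbitrary odd value between 2^31 and 2^32 picked for the example, and the class and method names are hypothetical. Both methods assume 1 <= r <= 32.

```java
final class MultiplicativeHashSketch {
    // Assumption: an arbitrary odd multiplier between 2^31 and 2^32 (0x9E3779B9).
    static final int A = (int) 2654435769L;

    // int multiplication already wraps modulo 2^32, so no explicit mod is needed.
    // >>> shifts in zeros, so the sign of the wrapped product does not matter.
    static int hash(int k, int r) {
        return (A * k) >>> (32 - r);
    }

    // The same formula spelled out: treat A and k as unsigned 32-bit values,
    // reduce the product modulo 2^32 with a mask, then shift right by w - r.
    static int hashExplicit(int k, int r) {
        long a = A & 0xffffffffL;
        long key = k & 0xffffffffL;
        long product = (a * key) & 0xffffffffL; // low 32 bits are exact even if the long wraps
        return (int) (product >>> (32 - r));
    }
}
```

r = 0 is excluded because an int shift by 32 is treated as a shift by 0 in Java, which would return the whole product instead of 0.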
The Java primitive int has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647.
Check out this link for details on the primitives.
I recommend using a long instead of an int.

best way to reverse bytes in an int in java

What's the best way to reverse the order of the 4 bytes in an int in java??
You can use Integer.reverseBytes:
int numBytesReversed = Integer.reverseBytes(num);
There's also Integer.reverse that reverses every bit of an int
int numBitsReversed = Integer.reverse(num);
java.lang.Integer API links
public static int reverseBytes(int i)
Returns the value obtained by reversing the order of the bytes in the two's complement representation of the specified int value.
public static int reverse(int i)
Returns the value obtained by reversing the order of the bits in the two's complement binary representation of the specified int value.
Solution for other primitive types
There are also some Long, Character, and Short version of the above methods, but some are notably missing, e.g. Byte.reverse. You can still do things like these:
byte bitsRev = (byte) (Integer.reverse(aByte) >>> (Integer.SIZE - Byte.SIZE));
The above reverses the bits of byte aByte by promoting it to an int and reversing that, and then shifting to the right by the appropriate distance, and finally casting it back to byte.
If you want to manipulate the bits of a float or a double, there are Float.floatToIntBits and Double.doubleToLongBits that you can use.
See also
Wikipedia/Bitwise operation
Bit twiddling hacks
I agree that polygenelubricants's answer is the best one. But just before I hit that, I had the following:
int reverse(int a){
int r = 0x0FF & a;
r <<= 8; a >>= 8;
r |= 0x0FF & a;
r <<= 8; a >>= 8;
r |= 0x0FF & a;
r <<= 8; a >>= 8;
r |= 0x0FF & a;
return r;
}
shifting the input right, the output left by 8 bits each time and OR'ing the least significant byte to the result.
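For comparison, a compact sketch of both ideas from this thread: a one-expression byte swap that should agree with Integer.reverseBytes, and the bit reversal of a byte via Integer.reverse. The class and method names are made up for the example.

```java
final class ByteOrderDemo {
    // Move each of the four bytes to its mirrored position.
    static int reverseManually(int a) {
        return (a >>> 24)
                | ((a >>> 8) & 0x0000FF00)
                | ((a << 8) & 0x00FF0000)
                | (a << 24);
    }

    // Reverse the 8 bits of a byte: reverse all 32 bits of the promoted int,
    // then shift the interesting byte back down to the low position.
    static byte reverseBits(byte b) {
        return (byte) (Integer.reverse(b) >>> (Integer.SIZE - Byte.SIZE));
    }
}
```

For instance, reverseManually(0x12345678) and Integer.reverseBytes(0x12345678) both give 0x78563412.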
