Efficient BigInteger multiplication modulo n in Java

I can calculate the multiplication of two BigIntegers (say a and b) modulo n.
This can be done by:
a.multiply(b).mod(n);
However, assuming that a and b are of the same order as n, this implies that during the calculation a new BigInteger is created whose length (in bytes) is roughly twice that of n.
I wonder whether there is a more efficient implementation that I could use. Something like modMultiply, implemented the way modPow is (which, I believe, does not compute the power and only then take the modulo).

I can only think of
a.mod(n).multiply(b.mod(n)).mod(n)
and you already seem to be aware of this.
BigInteger has a toByteArray(), but internally ints are used, so n must be quite large for any of this to have an effect. Perhaps cryptographic key-generation code does this kind of work.
Furthermore, if you try to short-cut the multiplication, you end up with something like the following:
public static BigInteger multiply(BigInteger a, BigInteger b, int mod) {
    if (a.signum() == -1) {
        return multiply(a.negate(), b, mod).negate();
    }
    if (b.signum() == -1) {
        return multiply(a, b.negate(), mod).negate();
    }
    // Byte length of mod:
    int n = (Integer.SIZE - Integer.numberOfLeadingZeros(mod - 1) + 7) / 8;
    byte[] aa = a.toByteArray(); // Highest byte at [0]!
    int na = Math.min(n, aa.length); // Heuristic.
    byte[] bb = b.toByteArray();
    int nb = Math.min(n, bb.length); // Heuristic.
    byte[] prod = new byte[n];
    for (int ia = 0; ia < na; ++ia) {
        int m = ia + nb > n ? n - ia : nb; // Heuristic: skip partial products beyond n bytes.
        for (int ib = 0; ib < m; ++ib) {
            int p = (0xFF & aa[aa.length - 1 - ia]) * (0xFF & bb[bb.length - 1 - ib]);
            addByte(prod, ia + ib, p & 0xFF);
            if (ia + ib + 1 < n) {
                addByte(prod, ia + ib + 1, (p >> 8) & 0xFF);
            }
        }
    }
    // Still need to do an expensive mod; the leading 1 forces a non-negative sign:
    return new BigInteger(1, prod).mod(BigInteger.valueOf(mod));
}

private static void addByte(byte[] prod, int i, int value) {
    while (value != 0 && i < prod.length) {
        value += prod[prod.length - 1 - i] & 0xFF;
        prod[prod.length - 1 - i] = (byte) value;
        value >>= 8;
        ++i;
    }
}
That code does not look appetizing. BigInteger has the problem of exposing its internal value only as a big-endian byte[], where the first byte is the most significant one.
Much better would be to have the digits in base N. That is not unimaginable: if N is a power of 2, some nice optimizations are feasible.
(BTW, the code is untested, as it does not seem convincingly faster.)

First, the bad news: I couldn't find any existing Java libraries that provided this functionality.
I couldn't find any pure-Java big integer libraries ... apart from java.math.BigInteger.
There are Java / JNI wrappers for the GMP library, but GMP doesn't implement this either.
So what are your options?
Maybe there is some pure-Java library that I missed.
Maybe some other native (C / C++) big integer library supports this operation ... though you may need to write your own JNI wrappers.
You should be able to implement such a method for yourself, by copying the source code of java.math.BigInteger and adding an extra custom method. Alternatively, it looks like you could extend it.
Having said that, I'm not sure that there is a "substantially faster" algorithm for computing a * b mod n in Java, or any other language. (Apart from special cases, e.g. when n is a power of 2.)
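That special case is worth spelling out: when n = 2^k, the reduction is a bit mask, so the oversized intermediate never needs a division. A minimal sketch (the method name is mine):

import java.math.BigInteger;

// a * b mod 2^k, masking the operands first so the intermediate
// product is at most ~2k bits instead of the full product size.
static BigInteger mulModPowerOfTwo(BigInteger a, BigInteger b, int k) {
    BigInteger mask = BigInteger.ONE.shiftLeft(k).subtract(BigInteger.ONE);
    return a.and(mask).multiply(b.and(mask)).and(mask);
}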
Specifically, the "Montgomery Reduction" approach wouldn't help for a single multiplication step. (The Wikipedia page says: "Because numbers have to be converted to and from a particular form suitable for performing the Montgomery step, a single modular multiplication performed using a Montgomery step is actually slightly less efficient than a "naive" one.")
So maybe the most effective way to speedup the computation would be to use the JNI wrappers for GMP.

You can use generic maths, like:
(A*B) mod N = ((A mod N) * (B mod N)) mod N
It may be more CPU intensive, but one should choose between CPU and memory, right?
If we are talking about modular arithmetic then indeed Montgomery reduction may be what you need. Don't know any out of box solutions though.
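For reference, a minimal, untuned sketch of Montgomery multiplication on top of BigInteger (assuming n is odd, so R = 2^k is coprime to n; all method names are mine). Note the conversions into and out of Montgomery form: they are exactly the overhead that makes a single multiplication slower this way, as the previous answer quotes.

import java.math.BigInteger;

static BigInteger mulModMontgomery(BigInteger a, BigInteger b, BigInteger n) {
    int k = n.bitLength();
    BigInteger rMask = BigInteger.ONE.shiftLeft(k).subtract(BigInteger.ONE); // x & rMask == x mod R
    BigInteger nPrime = n.negate().modInverse(BigInteger.ONE.shiftLeft(k));  // n' = -n^-1 mod R
    BigInteger aBar = a.shiftLeft(k).mod(n); // a * R mod n (conversion in)
    BigInteger bBar = b.shiftLeft(k).mod(n); // b * R mod n (conversion in)
    BigInteger abBar = redc(aBar.multiply(bBar), n, nPrime, rMask, k); // = a*b*R mod n
    return redc(abBar, n, nPrime, rMask, k); // second REDC strips the last factor of R
}

// REDC: returns t * R^-1 mod n for 0 <= t < R * n, using only masks and
// shifts where a naive reduction would divide by R.
static BigInteger redc(BigInteger t, BigInteger n, BigInteger nPrime, BigInteger rMask, int k) {
    BigInteger m = t.and(rMask).multiply(nPrime).and(rMask); // m = (t mod R) * n' mod R
    BigInteger u = t.add(m.multiply(n)).shiftRight(k);       // t + m*n is divisible by R
    return u.compareTo(n) >= 0 ? u.subtract(n) : u;
}

The payoff comes when many multiplications share the same modulus (as in modPow), so the conversions are amortized.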

You can write a BigInteger multiplication as a standard long multiplication in a very large base -- for example, in base 2^32. It is fairly straightforward. If you want only the result modulo n, then it is advantageous to choose a base that is a factor of n or of which n is a factor. Then you can ignore all but one or a few of the lowest-order result (Big)digits as you perform the computation, saving space and maybe time.
That's most practical if you know n in advance, of course, but such pre-knowledge is not essential. It's especially nice if n is a power of two, and it's fairly messy if n is neither a power of 2 nor smaller than the maximum operand handled directly by the system's arithmetic unit, but all of those cases can be handled in principle.
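A sketch of that idea (assuming non-negative operands stored as little-endian int[] digits in base 2^32; illustrative, not hardened): the inner loops simply never produce result words at index len or above, which performs the reduction for free in the special case n = 2^(32·len).

// Schoolbook multiplication in base 2^32, keeping only the low `len` words.
// Dropping the higher words is a correct reduction only when n == 2^(32 * len);
// for general n you still need the full product (or an interleaved reduction).
static int[] multiplyLowWords(int[] a, int[] b, int len) {
    int[] result = new int[len];
    for (int i = 0; i < len && i < a.length; i++) {
        long ai = a[i] & 0xFFFFFFFFL; // treat the word as unsigned
        long carry = 0;
        for (int j = 0; i + j < len && j < b.length; j++) {
            long sum = ai * (b[j] & 0xFFFFFFFFL) + (result[i + j] & 0xFFFFFFFFL) + carry;
            result[i + j] = (int) sum; // low 32 bits
            carry = sum >>> 32;        // high 32 bits
        }
        // any carry out of word len-1 is exactly the discarded high part
    }
    return result;
}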
If you must do this specifically with Java BigInteger instances, however, then be aware that any approach not provided by the BigInteger class itself will incur overhead for converting between internal and external representations.

Maybe this:
static BigInteger multiply(BigInteger c, BigInteger x)
{
    BigInteger sum = BigInteger.ZERO;
    BigInteger addOperand;
    for (int i = 0; i < FIELD_ELEMENT_BIT_SIZE; i++)
    {
        if (c.testBit(i))
            addOperand = x;
        else
            addOperand = BigInteger.ZERO;
        sum = add(sum, addOperand);
        // Double x for the next bit, reducing so it stays below FIELD_ORDER:
        x = modOrder(x.shiftLeft(1));
    }
    return sum;
}
with the following helper functions:
static BigInteger add(BigInteger a, BigInteger b)
{
    return modOrder(a.add(b));
}

static BigInteger modOrder(BigInteger n)
{
    return n.remainder(FIELD_ORDER);
}
To be honest though, I'm not sure if this is really efficient at all since none of these operations are performed in-place.
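(The snippet assumes two surrounding constants; hypothetical declarations just to make it self-contained — substitute your actual field modulus:)

// Placeholders: any positive modulus works for the arithmetic above.
static final BigInteger FIELD_ORDER = BigInteger.valueOf(1000000007L);
static final int FIELD_ELEMENT_BIT_SIZE = FIELD_ORDER.bitLength();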

Related

ByteArray to DoubleArray in Kotlin

I want to convert a byte array into a double array.
I know how to do it in Java, but the Android Studio converter doesn't work for this... :-D
This is the class I want to write in Kotlin:
class ByteArrayToDoubleArrayConverter {
    public double[] invoke(byte[] bytes) {
        double[] doubles = new double[bytes.length / 2];
        int i = 0;
        for (int n = 0; n < bytes.length; n = n + 2) {
            doubles[i] = (bytes[n] & 0xFF) | (bytes[n + 1] << 8);
            i = i + 1;
        }
        return doubles;
    }
}
This would be a typical example of what results are expected:
class ByteArrayToDoubleArrayConverterTest {
    @Test
    fun `check typical values`() {
        val bufferSize = 8
        val bytes = ByteArray(bufferSize)
        bytes[0] = 1
        bytes[1] = 0
        bytes[2] = 0
        bytes[3] = 1
        bytes[4] = 0
        bytes[5] = 2
        bytes[6] = 1
        bytes[7] = 1
        val doubles = ByteArrayToDoubleArrayConverter().invoke(bytes)
        assertTrue(1.0 == doubles[0])
        assertTrue(256.0 == doubles[1])
        assertTrue(512.0 == doubles[2])
        assertTrue(257.0 == doubles[3])
    }
}
Any idea? Thanks!!!
I think this would be clearest with a helper function.  Here's an extension function that uses a lambda to convert pairs of bytes into a DoubleArray:
inline fun ByteArray.mapPairsToDoubles(block: (Byte, Byte) -> Double)
        = DoubleArray(size / 2) { i -> block(this[2 * i], this[2 * i + 1]) }
That uses the DoubleArray constructor which takes an initialisation lambda as well as a size, so you don't need to loop through setting values after construction.
The required function then simply needs to know how to convert each pair of bytes into a double.  Though it would be more idiomatic as an extension function rather than a class:
fun ByteArray.toDoubleSamples() = mapPairsToDoubles { a, b ->
    (a.toInt() and 0xFF or (b.toInt() shl 8)).toDouble()
}
You can then call it with e.g.:
bytes.toDoubleSamples()
(.toXxx() is the conventional name for a function which returns a transformed version of an object.  The standard name for this sort of function would be toDoubleArray(), but that normally converts each value to its own double; what you're doing is more specialised, so a more specialised name would avoid confusion.)
The only awkward thing there (and the reason why the direct conversion from Java fails) is that Kotlin is much more fussy about its numeric types, and won't automatically promote them the way Java and C do; it also doesn't have byte overloads for its bitwise operators.  So you need to call toInt() explicitly on each byte before you can call and and shl, and then call toDouble() on the result.
The result is code that is a lot shorter, hopefully much more readable, and also very efficient!  (No intermediate arrays or lists, and — thanks to the inline — not even any unnecessary function calls.)
(It's a bit more awkward than most Kotlin code, as primitive arrays aren't as well-supported as reference-based arrays — which are themselves not as well-supported as lists.  This is mainly for legacy reasons to do with Java compatibility.  But it's a shame that there's no chunked() implementation for ByteArray, which could have avoided the helper function, though at the cost of a temporary list.)

How to calculate 2 to-the-power N where N is a very large number

I need to find 2 to-the-power N where N is a very large number (Java BigInteger type)
Java BigInteger Class has pow method but it takes only integer value as exponent.
So, I wrote a method as follows:
static BigInteger twoToThePower(BigInteger n)
{
    BigInteger result = BigInteger.valueOf(1L);
    while (n.compareTo(BigInteger.valueOf((long) Integer.MAX_VALUE)) > 0)
    {
        result = result.shiftLeft(Integer.MAX_VALUE);
        n = n.subtract(BigInteger.valueOf((long) Integer.MAX_VALUE));
    }
    long k = n.longValue();
    result = result.shiftLeft((int) k);
    return result;
}
My code works fine; I am just sharing my idea and am curious to know whether there is a better one.
Thank you.
You cannot use BigInteger to store the result of your computation. From the javadoc :
BigInteger must support values in the range -2^Integer.MAX_VALUE (exclusive) to +2^Integer.MAX_VALUE (exclusive) and may support values outside of that range.
This is the reason why the pow method takes an int. On my machine, BigInteger.ONE.shiftLeft(Integer.MAX_VALUE) throws a java.lang.ArithmeticException (message is "BigInteger would overflow supported range").
Emmanuel Lonca's answer is correct. But, following Manoj Banik's idea, I would like to share mine too.
My code does the same thing as Manoj Banik's, but faster. The idea is to initialize the buffer and put the 1 bit in the correct location, using the shift-left operator on one byte instead of the shiftLeft method.
Here is my code:
static BigInteger twoToThePower(BigInteger n) {
    BigInteger eight = BigInteger.valueOf(8);
    BigInteger[] divideResult = n.divideAndRemainder(eight);
    BigInteger bufferSize = divideResult[0].add(BigInteger.ONE);
    int offset = divideResult[1].intValue();
    byte[] buffer = new byte[bufferSize.intValueExact()];
    buffer[0] = (byte) (1 << offset); // buffer[0] is the most significant byte
    return new BigInteger(1, buffer);
}
But it is still slower than BigInteger.pow.
Then I found that class BigInteger has a method called setBit. Like pow, it accepts an int parameter. Using this method is faster than BigInteger.pow.
The code can be:
static BigInteger twoToThePower(BigInteger n) {
    return BigInteger.ZERO.setBit(n.intValueExact());
}
Class BigInteger also has a modPow method, but it needs one more parameter: you must specify the modulus, and your result will be smaller than that modulus. I did not do a performance test for modPow, but I think it should be slower than the pow method.
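(If a modulus m is acceptable for your use case, the call is simply the following, and it never materializes the full power:)

BigInteger residue = BigInteger.valueOf(2).modPow(n, m); // 2^n mod m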
By using repeated squaring you can achieve your goal. I've posted sample code below to illustrate the logic of repeated squaring.
static BigInteger pow(BigInteger base, BigInteger exponent) {
    BigInteger result = BigInteger.ONE;
    while (exponent.signum() > 0) {
        if (exponent.testBit(0)) result = result.multiply(base);
        base = base.multiply(base);
        exponent = exponent.shiftRight(1);
    }
    return result;
}
An interesting question. Just to add a little more information to the fine accepted answer, examining the OpenJDK 8 source code for BigInteger reveals that the bits are stored in an array final int[] mag;. Since arrays can contain at most Integer.MAX_VALUE elements, this immediately puts a theoretical bound on this particular implementation of BigInteger of 2^(32 * Integer.MAX_VALUE). So even your method of repeated left-shifting can only exceed the size of an int by at most a factor of 32.
So, are you ready to produce your own implementation of BigInteger?

more efficient Fibonacci for BigInteger

I am working on a class project to create a more efficient Fibonacci than the recursive version of Fib(n-1) + Fib(n-2). For this project I need to use BigInteger. So far I have had the idea to use a map to store the previous fib numbers.
public static BigInteger theBigFib(BigInteger n) {
    Map<BigInteger, BigInteger> store = new TreeMap<BigInteger, BigInteger>();
    if (n.intValue() <= 2) {
        return BigInteger.ONE;
    } else if (store.containsKey(n)) {
        return store.get(n);
    } else {
        BigInteger one = new BigInteger("1");
        BigInteger two = new BigInteger("2");
        BigInteger val = theBigFib(n.subtract(one)).add(theBigFib(n.subtract(two)));
        store.put(n, val);
        return val;
    }
}
I think that the map is storing more than it should be. I also think this line
BigInteger val = theBigFib(n.subtract(one)).add(theBigFib(n.subtract(two)));
is an issue. If anyone could shed some light on what I'm doing wrong, or possibly suggest another solution to make it faster than the basic code, that would be great.
Thanks!
You don't need all the previous BigIntegers, you just need the last 2.
Instead of a recursive solution you can use a loop.
public static BigInteger getFib(int n) {
    BigInteger a = BigInteger.ONE;
    BigInteger b = BigInteger.ONE;
    if (n < 2) {
        return a;
    }
    BigInteger c = null;
    while (n-- >= 2) {
        c = a.add(b);
        a = b;
        b = c;
    }
    return c;
}
If you want to store all the previous values, you can use an array instead.
static final int MAX = 10000; // assumed upper bound on n; adjust as needed
static BigInteger[] memo = new BigInteger[MAX];

public static BigInteger getFib(int n) {
    if (n < 2) {
        return BigInteger.ONE;
    }
    if (memo[n] != null) {
        return memo[n];
    }
    memo[n] = getFib(n - 1).add(getFib(n - 2));
    return memo[n];
}
If you just want the nth Fibonacci value fast and efficiently, you can use the matrix form of Fibonacci:
A   = | 1  1 |
      | 1  0 |

A^n = | F(n+1)  F(n)   |
      | F(n)    F(n-1) |
You can efficiently calculate A^n using Exponentiation by Squaring.
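A minimal sketch of that approach with BigInteger (method names are mine; fib(n) returns F(n) as the top-right entry of A^n):

import java.math.BigInteger;

// 2x2 matrix product over BigInteger.
static BigInteger[][] matMul(BigInteger[][] x, BigInteger[][] y) {
    return new BigInteger[][] {
            { x[0][0].multiply(y[0][0]).add(x[0][1].multiply(y[1][0])),
              x[0][0].multiply(y[0][1]).add(x[0][1].multiply(y[1][1])) },
            { x[1][0].multiply(y[0][0]).add(x[1][1].multiply(y[1][0])),
              x[1][0].multiply(y[0][1]).add(x[1][1].multiply(y[1][1])) }
    };
}

// F(n) via exponentiation by squaring of A = [[1,1],[1,0]].
static BigInteger fib(int n) {
    BigInteger[][] result = { { BigInteger.ONE, BigInteger.ZERO },
                              { BigInteger.ZERO, BigInteger.ONE } }; // identity matrix
    BigInteger[][] base = { { BigInteger.ONE, BigInteger.ONE },
                            { BigInteger.ONE, BigInteger.ZERO } };
    while (n > 0) {
        if ((n & 1) == 1) result = matMul(result, base); // odd bit: multiply it in
        base = matMul(base, base);                       // square for the next bit
        n >>= 1;
    }
    return result[0][1]; // top-right entry of A^n is F(n)
}

This uses O(log n) matrix multiplications, although each multiplication itself gets more expensive as the numbers grow.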
I believe the main issue in your code is that you create a new Map on each function call. Note that it is still a local variable, despite your method being static. So you are guaranteed that the store.containsKey(n) condition never holds, and your solution is no better than the naive one, i.e. it still has complexity exponential in n. More precisely, it takes about F(n) steps to get to the answer (basically because every "one" that makes up your answer is returned by some function call).
I'd suggest making the variable a static field instead of a local variable (see the sketch below); the number of calls then becomes linear instead of exponential and you will see a significant improvement. Other solutions include a for loop with three variables which iteratively calculates Fibonacci numbers from 0, 1, 2 up to the n-th, and the best solutions I know involve matrix exponentiation or the explicit formula with real numbers (which is bad for precision), but those are questions better suited to the Computer Science StackExchange site, imho.
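A minimal sketch of that fix (hoisting the map to a static field; everything else is as in the question):

import java.math.BigInteger;
import java.util.Map;
import java.util.TreeMap;

// The cache now survives across calls, so each value is computed once.
private static final Map<BigInteger, BigInteger> store = new TreeMap<>();

public static BigInteger theBigFib(BigInteger n) {
    if (n.intValue() <= 2) return BigInteger.ONE;
    BigInteger cached = store.get(n);
    if (cached != null) return cached;
    BigInteger val = theBigFib(n.subtract(BigInteger.ONE))
            .add(theBigFib(n.subtract(BigInteger.valueOf(2))));
    store.put(n, val);
    return val;
}

(Recursion depth still grows linearly with n, so for very large n the iterative version above is safer.)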

Using XOR Shift as a faster CRC32 checksum?

Is it valid to use XOR shift to produce a usable checksum? I can't find any evidence that it collides more than, say, CRC32.
I did run a simulation on 10 million randomly generated 8 to 32 length byte arrays and the hash32 method below actually produced 2% less collisions than CRC32.
Also, the code seems to run about 40x faster than Java's built-in util.zip.CRC32 class.
public static long hash64(byte[] bytes)
{
    long x = 1;
    for (int i = 0; i < bytes.length; i++)
    {
        x ^= bytes[i];
        x ^= (x << 21);
        x ^= (x >>> 35);
        x ^= (x << 4);
    }
    return x;
}

public static int hash32(byte[] bytes)
{
    int x = 1;
    for (int i = 0; i < bytes.length; i++)
    {
        x ^= bytes[i];
        x ^= (x << 13);
        x ^= (x >>> 17);
        x ^= (x << 5);
    }
    return x;
}
Yes, if all you need is a simple file checksum, it's a completely valid alternative, but it's not the best solution.
CRCs are optimized for reliably detecting burst errors, not collision resistance or uniform distribution. CRC-32 may superficially appear to work as a general hash function or a checksum, but it readily fails avalanche and collision tests, as you've seen in your test. CRC is also quite slow because it must implement polynomial division, which requires expensive operations, even when heavily optimized into shift operations. Table versions of CRC which utilize lookup tables (LUT) are also slow in interpreted languages such as Java due to unavoidable bounds-checking and conditional checks under the hood for each lookup.
Your solution is to take Xorshift, a pseudorandom function (PRF), and transform it into a hash function. On the surface, this may seem to pass basic collision tests, but it is not a very good choice. Its avalanche behavior is quite poor, and so there is a greater-than-chance probability of collisions that your tests aren't sensitive enough to find. Not only that, but it is sub-optimal, reading only one byte at a time. Better solutions exist with comparable performance.
A much better choice is 64-bit MurmurHash3, which performs quite well in Java when sufficiently optimized; it may even be faster than your solution for large inputs. I also recommend reading Bret Mulvey's article on hash functions. It explains how hash functions are constructed and tested in a digestible way.
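To make the contrast concrete, here is fmix64, the 64-bit finalizer used by MurmurHash3 (constants from Austin Appleby's public-domain reference implementation), shown only to illustrate the kind of multiply-xorshift mixing that passes avalanche tests where a bare xorshift step does not:

// MurmurHash3's 64-bit finalizer: flipping any input bit flips each
// output bit with ~50% probability, which is what avalanche tests measure.
static long fmix64(long h) {
    h ^= h >>> 33;
    h *= 0xff51afd7ed558ccdL;
    h ^= h >>> 33;
    h *= 0xc4ceb9fe1a85ec53L;
    h ^= h >>> 33;
    return h;
}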

BigInteger most time optimized multiplication

Hi, I want to multiply two big integers in the most time-optimized way. I am currently using the Karatsuba algorithm. Can anyone suggest a more optimized way or algorithm to do it?
Thanks
public static BigInteger karatsuba(BigInteger x, BigInteger y) {
    // cutoff to brute force
    int N = Math.max(x.bitLength(), y.bitLength());
    System.out.println(N); // debug output; remove when timing
    if (N <= 2000) return x.multiply(y); // optimize this parameter
    // number of bits divided by 2, rounded up
    N = (N / 2) + (N % 2);
    // x = a + 2^N b,  y = c + 2^N d
    BigInteger b = x.shiftRight(N);
    BigInteger a = x.subtract(b.shiftLeft(N));
    BigInteger d = y.shiftRight(N);
    BigInteger c = y.subtract(d.shiftLeft(N));
    // compute sub-expressions
    BigInteger ac = karatsuba(a, c);
    BigInteger bd = karatsuba(b, d);
    BigInteger abcd = karatsuba(a.add(b), c.add(d));
    return ac.add(abcd.subtract(ac).subtract(bd).shiftLeft(N)).add(bd.shiftLeft(2 * N));
}
The version of BigInteger in JDK 8 switches between the naive algorithm, Karatsuba, and the Toom-Cook algorithm, depending on the size of the input, to get excellent performance.
Complexity and actual speed are very different things in practice, because of the constant factors involved in the O notation. There is always a point where complexity prevails, but it may very well be out of the range (of input size) you are working with. The implementation details (level of optimization) of an algorithm also directly affect those constant factors.
My suggestion is to try a few different algorithms, preferably from a library that the authors have already spent some effort optimizing, and actually measure and compare their speeds on your inputs, as in the sketch below.
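For example, a crude harness along these lines (not a rigorous benchmark; a serious comparison would use JMH and several input sizes, and you should remove the debug println from karatsuba first):

import java.math.BigInteger;
import java.util.Random;

public static void main(String[] args) {
    Random rnd = new Random(42);
    BigInteger x = new BigInteger(1 << 20, rnd); // ~1M-bit random operands
    BigInteger y = new BigInteger(1 << 20, rnd);
    for (int i = 0; i < 5; i++) { // crude JIT warm-up
        x.multiply(y);
        karatsuba(x, y);
    }
    long t0 = System.nanoTime();
    BigInteger p1 = x.multiply(y);
    long t1 = System.nanoTime();
    BigInteger p2 = karatsuba(x, y);
    long t2 = System.nanoTime();
    System.out.printf("built-in: %.1f ms, karatsuba: %.1f ms, results equal: %b%n",
            (t1 - t0) / 1e6, (t2 - t1) / 1e6, p1.equals(p2));
}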
Regarding SPOJ, don't forget the possibility that the main problem lies elsewhere (i.e. not in the multiplication speed of large integers).
