What do I do when a Long isn't long enough (Java)?


So, I'm working through the bottom of Project Euler. There are a lot of great problems there (most of which are way above my level). But I very quickly came across a major problem:
Java will not do huge numbers.
Now, if I had the ability to make fast programs taking alternative routes, this would not be a problem for me. Unfortunately, I am not that person.
Take, for example, Problem 20. It asks for the digit sum of 100!. I hadn't even started on the digit parsing; just my code to find the factorial failed.
long onehfact = 1;
for(int i = 1; i <= 100; i++){
onehfact = onehfact * i;
System.out.println(onehfact);
}
This worked for about 20 iterations, but then started giving random-ish 19 digit numbers. By 75 iterations, it was just giving me zero.
100! is definitely not zero.
I need a way to have incredibly large numbers, and scientific notation will not work for my purposes. Is there a large variable type that I could use, that can hold numbers as large as 100-150 digits? Is there any other solution that would have the same result?

There is a class called BigInteger that will allow you to handle and perform operations on numbers of any size. It will allocate a variable amount of memory for the digits.
Here is the documentation. The class is in the java.math package:
http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html
I worked through some problems in Project Euler in Java, too. For some of the other problems you are dealing with large non-integer numbers and you need the related class java.math.BigDecimal:
http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html

You can use BigInteger to satisfy your requirement.
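import java.math.BigInteger;

BigInteger onehfact = BigInteger.valueOf(1L);
for (long i = 1; i <= 100; i++) {
    // multiply() takes a BigInteger, so wrap the loop counter first
    onehfact = onehfact.multiply(BigInteger.valueOf(i));
    System.out.println(onehfact);
}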
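To finish Problem 20 from here, the digits of the result still have to be summed. The following is a minimal sketch, not the only way to do it; the class and method names (FactorialDigitSum, factorial, digitSum) are made up for illustration, and digitSum assumes a non-negative value.

```java
import java.math.BigInteger;

class FactorialDigitSum {
    // Computes n! exactly; BigInteger grows as needed, so nothing overflows.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Sums the decimal digits by walking the string form (non-negative input assumed).
    static int digitSum(BigInteger value) {
        int sum = 0;
        for (char c : value.toString().toCharArray()) {
            sum += c - '0';
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(digitSum(factorial(100))); // prints 648
    }
}
```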

Related

Is there a way to pow 2 BigInteger Numbers in java?

I have to pow a bigInteger number with another BigInteger number.
Unfortunately, only one BigInteger.pow(int) is allowed.
I have no clue on how I can solve this problem.

I have to pow a bigInteger number with another BigInteger number.
No, you don't.
You read a crypto spec and it seemed to say that. But that's not what it said; you didn't read carefully enough. The mathematical 'universe' that the math in the paper / spec you're reading operates in is different from normal math. It's a modulo-space. All operations are implicitly performed modulo X, where X is some number the crypto algorithm explains.
You can do that just fine.
Alternatively, the spec is quite clear and says something like: C = (A^B) % M and you've broken that down in steps (... first, I must calculate A to the power of B. I'll worry about what the % M part is all about later). That's not how that works - you can't lop that operation into parts. (A^B) % M is quite doable, and has its own efficient algorithm. (A^B) is simply not calculable without a few years worth of the planet's entire energy and GDP output.
The reason I know that must be what you've been reading, is because (A ^ B) % M is a common operation in crypto. (Well, that, and the simple fact that A^B can't be done).
Just to be crystal clear: When I say impossible, I mean it in the same way 'travelling faster than the speed of light' is impossible. It's a law in the physics sense of the word: if you really want plain A^B, outside a modspace, with B so large that it doesn't fit in an int, a computer cannot calculate it, and the result would be gigabytes large at the very least. An int can hold about 9 digits' worth. Just for fun, imagine doing X^Y where both X and Y are 20 digit numbers.
The result would have 10^21 digits.
That's roughly equal to the total amount of disk space available worldwide. 10^12 is a terabyte. You're asking to calculate a number where, forget about calculating it, merely storing it requires one thousand million harddisks each of 1TB.
Thus, I'm 100% certain that you do not want what you think you want.
TIP: If you can't follow the math (which is quite bizarre; it's not like you get modulo-space math in your basic AP math class!), generally rolling your own implementation of a crypto algorithm isn't going to work out. The problem with crypto is, if you mess up, often a unit test cannot catch it. No; someone will hack your stuff and then you know, and that's a high price to pay. Rely on experts to build the algorithm, spend your time ensuring the protocol is correct (which is still quite difficult to get right, don't take that lightly!). If you insist, make dang sure you have a heap of plaintext+keys / encrypted (or plaintext / hashed, or whatever it is you're doing) pairs to test against, and assume that whatever you wrote, even if it passes those tests, is still insecure because e.g. it is trivial to leak the key out of your algorithm using timing attacks.

Since you want to use it in a modulo operation with a prime number anyway, as @Progman said in the comments, you can use modPow().
Below is some example code:
// Create BigInteger objects
BigInteger biginteger1, biginteger2, exponent, result;
//prime number
int pNumber = 5;
// Initializing all BigInteger objects
biginteger1 = new BigInteger("23895");
biginteger2 = BigInteger.valueOf(pNumber);
exponent = new BigInteger("15");
// Perform modPow operation on the objects and exponent
result = biginteger1.modPow(exponent, biginteger2);
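For a feel of why (A^B) % M is cheap while A^B is not, here is a rough sketch of square-and-multiply exponentiation. It is only an illustration under stated assumptions (exponent >= 0, modulus > 0); the class name ModPowSketch is hypothetical, and in real code the built-in BigInteger.modPow should be preferred.

```java
import java.math.BigInteger;

class ModPowSketch {
    // Right-to-left binary exponentiation. Every product is reduced modulo m
    // immediately, so intermediate values stay below m * m.
    // Assumes exponent >= 0 and m > 0.
    static BigInteger modPow(BigInteger base, BigInteger exponent, BigInteger m) {
        BigInteger result = BigInteger.ONE.mod(m);
        BigInteger b = base.mod(m);
        BigInteger e = exponent;
        while (e.signum() > 0) {
            if (e.testBit(0)) {               // lowest bit set: multiply this power in
                result = result.multiply(b).mod(m);
            }
            b = b.multiply(b).mod(m);         // square for the next bit
            e = e.shiftRight(1);
        }
        return result;
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("23895");
        BigInteger exp = new BigInteger("15");
        BigInteger p = BigInteger.valueOf(5);
        System.out.println(modPow(a, exp, p));   // same value as a.modPow(exp, p)
    }
}
```

The loop runs once per bit of the exponent, so even a 20 digit exponent needs fewer than 70 iterations.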

Fermats factorization method not functioning

I'm working on a program to compare different algorithms for factorization of large integers. One of the algorithms I'm including in the comparison is Fermat's factorization method. The algorithm seems to work just fine for small numbers, but when I get larger numbers I get weird results.
Here's my code:
public void fermat(long n)
{
ArrayList<Long> factors = new ArrayList<Long>();
a = (long)Math.ceil(Math.sqrt(n));
b = a*a - n;
b_root = (long)(Math.sqrt(b)+0.5);
while(b_root*b_root != b)
{
a++;
b = a*a - n;
b_root = (long)(Math.sqrt(b)+0.5);
}
factors.add(a-b_root);
factors.add(a+b_root);
}
Now, when I try to factor 42139523531366663 I get the resulting factors 6194235479 and 2984853201, which is incorrect since 6194235479 * 2984853201 = 18488883597240918279. I figured that I got this result because somewhere in the algorithm I got to a point where the numbers became too big for a long or something similar, so the algorithm got a bit messed up because of that. I added a check which calculated the product of the two factors and compared with the input value, so that I'd get an alert if the factorization was faulty:
long x,y;
x = factors.get(0);
y = factors.get(1);
if(x*y!=n)
System.out.println("Faulty factorization.");
Interestingly enough, the check passed as true and I didn't get the alert. I tried just printing the result of the multiplication and this actually resulted in the input value. So my question is why does my program behave like this, and what can I do about it?

It looks like there is an overflow in a long somewhere, because longs have 64 bits and
42139523531366663 + 2^64 = 18488883597240918279
For sufficiently large numbers, you may need to switch to using BigInteger.
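A BigInteger version of the same method might look like the sketch below. It assumes Java 9 or later (for BigInteger.sqrt and sqrtAndRemainder) and an odd n greater than 1; the class name FermatBigInteger is made up. It removes the overflow, but it does not make Fermat's method any faster when the two factors are far apart.

```java
import java.math.BigInteger;

class FermatBigInteger {
    // Returns {a - b, a + b} with (a - b) * (a + b) == n.
    // Assumes n is odd and greater than 1; requires Java 9+.
    static BigInteger[] fermat(BigInteger n) {
        BigInteger a = n.sqrt();                      // floor(sqrt(n))
        if (a.multiply(a).compareTo(n) < 0) {
            a = a.add(BigInteger.ONE);                // round up to ceil(sqrt(n))
        }
        while (true) {
            BigInteger b2 = a.multiply(a).subtract(n);
            BigInteger[] rootAndRem = b2.sqrtAndRemainder();
            if (rootAndRem[1].signum() == 0) {        // b2 is a perfect square
                BigInteger b = rootAndRem[0];
                return new BigInteger[] { a.subtract(b), a.add(b) };
            }
            a = a.add(BigInteger.ONE);
        }
    }

    public static void main(String[] args) {
        BigInteger[] f = fermat(BigInteger.valueOf(5959));
        System.out.println(f[0] + " * " + f[1]);      // prints 59 * 101
    }
}
```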

Is it because there's an error in multiplying large numbers too?
That may well be the reason. The multiplication in the check overflows in the same way, which is what makes the program think that its factorization is right; when you multiply the numbers outside the program, you discover the error.
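The wrap-around can be reproduced with the two factors from the question. This is a small demonstration only; the class and method names (LongOverflowDemo, overflows) are invented here. Math.multiplyExact is a standard library method (Java 8+) that throws instead of silently wrapping.

```java
import java.math.BigInteger;

class LongOverflowDemo {
    // True if x * y does not fit in a long.
    static boolean overflows(long x, long y) {
        try {
            Math.multiplyExact(x, y);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long x = 6194235479L;
        long y = 2984853201L;
        // Plain long multiplication wraps modulo 2^64 without any warning.
        System.out.println(x * y);                    // prints 42139523531366663
        System.out.println(overflows(x, y));          // prints true
        // BigInteger gives the true product.
        System.out.println(BigInteger.valueOf(x).multiply(BigInteger.valueOf(y)));
        // prints 18488883597240918279
    }
}
```

Using Math.multiplyExact (or BigInteger) in the x*y != n check would have raised the alert the question expected.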

library for integer factorization in java or scala

There are a lot of questions about how to implement factorization, however for production use, I would rather use an open source library to get something efficient and well tested right away.
The method I am looking for looks like this:
static int[] getPrimeFactors(int n)
it would return {2,2,3} for n=12
A library may also have an overload for handling long or even BigInteger types
The question is not about a particular application; it is about having a library which handles this problem well. Many people argue that different implementations are needed depending on the range of the numbers; in this regard, I would expect the library to select the most reasonable method at runtime.
By efficient I don't mean "world's fastest" (I would not work on the JVM for that...), I just mean dealing with the int and long range within a second rather than an hour.

It depends what you want to do. If your needs are modest (say, you want to solve Project Euler problems), a simple implementation of Pollard's rho algorithm will find factors up to ten or twelve digits instantly; if that's what you want, let me know, and I can post some code. If you want a more powerful factoring program that's written in Java, you can look at the source code behind Dario Alpern's applet; I don't know about a test suite, and it's really not designed with an open api, but it does have lots of users and is well tested. Most of the heavy-duty open-source factoring programs are written in C or C++ and use the GMP big-integer library, but you may be able to access them via your language's foreign function interface; look for names like gmp-ecm, msieve, pari or yafu. If those don't satisfy you, a good place to ask for more help is the Mersenne Forum.
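For reference, a simple Pollard's rho along those lines could look roughly like this. It is a sketch, not the answerer's code: it uses BigInteger to sidestep overflow in the modular multiplication, the names PollardRho, rho, step and findFactor are made up, and findFactor assumes its input is composite (it would loop forever on a prime, so test with isProbablePrime first).

```java
import java.math.BigInteger;

class PollardRho {
    // One step of the pseudo-random map x -> x^2 + c (mod n).
    static BigInteger step(BigInteger x, BigInteger c, BigInteger n) {
        return x.multiply(x).add(c).mod(n);
    }

    // Floyd cycle detection. Returns a divisor of n greater than 1;
    // the result is n itself when this choice of c fails.
    static BigInteger rho(BigInteger n, BigInteger c) {
        BigInteger x = BigInteger.valueOf(2);
        BigInteger y = x;
        BigInteger d = BigInteger.ONE;
        while (d.equals(BigInteger.ONE)) {
            x = step(x, c, n);                       // tortoise: one step
            y = step(step(y, c, n), c, n);           // hare: two steps
            d = x.subtract(y).abs().gcd(n);
        }
        return d;
    }

    // Retries with c = 1, 2, 3, ... until a nontrivial factor appears.
    // Assumes n is composite.
    static BigInteger findFactor(BigInteger n) {
        if (!n.testBit(0)) {
            return BigInteger.valueOf(2);
        }
        for (long c = 1; ; c++) {
            BigInteger d = rho(n, BigInteger.valueOf(c));
            if (!d.equals(n)) {
                return d;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(findFactor(BigInteger.valueOf(8051)));
    }
}
```

To get a full factorization, apply findFactor recursively to the factor and the cofactor until every piece passes a primality test.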

If you want to solve your problem, rather than get what you are asking for, you want a table. You can precompute it using silly slow methods, store it, and then look up the factors for any number in microseconds. In particular, you want a table where the smallest factor is listed in an index corresponding to the number--much more memory efficient if you use trial division to remove a few of the smallest primes--and then walk your way down the table until you hit a 1 (meaning no more divisors; what you have left is prime). This will take only two bytes per table entry, which means you can store everything on any modern machine more hefty than a smartphone.
I can demonstrate how to create this if you're interested, and show how to check that it is correct with greater reliability than you could hope to achieve with an active community and unit tests of a complex algorithm (unless you ran the algorithm to generate this table and verified that it was all ok).
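A minimal sketch of such a table, assuming a modest limit that fits in memory: it is built with a sieve rather than trial division, and it stores plain int entries instead of the two-byte compressed form described above. The class name SmallestFactorTable is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SmallestFactorTable {
    // smallest[n] holds the smallest prime factor of n, for 2 <= n <= limit.
    private final int[] smallest;

    SmallestFactorTable(int limit) {
        smallest = new int[limit + 1];
        for (int i = 2; i <= limit; i++) {
            if (smallest[i] == 0) {                  // untouched so far: i is prime
                for (int j = i; j <= limit; j += i) {
                    if (smallest[j] == 0) {
                        smallest[j] = i;
                    }
                }
            }
        }
    }

    // Walks down the table: record the smallest factor, divide it out, repeat.
    // Requires 1 <= n <= limit.
    List<Integer> primeFactors(int n) {
        List<Integer> factors = new ArrayList<>();
        while (n > 1) {
            factors.add(smallest[n]);
            n /= smallest[n];
        }
        return factors;
    }

    public static void main(String[] args) {
        SmallestFactorTable table = new SmallestFactorTable(1000);
        System.out.println(table.primeFactors(12)); // prints [2, 2, 3]
    }
}
```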

I need them for testing if a polynomial is primitive or not.
This is faster than trying to find the factors of all the numbers.
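public static boolean gcdIsOne(int[] nums) {
    // Any common divisor is at most the smallest positive entry.
    int smallest = Integer.MAX_VALUE;
    for (int num : nums) {
        if (num > 0 && num < smallest)
            smallest = num;
    }
    // Try 2 and then every odd candidate. The bound is smallest itself, not its
    // square root, because a common prime factor can exceed sqrt(smallest).
    OUTER:
    for (int i = 2; i <= smallest; i = (i == 2 ? 3 : i + 2)) {
        for (int num : nums) {
            if (num % i != 0)
                continue OUTER;
        }
        return false; // i divides every entry
    }
    return true;
}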
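A different technique for the same check, shown here only as an alternative and not as what the code above does, is Euclid's algorithm: fold the gcd over the array and stop as soon as it reaches 1. The class name GcdCheck is made up, and the sketch assumes no entry equals Integer.MIN_VALUE (whose absolute value does not fit in an int).

```java
class GcdCheck {
    // Euclid's algorithm on absolute values; gcd(0, x) is |x|.
    static int gcd(int a, int b) {
        a = Math.abs(a);
        b = Math.abs(b);
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // True if the entries share no common divisor greater than 1.
    static boolean gcdIsOne(int[] nums) {
        int g = 0;
        for (int num : nums) {
            g = gcd(g, num);
            if (g == 1) {
                return true;   // the gcd cannot grow again, so stop early
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(gcdIsOne(new int[] {6, 10, 15})); // prints true
        System.out.println(gcdIsOne(new int[] {14, 21}));    // prints false
    }
}
```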

I tried this function in scala. Here is my result:
def getPrimeFactores(i: Int) = {
def loop(i: Int, mod: Int, primes: List[Int]): List[Int] = {
if (i < 2) primes // might be i == 1 as well and means we are done
else {
if (i % mod == 0) loop(i / mod, mod, mod :: primes)
else loop(i, mod + 1, primes)
}
}
loop(i, 2, Nil).reverse
}
I tried to make it as functional as possible.
if (i % mod == 0) loop(i / mod, mod, mod :: primes) checks if we found a divisor. If we did we add it to primes and divide i by mod.
If we did not find a new divisor, we just increase the divisor.
loop(i, 2, Nil).reverse initializes the function and orders the result increasingly.

Finding a prime number at least a 100 digits long that contains 273042282802155991

I am new to Java and one of my class assignments is to find a prime number at least 100 digits long that contains the numbers 273042282802155991.
I have this so far but when I compile it and run it it seems to be in a continuous loop.
I'm not sure if I've done something wrong.
public static void main(String[] args) {
BigInteger y = BigInteger.valueOf(304877713615599127L);
System.out.println(RandomPrime(y));
}
public static BigInteger RandomPrime(BigInteger x)
{
BigInteger i;
for (i = BigInteger.valueOf(2); i.compareTo(x)<0; i.add(i)) {
if ((x.remainder(i).equals(BigInteger.ZERO))) {
x.divide(i).equals(x);
i.subtract(i);
}
}
return i;
}

Since this is homework ...
There is a method on BigInteger that tests for primality. This is much, much faster than attempting to factorize a number. (If you take an approach that involves attempting to factorize 100 digit numbers you will fail. Factorization is believed to be a hard problem, although it is not known to be NP-complete. Certainly, there is no known polynomial time solution.)
The question is asking for a prime number that contains a given sequence of digits when it is represented as a sequence of decimal digits.
The approach of generating "random" primes and then testing if they contain those digits is infeasible. (Some simple high-school maths tells you that the probability that a randomly generated 100 digit number contains a given 18 digit sequence is ... roughly 82 / 10^18. And you haven't tested for primality yet ...)
But there's another way to do it ... think about it!
Only start writing code once you've figured out in your head how your algorithm will work, and done the mental estimates to confirm that it will give an answer in a reasonable length of time.
When I say infeasible, I mean infeasible for you. Given a large enough number of computers, enough time and some high-powered mathematics, it may be possible to do some of these things. Thus, technically they may be computationally feasible. But they are not feasible as a homework exercise. I'm sure that the point of this exercise is to get you to think about how to do this the smart way ...

One tip is that these statements do nothing:
x.divide(i).equals(x);
i.subtract(i);
Same with part of your for loop:
i.add(i)
They don't modify the instances themselves, but return new values - values that you're failing to check and do anything with. BigIntegers are "immutable". They can't be changed - but they can be operated upon and return new values.
If you actually wanted to do something like this, you would have to do:
i = i.add(i);
Also, why would you subtract i from i? Wouldn't you always expect this to be 0?
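The difference between discarding and keeping the returned value can be seen in a few lines. This is just an illustration; the class and method names are made up.

```java
import java.math.BigInteger;

class ImmutableDemo {
    // The sum is computed and thrown away; i is unchanged.
    static BigInteger discarded() {
        BigInteger i = BigInteger.valueOf(2);
        i.add(i);
        return i;
    }

    // The variable is re-pointed at the new value.
    static BigInteger reassigned() {
        BigInteger i = BigInteger.valueOf(2);
        i = i.add(i);
        return i;
    }

    public static void main(String[] args) {
        System.out.println(discarded());  // prints 2
        System.out.println(reassigned()); // prints 4
    }
}
```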

You need to implement or use the Miller-Rabin algorithm:
Handbook of Applied Cryptography
Chapter 4, Algorithm 4.24
http://www.cacr.math.uwaterloo.ca/hac/about/chap4.pdf
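If using the library is allowed, BigInteger already ships probabilistic primality tests (isProbablePrime and nextProbablePrime). One possible way to combine them for this assignment, offered as a sketch rather than the intended solution, is to pad the required digits with zeros up to 100 digits and ask for the next probable prime. The class and method names are hypothetical, and the result is a probable prime, not a proven one.

```java
import java.math.BigInteger;

class PrimeWithDigits {
    // Builds digits followed by zeros (totalLength digits in all), then takes
    // the next probable prime. Primes are dense enough that only the trailing
    // digits are expected to change; the check below verifies that.
    static BigInteger find(String digits, int totalLength) {
        BigInteger start = new BigInteger(digits)
                .multiply(BigInteger.TEN.pow(totalLength - digits.length()));
        BigInteger candidate = start.nextProbablePrime();
        if (!candidate.toString().startsWith(digits)) {
            throw new IllegalStateException("leading digits changed");
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(find("273042282802155991", 100));
    }
}
```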

Java BigInteger, cut off last digit

Fairly easy, if the BigInteger number is 543 I want it to cut off the last digit so that it is 54.
Two easy ways to do this can be :
Use strings, get substring and create new biginteger with the new value.
Use BigIntegers divide method with number 10. ( 543 / 10 = 54.3 => 54 )
The thing is I will be performing this a lot of times with large integers of course.
My guess is that playing around with strings will be slower but then again I haven't used Bigintegers so much and have no idea how expensive the "divide" operation is.
Speed is essential here. What is the fastest way to implement this (memory is no problem, only speed)?
Other solutions are also welcome.

Divide by 10 is most likely going to be faster.

Dividing by 10 is much faster than using a substring operation. Using the following benchmark, I get a ratio of about 161x (the ratio is proportional to the bit count).
import java.math.BigInteger;
import java.util.Random;

long divTime = 0;
long substrTime = 0;
final int bitsCount = 1000;
final Random rng = new Random();
for (int i = 0; i < 1000; ++i) {
    long t1, t2;
    BigInteger random = new BigInteger(bitsCount, rng);
    // nanoTime, because a single operation takes far less than a millisecond
    t1 = System.nanoTime();
    random.divide(BigInteger.TEN);
    t2 = System.nanoTime();
    divTime += (t2 - t1);
    t1 = System.nanoTime();
    String str = random.toString();
    new BigInteger(str.substring(0, str.length() - 1));
    t2 = System.nanoTime();
    substrTime += (t2 - t1);
}
System.out.println("Divide: " + divTime);
System.out.println("Substr: " + substrTime);
// floating-point division, so the ratio is not truncated to an integer
System.out.println("Ratio: " + ((double) substrTime / divTime));

If you create a BigInteger statically that has the number 10, and then use that to divide by 10, that will be potentially the fastest way to do this. It beats creating a temporary new BigInteger every time.
The problem with substring is that you are essentially creating a new String every single time, and that is much slower, not to mention the slowness that is iterating through a string to get its substring.

The fastest way is dividing the number by 10 with an efficient internal division implementation. The internals of that operation are behind the scenes but certainly non-trivial since the number is stored base-2.

The fastest possible implementation would probably be to use a data type whose internal representation uses base 10, i.e. some sort of BCD. Then, division by 10 would simply mean dropping the last byte (or even just incrementing/decrementing an index if you implement it the right way).
Of course, you'd have to implement all arithmetic and other operations you need from scratch, making this a lot of work.

It's probably premature to even be asking this question. Do it the obvious way (divide by ten), then benchmark it, and optimize it if you need to. Converting to a string representation and back will be much slower.

The toString() alone is probably slower than the substring.

Various people have said that dividing by 10 will be faster than converting to a string and taking the substring. To understand why, just think about the computation involved in converting from a BigInteger to a String, and vice versa. For example:
/* simplified pseudo code for converting +ve numbers to strings */
StringBuilder sb = new StringBuilder(...);
while (number != 0) {
    digit = number % 10;
    sb.append((char)(digit + '0'));
    number = number / 10;
}
/* digits were appended least-significant first, so reverse them */
return sb.reverse().toString();
The important thing to note is that converting from a number to a string entails repeatedly dividing by 10. Indeed the number of divisions is proportional to log10(number). Going in the other direction involves log10(number) multiplications. It should be obvious that this is much more computation than a single division by 10.
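The pseudo code can be turned into runnable Java roughly as follows. This is only a model of the cost, assuming non-negative input; the real BigInteger.toString is implemented differently, and the names DecimalDigits, toDecimal and dropLastDigit are invented for the example.

```java
import java.math.BigInteger;

class DecimalDigits {
    // Naive conversion: one division by ten per decimal digit.
    // Assumes number >= 0.
    static String toDecimal(BigInteger number) {
        if (number.signum() == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (number.signum() != 0) {
            BigInteger[] qr = number.divideAndRemainder(BigInteger.TEN);
            sb.append((char) ('0' + qr[1].intValue()));
            number = qr[0];
        }
        return sb.reverse().toString(); // digits came out least-significant first
    }

    // Dropping the last digit needs just one of those divisions.
    static BigInteger dropLastDigit(BigInteger number) {
        return number.divide(BigInteger.TEN);
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(543);
        System.out.println(toDecimal(n));     // prints 543
        System.out.println(dropLastDigit(n)); // prints 54
    }
}
```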

If performance is crucial... don't use Java.
In languages which compile to machine code (for instance C or C++) the integer divide is quicker by a huge factor. String operations use (or can use) memory allocations and are therefore slow.
My bet is that in Java, too, integer division will be quicker. Otherwise the VM implementation is really weird.
