I want to generate a 160-bit prime number in Java. I know that I'll have to loop through the 160-bit numbers and, for each number n, check whether it is divisible by any prime less than sqrt(n), or run a primality test such as Miller-Rabin. My questions are:
Is there any specific library which does this?
Is there any other (better) way to do this?
BigInteger.probablePrime(160, new Random()) generates a BigInteger that is almost certainly prime -- the probability that it is not a prime is less than the probability that you will get struck by lightning. In general, BigInteger already has heavily tested and optimized primality testing operations built in.
For what it's worth, the reason this won't take forever is that by the prime number theorem, a randomly chosen n-bit number has probability proportional to 1/n of being prime, so on average you only need to try O(n) different random n-bit numbers before you'll find one that's prime.
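A minimal sketch of the call described above. The class and method names are made up for illustration, and SecureRandom is swapped in for Random on the assumption that the prime is meant for cryptographic use:

```java
import java.math.BigInteger;
import java.security.SecureRandom;

class PrimeDemo {
    // Returns a probable prime with exactly `bits` bits.
    static BigInteger randomPrime(int bits) {
        return BigInteger.probablePrime(bits, new SecureRandom());
    }

    public static void main(String[] args) {
        BigInteger p = randomPrime(160);
        System.out.println(p + " (" + p.bitLength() + " bits)");
    }
}
```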
Related
So I've implemented my own little RSA algorithm and in the course of that I wrote a function to find large prime numbers.
First I wrote a function prime? that tests for primality, and then I wrote two versions of a prime-searching function. In the first version I just test random BigIntegers until I hit a prime. In the second version I sample a random BigInteger and then increment it until I find a prime.
(defn resampling []
  (let [rnd (Random.)]
    (->> (repeatedly #(BigInteger. 512 rnd))
         (take-while (comp not prime?))
         (count))))
(defn incrementing []
  (->> (BigInteger. 512 (Random.))
       (iterate inc)
       (take-while (comp not prime?))
       (count)))
(let [n 100]
  {:resampling   (/ (reduce + (repeatedly n resampling)) n)
   :incrementing (/ (reduce + (repeatedly n incrementing)) n)})
Running this code yielded the two averages of 332.41 for the resampling function and 310.74 for the incrementing function.
Now the first number makes complete sense to me. The prime number theorem states that the n'th prime is about n*ln(n) in size (where ln is the natural logarithm). So the distance between adjacent primes is approximately n*ln(n) - (n-1)*ln(n-1) ≈ (n - (n - 1))*ln(n) = ln(n) (For large values of n ln(n) ≈ ln(n - 1)). Since I'm sampling 512-bit integers I'd expect the distance between primes to be in the vicinity of ln(2^512) = 354.89. Therefore random sampling should take about 354.89 attempts on average before hitting a prime, which comes out quite nicely.
The puzzle for me is why the incrementing function is taking about just as many steps. If I imagine throwing a dart at a grid where primes are spaced 355 units apart, it should take only about half that many steps on average to walk to the next higher prime, since on average I'd be hitting the center between two primes.
(The code for prime? is a little lengthy. You can take a look at it here.)
You assume that primes are evenly spaced, but that does not seem to be the case.
Consider the following hypothetical scenario: if primes always came in pairs, for example 10...01 and 10...03, then the next pair would be about 2*ln(n) away. For the sampling algorithm this distribution makes no difference, but for the incrementing algorithm the probability of starting inside such a pair is almost 0, so on average it would have to walk half of the big distance, that is, ln(n).
In a nutshell: to estimate the behavior of the incremental algorithm right, it is not enough to know the average distance between the primes.
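To see the effect on real primes, here is a rough sketch that uses small primes rather than 512-bit ones. The limit of 1,000,000 and all names are arbitrary choices for illustration. It computes exactly, over every possible starting integer, how many composites each strategy tests on average. If all gaps had the same length, the incrementing figure would be about half the mean gap; because a random start is more likely to land in a long gap, it comes out larger.

```java
class GapWalk {
    // Returns {resampling, incrementing, halfMeanGap} for the primes up to limit.
    static double[] averages(int limit) {
        // Sieve of Eratosthenes: composite[i] is true when i has a proper divisor.
        boolean[] composite = new boolean[limit + 1];
        for (int i = 2; (long) i * i <= limit; i++) {
            if (composite[i]) continue;
            for (int j = i * i; j <= limit; j += i) composite[j] = true;
        }
        long gapSum = 0;   // total length of all gaps between consecutive primes
        long walkSum = 0;  // composites tested, summed over every possible start
        long gapCount = 0;
        int prev = 2;
        for (int p = 3; p <= limit; p++) {
            if (composite[p]) continue;
            long g = p - prev;
            gapSum += g;
            // Starting at prev + k (k = 1..g) tests g - k composites before reaching p.
            walkSum += g * (g - 1) / 2;
            gapCount++;
            prev = p;
        }
        double meanGap = (double) gapSum / gapCount;
        // A random draw is prime with probability 1 / meanGap,
        // so resampling tests meanGap - 1 composites on average.
        double resampling = meanGap - 1;
        double incrementing = (double) walkSum / gapSum;
        return new double[] {resampling, incrementing, meanGap / 2};
    }

    public static void main(String[] args) {
        double[] r = averages(1_000_000);
        System.out.printf("resampling:    %.2f%n", r[0]);
        System.out.printf("incrementing:  %.2f%n", r[1]);
        System.out.printf("half mean gap: %.2f%n", r[2]);
    }
}
```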
Take the following problem: "How many integers in a given range have both a prime sum of digits and a prime sum of squared digits?"
I was browsing Code Review, where I found this interesting question and tried to solve it.
One can check for primes in the ordinary fashion, i.e. by using a for loop from 2 to i and checking for divisibility.
Here is the interesting part. BlueRaja - Danny Pflughoeft suggests a trick: "Since you only need to sieve to the square root of the number you're testing for primality, you only need to run your sieve from 3 to sqrt(⌈log10(B)⌉*81)".
I have a question regarding the implementation of the Sieve of Eratosthenes. How large should the boolean array that holds the numbers to be sieved be? Can somebody provide some code or a hint?
Here's an example of the implementation of the Sieve of Eratosthenes using Java: link.
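In case the link is unavailable, here is a minimal sketch of such a sieve applied to this problem. It is not the linked implementation, and the names are made up. The boolean array needs one slot for every number up to the largest value you will test. Here that is the largest possible sum of squared digits, 81 per digit, so the array stays tiny even for a large upper bound. The brute-force loop over the range is only there to keep the example short.

```java
class DigitPrimes {
    // Sieve of Eratosthenes: isPrime[i] is true exactly when i is prime, for 0 <= i <= limit.
    static boolean[] sieve(int limit) {
        boolean[] isPrime = new boolean[limit + 1];
        for (int i = 2; i <= limit; i++) isPrime[i] = true;
        // Only primes up to sqrt(limit) need to cross out their multiples.
        for (int i = 2; i * i <= limit; i++) {
            if (!isPrime[i]) continue;
            for (int j = i * i; j <= limit; j += i) isPrime[j] = false;
        }
        return isPrime;
    }

    // Counts n in [a, b] whose digit sum and sum of squared digits are both prime.
    // Assumes 0 <= a <= b.
    static int count(long a, long b) {
        int digits = Long.toString(b).length();
        boolean[] isPrime = sieve(digits * 81); // 9 * 9 = 81 is the most one digit can add
        int count = 0;
        for (long n = a; n <= b; n++) {
            int sum = 0, squares = 0;
            for (long m = n; m > 0; m /= 10) {
                int d = (int) (m % 10);
                sum += d;
                squares += d * d;
            }
            if (isPrime[sum] && isPrime[squares]) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count(1, 20)); // prints 4 (the matches are 11, 12, 14 and 16)
    }
}
```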
For the second part of your questions see this link:
"The maximum sum of squares-of-digits of an n-digit number is n*9*9 = n*81. The number of digits in a number B is ⌈log10(B)⌉. Since you only need to sieve to the square root of the number you're testing for primality, you only need to run your sieve from 3 to sqrt(⌈log10(B)⌉*81). Even for B = 1 billion, this means the max you need to sieve to is 28."
I'm running into problems with my Sieve of Eratosthenes. I wanted to write a Sieve that didn't require an array of all numbers up to the largest prime you want, instead just keeping track of each prime multiple as the Sieve reaches it. That means you don't have to do all the work up front, but can just determine the next prime when you need it. It would also be easy to add interface features like "find K primes starting at N". Here is the pseudocode:
Begin with current number set to 2
Loop:
    If prime queue is not empty:
        Peek at the top prime in the queue
        If current > top, we can move top to the next multiple
            Remove the top prime from the prime queue
            Increment top to its next multiple
            Re-add it to the queue
        If current == top, current is not a prime
            Increment current number to next integer
        If current < top, we've found a prime
            Break
Push current number onto prime queue
Increment current number to next integer
Return the new prime
So here's the problem: I correctly calculate the first 31 primes (up to 127), but after that it thinks every number is prime. I've put my code on Ideone -- I'm hoping it's some Java collections behavior, or a trivial bug, rather than the algorithm itself. I can't think of a reason the algorithm should break after a certain number of primes. I've confirmed manually that after 127, if the heap is properly ordered, my algorithm should recognize 128 as not a prime, but that's not what the code shows me.
Any suggestions?
http://ideone.com/E07Te
(I will, of course, increment by 2 (to skip all non-prime even numbers) once I get the basic algorithm working. I'll probably also make the Sieve an iterable.)
Your problem is
top.multiple == current
in connection with
Integer current = 2;
Integer multiple;
There is a cache of Integers with small absolute value, -128 to 127, if I recall correctly, so the comparison using == compares identical instances for values smaller than 128. But from 128 on, you get a new boxed Integer for current, and that is a different object than the one referenced by top.multiple.
Compare using equals or declare int current; to solve it.
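A small illustration of the difference (the class and method names are made up). Boxed values from -128 to 127 are required to be cached, so == happens to work there. Above that range the result depends on the JVM, and is false with default settings:

```java
class BoxedCompare {
    // Value comparison that is safe for boxed Integers of any size.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer a = 127, b = 127;
        Integer c = 128, d = 128;
        System.out.println(a == b);          // true: both refer to the cached Integer for 127
        System.out.println(c == d);          // false on a default JVM: two distinct objects
        System.out.println(sameValue(c, d)); // true: compares the values
        System.out.println(c.intValue() == d.intValue()); // true: primitive comparison
    }
}
```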
Also, improve your algorithm: start tracking the multiples of each prime at the prime's square.
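Putting both suggestions together, a sketch of the incremental sieve might look like this. It is not the code from the Ideone link; the class name and the {nextMultiple, prime} array layout are assumptions made for the example. Using primitive long values avoids the boxed comparison problem entirely.

```java
import java.util.PriorityQueue;

class IncrementalSieve {
    // Each entry is {nextMultiple, prime}; the queue is ordered by nextMultiple.
    private final PriorityQueue<long[]> queue =
            new PriorityQueue<long[]>((x, y) -> Long.compare(x[0], y[0]));
    private long current = 2;

    // Returns the next prime, doing only as much sieving as needed.
    long next() {
        while (true) {
            long candidate = current++;
            boolean composite = false;
            // Advance every tracked prime whose next multiple has been reached.
            while (!queue.isEmpty() && queue.peek()[0] <= candidate) {
                long[] entry = queue.poll();
                if (entry[0] == candidate) composite = true;
                entry[0] += entry[1];
                queue.add(entry);
            }
            if (!composite) {
                // Smaller multiples of candidate have a smaller prime factor,
                // so tracking can start at candidate squared.
                queue.add(new long[] {candidate * candidate, candidate});
                return candidate;
            }
        }
    }

    public static void main(String[] args) {
        IncrementalSieve sieve = new IncrementalSieve();
        for (int i = 0; i < 10; i++) System.out.print(sieve.next() + " ");
        // prints 2 3 5 7 11 13 17 19 23 29
    }
}
```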
You're not checking your whole list:
Sieve heap after 31:
[[127:127], [11:132], [2:128]
You get to 132, which is > 128, and thus hit the break; before you check for 2*64.
I'm aware of the function BigInteger.probablePrime(int bitLength, Random rnd), which outputs a probable prime of any bit length. I want a REAL prime number in Java. Is there any FOSS library to do so with acceptable performance? Thanks in advance!
EDIT:
I'm looking at 1024 & 2048 bit primes.
1. Use probablePrime to generate a candidate.
2. Use a fast deterministic test such as the AKS primality test to check whether the candidate is indeed prime.
Edit: Or, if you don't trust isProbablePrime to use a high enough certainty, use the BigInteger constructor BigInteger(int bitLength, int certainty, Random rnd), which lets you tune the certainty threshold:
certainty - a measure of the uncertainty that the caller is willing to tolerate. The probability that the new BigInteger represents a prime number will exceed (1 - 1/2^certainty). The execution time of this constructor is proportional to the value of this parameter.
Probabilistic tests used for cryptographic purposes are guaranteed to bound the probability of false positives -- it's not like there's some gotcha numbers that exist that will sneak through, it's just a matter of how low you want the probability to be. If you don't trust the Java BigInteger class to use these (it would be nice if they documented what test was used), use the Rabin-Miller test.
There are some methods to generate very large primes with acceptable performance, but not with sufficient density for most purposes other than getting into the Guinness Book of Records.
Look at it like this: the likelihood that a number returned by probablePrime() is not prime is lower than the likelihood of you and everyone you know getting hit by lightning. Twice. On a single day.
Just don't worry about it.
You could also use the constructor of BigInteger to generate a prime with a certainty of your choosing:
BigInteger(int bitLength, int certainty, Random rnd)
The time to execute is proportional to the certainty, but on my Core i7 it isn't a problem.
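A minimal sketch of that constructor in use (the class and method names are made up, and the certainty of 128 is an arbitrary choice). Note that this is still a probabilistic guarantee, not a proof of primality:

```java
import java.math.BigInteger;
import java.security.SecureRandom;

class CertainPrime {
    // Probable prime with exactly `bits` bits; per the documentation, the chance
    // that the result is composite is below 1/2^certainty.
    static BigInteger generate(int bits, int certainty) {
        return new BigInteger(bits, certainty, new SecureRandom());
    }

    public static void main(String[] args) {
        BigInteger p = generate(1024, 128);
        System.out.println(p.bitLength() + " bits");
        System.out.println(p.isProbablePrime(128)); // re-check with the same certainty
    }
}
```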
Make a method and wrap it.
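BigInteger definitePrime(int bits, Random rnd) {
    // Start from a known composite so the loop body runs at least once.
    BigInteger prime = new BigInteger("4");
    // isPrime is a placeholder for a deterministic primality test that you supply.
    while (!isPrime(prime)) prime = BigInteger.probablePrime(bits, rnd);
    return prime;
}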
Random rnd = new SecureRandom();
System.out.println(BigInteger.probablePrime(bitLength, rnd));
The probability that a BigInteger returned by method probablePrime() is composite does not exceed 2^-100.
I understand that multiplication by a large number before xoring should help with badly distributed operands but why should the multiplier be a prime?
Related:
Why should hash functions use a prime number modulus?
Close, but not quite a Duplicate:
Why does Java’s hashCode() in String use 31 as a multiplier?
There's a good article on the Computing Life blog that discusses this topic in detail. It was originally posted as a response to the Java hashCode() question I linked to in the question. According to the article:
Primes are unique numbers. They are unique in that, the product of a prime with any other number has the best chance of being unique (not as unique as the prime itself of-course) due to the fact that a prime is used to compose it. This property is used in hashing functions.
Given a string “Samuel”, you can generate a unique hash by multiply each of the constituent digits or letters with a prime number and adding them up. This is why primes are used.
However using primes is an old technique. The key here to understand that as long as you can generate a sufficiently unique key you can move to other hashing techniques too. Go here for more on this topic about hashes without primes.
Multiplying by a non-prime has a cyclic repeating pattern much smaller than the number. If you use a prime then the cyclic repeating pattern is guaranteed to be at least as large as the prime number.
I'm not sure exactly which algorithm you're talking about, but typically the constants in such algorithms need to be relatively prime. Otherwise, you get cycles and not all the possible values show up in the result.
The number probably doesn't need to be prime in your case, only relatively prime to some other numbers, but making it prime guarantees that. It also covers the cases where the other magic numbers change.
For example, if you are talking about taking the last bits of some number, then the multiplier needs to not be a multiple of 2. So, 9 would work even though it's not prime.
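To make the last-bits example concrete, here is a small sketch (names made up) that counts how many different values the low 8 bits of x * multiplier can take. An odd multiplier such as 9 or 31 keeps all 256 values, while an even one loses some:

```java
import java.util.HashSet;
import java.util.Set;

class LowBits {
    // Number of distinct values of the low 8 bits of x * multiplier, for x = 0..255.
    static int distinctLowBytes(int multiplier) {
        Set<Integer> seen = new HashSet<>();
        for (int x = 0; x < 256; x++) {
            seen.add((x * multiplier) & 0xFF);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctLowBytes(9));  // 256: odd, not prime, still fine
        System.out.println(distinctLowBytes(31)); // 256: odd and prime
        System.out.println(distinctLowBytes(6));  // 128: shares a factor of 2 with 256
        System.out.println(distinctLowBytes(4));  // 64:  shares a factor of 4 with 256
    }
}
```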
Consider the simplest multiplication: x2.
It is equivalent to a left-bitshift. In other words, it really didn't "randomize" the data, it just shifted it over.
Same with x4, or any power of two. The original data is intact, just shifted.
Now, multiplication by other numbers (non-powers of two) are not as obvious, but still have the same problem, more or less. The original data is intact, or trivially transformed. (eg. x5 is the same as left-bitshift two places, then add on the original data).
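The shift equivalences mentioned above can be checked directly. This is only a sketch with made-up names; the identities hold for every int, including values that overflow:

```java
class ShiftDemo {
    static int timesTwo(int x)  { return x << 1; }       // x * 2
    static int timesFour(int x) { return x << 2; }       // x * 4
    static int timesFive(int x) { return (x << 2) + x; } // x * 4 + x

    public static void main(String[] args) {
        int x = 12345;
        System.out.println(timesTwo(x) == x * 2);  // true
        System.out.println(timesFour(x) == x * 4); // true
        System.out.println(timesFive(x) == x * 5); // true
    }
}
```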
The point of GetHashCode is to essentially distribute the data as randomly as possible. Multiplying by a prime number guarantees that the answer won't be a simpler transform like bit-shifting or adding a number to itself.