Runtime of BigInteger operations? [duplicate] - java

What is the complexity of the multiply, divide and pow methods in BigInteger currently? There is no mention of the computational complexity in the documentation (nor anywhere else).

If you look at the code for BigInteger (provided with JDK), it appears to me that
multiply(..) is O(n^2) (actually the method is multiplyToLen(..)). The code for the other methods is a bit more complex, but you can see for yourself.
Note: this is for Java 6. I assume it won't differ in Java 7.

As noted in the comments on @Bozho's answer, Java 8 and onwards use more efficient algorithms to implement multiplication and division than the naive O(N^2) algorithms in Java 7 and earlier.
Java 8 multiplication adaptively uses either the naive O(N^2) long multiplication algorithm, the Karatsuba algorithm, or the 3-way Toom-Cook algorithm, depending on the sizes of the numbers being multiplied. The latter two are (respectively) O(N^1.58) and O(N^1.46).
Java 8 division adaptively uses either Knuth's O(N^2) long division algorithm or the Burnikel-Ziegler algorithm. (According to the research paper, the latter is 2K(N) + O(NlogN) for a division of a 2N digit number by an N digit number, where K(N) is the Karatsuba multiplication time for two N-digit numbers.)
Likewise some other operations have been optimized.
There is no mention of the computational complexity in the documentation (nor anywhere else).
Some details of the complexity are mentioned in the Java 8 source code. The reason that the javadocs do not mention complexity is that it is implementation specific, both in theory and in practice. (As illustrated by the fact that the complexity of some operations is significantly different between Java 7 and 8.)

There is a new "better" BigInteger class that is not being used by the Sun JDK, out of conservatism and for lack of useful regression tests (huge data sets). The author of the better algorithms may have discussed the old BigInteger in the comments there.
Here you go http://futureboy.us/temp/BigInteger.java

Measure it. Do operations with linearly increasing operands and draw the times on a diagram.
Don't forget to warm up the JVM (several runs) to get valid benchmark results.
Whether the operations are linear O(n), quadratic O(n^2), polynomial, or exponential should then be obvious.
EDIT: While you can give algorithms theoretical bounds, they may not be that useful in practice. First of all, the complexity does not give the constant factor. Some linear or subquadratic algorithms are simply not useful because they eat so much time and so many resources that they are not adequate for the problem at hand (e.g. Coppersmith-Winograd matrix multiplication).
Then your computation may have all kinds of kludges that you can only detect by experiment. There are preparatory algorithms which do nothing to solve the problem but speed up the real solver (matrix conditioning). There are suboptimal implementations. With longer lengths, your speed may drop dramatically (cache misses, memory moving etc.). So for practical purposes, I advise doing experiments.
The best thing is to double each time the length of the input and compare the times.
And yes, you can tell apart, say, n^1.5 and n^1.8 complexity this way. Quadruple the input length: an n^1.5 algorithm takes 4^1.5 = 8 times longer, only half of the 4^2 = 16 you would see for n^2. Separating n^1.8 from n^2 needs a bigger step: multiply the length by 32 and the gap is a factor of two, since 32^2 / 32^1.8 = 32^0.2 = 2.
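For BigInteger.multiply specifically, a doubling experiment could look roughly like the sketch below. The class name, bit lengths and repetition count are arbitrary choices, and a hand-rolled timing loop like this is only a rough indication, not a rigorous benchmark:

    import java.math.BigInteger;
    import java.util.Random;

    public class MultiplyScaling {
        static final int REPS = 10;

        public static void main(String[] args) {
            Random rnd = new Random(42);
            long sink = 0;       // consume results so the JIT can't discard the work
            long previous = 0;
            for (int bits = 1 << 14; bits <= 1 << 20; bits <<= 1) {
                BigInteger a = new BigInteger(bits, rnd);
                BigInteger b = new BigInteger(bits, rnd);
                for (int i = 0; i < REPS; i++) sink += a.multiply(b).bitLength();  // warm-up
                long start = System.nanoTime();
                for (int i = 0; i < REPS; i++) sink += a.multiply(b).bitLength();
                long elapsed = System.nanoTime() - start;
                // Doubling the operand length multiplies the time by roughly 2^e,
                // where e is the empirical exponent of the growth rate.
                String exponent = previous == 0 ? "-" : String.format("%.2f",
                        Math.log((double) elapsed / previous) / Math.log(2));
                System.out.printf("%8d bits: %12d ns  (exponent vs previous: %s)%n",
                        bits, elapsed, exponent);
                previous = elapsed;
            }
            System.out.println("(checksum, ignore: " + sink + ")");
        }
    }

On a recent JDK the reported exponent should settle noticeably below 2 once the operands are large enough for the subquadratic algorithms to kick in.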

Related

Which algorithm does Java use for multiplication?

The list of possible algorithms for multiplication is quite long:
Schoolbook long multiplication
Karatsuba algorithm
3-way Toom–Cook multiplication
k-way Toom–Cook multiplication
Mixed-level Toom–Cook
Schönhage–Strassen algorithm
Fürer's algorithm
Which one is used by Java by default and why? When does it switch to a "better performance" algorithm?
Well ... the * operator will use whatever the hardware provides. Java has no say in it.
But if you are talking about BigInteger.multiply(BigInteger), the answer depends on the Java version. For Java 11 it uses:
naive "long multiplication" for small numbers,
Karatsuba algorithm for medium sized number, and
3-way Toom–Cook multiplication for large numbers.
The thresholds are Karatsuba for numbers represented by 80 to 239 int values, and 3-way Toom-Cook for >= 240 int values. Roughly speaking, the smaller of the numbers being multiplied decides whether plain long multiplication is used at all, and the larger one decides between Karatsuba and Toom-Cook.
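In code terms, the dispatch can be mirrored roughly like this. This is a sketch only: the threshold constants are copied from the OpenJDK source and are implementation details that may change between releases, and the class and method names here are made up for illustration:

    import java.math.BigInteger;
    import java.util.Random;

    public class MultiplyRegime {
        // Values of KARATSUBA_THRESHOLD and TOOM_COOK_THRESHOLD in OpenJDK's
        // java.math.BigInteger, counted in 32-bit ints of the magnitude.
        static final int KARATSUBA_THRESHOLD = 80;
        static final int TOOM_COOK_THRESHOLD = 240;

        // Mirrors (in simplified form) the selection logic in BigInteger.multiply.
        static String regime(BigInteger a, BigInteger b) {
            int xlen = (a.bitLength() + 31) / 32;
            int ylen = (b.bitLength() + 31) / 32;
            if (xlen < KARATSUBA_THRESHOLD || ylen < KARATSUBA_THRESHOLD)
                return "schoolbook long multiplication, O(n^2)";
            if (xlen < TOOM_COOK_THRESHOLD && ylen < TOOM_COOK_THRESHOLD)
                return "Karatsuba, ~O(n^1.585)";
            return "3-way Toom-Cook, ~O(n^1.465)";
        }

        public static void main(String[] args) {
            Random rnd = new Random();
            for (int bits : new int[] {1_000, 3_000, 10_000}) {
                BigInteger n = new BigInteger(bits, rnd);
                System.out.println(bits + " bits x " + bits + " bits -> " + regime(n, n));
            }
        }
    }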
Which one is used by Java by default and why?
Which ones? See above.
Why? Comments in the code imply that the thresholds were chosen empirically; i.e. someone did some systematic testing to determine which threshold values gave the best performance1.
You can find more details by reading the source code2.
1 - The current BigInteger implementation hasn't changed significantly since 2013, so it is possible that it doesn't incorporate more recent research results.
2 - Note that this link is to the latest version on Github.

Big-O notation of a function in Java by experiment

I need to write a program that determines the Big-O notation of an algorithm in Java.
I don't have access to the algorithms code, so it must be based on experimentation and execution times. I don't know where to start.
Can someone help me?
Edit: The only thing I know about the algorithm is that it takes an integer value and doesn't return anything.
Firstly, you need to be aware that what such a program does is to provide an evidence-based guess as to which complexity class the algorithm belongs to. It can give the wrong answer. (Indeed, in complicated cases where the complexity class is unusual, wrong answers are increasingly likely.)
In short, this is NOT complexity analysis.
The general approach would be:
Run the algorithm multiple times with values of N across the range, measuring the execution times. Repeat multiple times for each N, to ensure that you are getting consistent measurements.
Try to fit the experimental results to different kinds of curves; i.e. linear, quadratic, logarithmic. Note that it is the fit for large values of N that matters. So when you check for "goodness of fit", use a measure that gives increasing weight to the larger data points.
This is intended as a starting point. For example, I'm expecting that you will do your own research on issues such as:
how to get reliable execution-time measurements (for Java),
how to do curve fitting in a mathematically sound way, and
dealing with the case where the execution times get too long to measure experimentally for large N.
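As a very rough starting point, here is a sketch of the doubling-and-timing idea; algorithmUnderTest is a placeholder for whatever method you are actually given (the deliberately quadratic dummy body is only there so the example runs on its own):

    import java.util.function.IntConsumer;

    public class ComplexityGuess {
        static volatile long sink;   // prevents the JIT from discarding the dummy work

        public static void main(String[] args) {
            // Placeholder: replace this with the algorithm you need to analyse.
            IntConsumer algorithmUnderTest = n -> {
                long acc = 0;
                for (long i = 0; i < (long) n * n; i++) acc += i;   // deliberately O(n^2)
                sink = acc;
            };

            long previous = 0;
            for (int n = 1_000; n <= 32_000; n *= 2) {
                algorithmUnderTest.accept(n);          // warm-up run for the JIT
                long start = System.nanoTime();
                algorithmUnderTest.accept(n);
                long elapsed = System.nanoTime() - start;
                // After doubling n, the time ratio is roughly 2^e for an n^e algorithm.
                String slope = previous == 0 ? "" : String.format("  (slope suggests ~n^%.2f)",
                        Math.log((double) elapsed / previous) / Math.log(2));
                System.out.printf("n=%6d: %12d ns%s%n", n, elapsed, slope);
                previous = elapsed;
            }
        }
    }

A single point-to-point slope is noisy; in practice you would repeat each measurement, take medians, and fit against several candidate curves (linear, n log n, quadratic, ...) as described above.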
You could do some experiments and graph the amount of input vs the time spent executing the function. Then you could compare it to the well known curves associated with Big-O or try to estimate the equation.
Since you don't have access to the algorithm's source code, the only thing you can do is to measure how long the algorithm takes for inputs of different size, and then try to extrapolate a function from that. Since you are doing experiments, you now enter the field of statistics, so maybe you can use ideas from that area, such as regression analysis.

Big O for a finite, fixed size set of possible values

This question triggered some confusion and many comments about whether the algorithms proposed in the various answers are O(1) or O(n).
Let's use a simple example to illustrate the two points of view:
we want to find a long x such that a * x + b = 0, where a and b are known, non-null longs.
An obvious O(1) algo is x = - b / a
A much slower algo would consist in testing every possible long value, which would be about 2^63 times slower on average.
Is the second algorithm O(1) or O(n)?
The arguments presented in the linked questions are:
it is O(n) because in the worst case you need to loop over all possible long values
it is O(1) because its complexity is of the form c x O(1), where c = 2^64 is a constant.
Although I understand the argument to say that it is O(1), it feels counter-intuitive.
ps: I added java as the original question is in Java, but this question is language-agnostic.
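For reference, a minimal Java sketch of the two approaches (the brute-force version is purely illustrative, since actually iterating over all 2^64 long values is not practical, and the method names are made up):

    public class SolveLinear {
        // Obvious O(1) solution: apply the formula directly.
        static long solveDirect(long a, long b) {
            return -b / a;
        }

        // Brute force: try every possible long value until one fits.
        // "O(2^64)" in one view, "O(1) with an enormous constant" in the other.
        static long solveBruteForce(long a, long b) {
            for (long x = Long.MIN_VALUE; ; x++) {
                if (a * x + b == 0) return x;
                if (x == Long.MAX_VALUE) break;   // scanned the whole range
            }
            throw new IllegalArgumentException("no integral solution");
        }

        public static void main(String[] args) {
            System.out.println(solveDirect(3, -12));       // prints 4
            // solveBruteForce(3, -12) would eventually return 4 as well.
        }
    }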
The complexity is only relevant if there is a variable N. So, the question makes no sense as is. If the question was:
A much slower algo would consist in testing every possible value in a range of N values, which would be about N times slower on average.
Is the second algorithm O(1) or O(N)?
Then the answer would be: this algorithm is O(N).
Big O describes how an algorithm's performance will scale as the input size n scales. In other words as you run the algorithm across more input data.
In this case the input data is a fixed size so both algorithms are O(1) albeit with different constant factors.
If you took "n" to mean the number of bits in the numbers (i.e. you removed the restriction that it's a 64-bit long), then you could analyze for a given bit size n how do the algorithms scale.
In this scenario, the first would still be O(1) (see Qnan's comment), but the second would now be O(2^n).
I highly recommend watching the early lectures from MIT's "Introduction to Algorithms" course. They are a great explanation of Big O (and Big Omega/Theta) although do assume a good grasp of maths.
Checking every possible input is O(2^N) on the number of bits in the solution. When you make the number of bits constant, then both algorithms are O(1), you know how many solutions you need to check.
Fact: Every algorithm that you actually run on your computer is O(1), because the universe has finite computational power (there are finitely many atoms and finitely many seconds have passed since the Big Bang).
This is true, but not a very useful way to think about things. When we use big-O in practice, we generally assume that the constants involved are small relative to the asymptotic terms, because otherwise giving only the asymptotic term doesn't tell you much about how the algorithm performs. This works well in practice because the constants are usually things like "do I use an array or a hash map", which is at most about a 30x difference, while the inputs are 10^6 or 10^9, so the difference between a quadratic and a linear algorithm matters more than constant factors. Discussions of big-O that don't respect this convention (like algorithm #2) are pointless.
Whatever the values of a and b are, the worst case is still to check 2^64 or 2^32 or 2^somevalue values. This algorithm's complexity is O(2^k), where k is the number of bits used to represent a long value, or O(1) if we treat only the values of a and b as the input (with k fixed).

Binary GCD Algorithm vs. Euclid's Algorithm on modern computers

http://en.wikipedia.org/wiki/Binary_GCD_algorithm
This Wikipedia entry has a very dissatisfying implication: the Binary GCD algorithm was at one time as much as 60% more efficient than the standard Euclid Algorithm, but as late as 1998 Knuth concluded that there was only a 15% gain in efficiency on his contemporary computers.
Well, another 15 years have passed... how do these two algorithms stack up today, given advances in hardware?
Does the Binary GCD continue to outperform the Euclidean Algorithm in low-level languages but languish behind due to its complexity in higher level languages like Java? Or is the difference moot in modern computing?
Why do I care you might ask? I just so happen to have to process like 100 billion of these today :) Here's a toast to living in an era of computing (poor Euclid).
The answer is of course "it depends". It depends on hardware, compiler, specific implementation, whatever I forgot. On machines with slow division, binary GCD tends to outperform the Euclidean algorithm. I benchmarked it a couple of years ago on a Pentium 4 in C, Java and a few other languages; overall in that benchmark, binary GCD with a 256-element lookup table beat the Euclidean algorithm by a factor of between 1.6 and nearly 3. Euclidean came closer when, instead of immediately dividing, a few rounds of subtraction were performed first. I don't remember the figures, but binary was still considerably faster.
If the machine has fast division, things may be different, since the Euclidean algorithm needs fewer operations. If the difference of cost between division and subtraction/shifts is small enough, binary will be slower. Which one is better in your circumstances, you have to find out by benchmarking yourself.
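For reference, minimal versions of the two algorithms for non-negative longs (not the table-driven variant mentioned in the benchmark above):

    public class Gcd {
        // Euclidean algorithm: repeated remainder (one division per step).
        static long euclidGcd(long a, long b) {
            while (b != 0) {
                long t = a % b;
                a = b;
                b = t;
            }
            return a;
        }

        // Binary (Stein's) GCD: only shifts, subtractions and comparisons.
        static long binaryGcd(long a, long b) {
            if (a == 0) return b;
            if (b == 0) return a;
            int shift = Long.numberOfTrailingZeros(a | b); // common factors of two
            a >>= Long.numberOfTrailingZeros(a);           // make a odd
            do {
                b >>= Long.numberOfTrailingZeros(b);       // make b odd
                if (a > b) { long t = a; a = b; b = t; }   // ensure a <= b
                b -= a;                                    // b becomes even (or zero)
            } while (b != 0);
            return a << shift;
        }

        public static void main(String[] args) {
            System.out.println(euclidGcd(252, 105));  // 21
            System.out.println(binaryGcd(252, 105));  // 21
        }
    }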

What sort does Java Collections.sort(nodes) use?

I think it is MergeSort, which is O(n log n).
However, the following output disagrees:
-1,0000000099000391,0000000099000427
1,0000000099000427,0000000099000346
5,0000000099000391,0000000099000346
1,0000000099000427,0000000099000345
5,0000000099000391,0000000099000345
1,0000000099000346,0000000099000345
I am sorting a nodelist of 4 nodes by sequence number, and the sort is doing 6 comparisons.
I am puzzled because 6 > (4 log(4)). Can someone explain this to me?
P.S. It is mergesort, but I still don't understand my results.
Thanks for the answers everyone. Thank you Tom for correcting my math.
O(n log n) doesn't mean that the number of comparisons will be equal to or less than n log n, just that the time taken will scale proportionally to n log n. Try doing tests with 8 nodes, or 16 nodes, or 32 nodes, and checking out the timing.
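An easy way to run that test for comparison counts rather than wall-clock time is a comparator that counts its own invocations (a quick sketch; the list sizes are arbitrary):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;
    import java.util.concurrent.atomic.AtomicLong;

    public class ComparisonCount {
        public static void main(String[] args) {
            Random rnd = new Random();
            for (int n : new int[] {4, 8, 16, 32, 1024}) {
                List<Integer> nodes = new ArrayList<>();
                for (int i = 0; i < n; i++) nodes.add(rnd.nextInt());

                AtomicLong comparisons = new AtomicLong();
                Collections.sort(nodes, (a, b) -> {
                    comparisons.incrementAndGet();    // count every comparison made
                    return a.compareTo(b);
                });

                System.out.printf("n=%4d: %6d comparisons (n log2 n = %.0f)%n",
                        n, comparisons.get(), n * (Math.log(n) / Math.log(2)));
            }
        }
    }

For a 4-element list you should see a handful of comparisons, and for the larger sizes the count should track n log2 n fairly closely.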
You sorted four nodes, so you didn't get merge sort; sort switched to insertion sort.
In Java, the Arrays.sort() methods use merge sort or a tuned quicksort depending on the datatypes and for implementation efficiency switch to insertion sort when fewer than seven array elements are being sorted. (Wikipedia, emphasis added)
Arrays.sort is used indirectly by the Collections classes.
A recently accepted bug report indicates that the Sun implementation of Java will use Python's timsort in the future: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6804124
(The timsort monograph, linked above, is well worth reading.)
An algorithm A(n) that processes an amount of data n is in O(f(n)), for some function f, if there exist two strictly positive constants C_inf and C_sup such that:
C_inf . f(n) < ExpectedValue(OperationCount(A(n))) < C_sup . f(n)
Two things to note:
The actual constants C could be anything, and they do depend on the relative costs of operations (depending on the language, the VM, the architecture, or your actual definition of an operation). On some platforms, for instance, + and * have the same cost; on some others the latter is an order of magnitude slower.
The quantity ascribed as "in O(f(n))" is an expected operation count, based on some probably arbitrary model of the data you are dealing with. For instance, if your data is almost completely sorted, a merge-sort algorithm is going to be mostly O(n), not O(n . Log(n)).
I've written some stuff you may be interested in about the Java sort algorithm and taken some performance measurements of Collections.sort(). The algorithm at present is a mergesort with an insertion sort once you get down to a certain size of sublists (N.B. this algorithm is very probably going to change in Java 7).
You should really take the Big O notation as an indication of how the algorithm will scale overall; for a particular sort, the precise time will deviate from the time predicted by this calculation (as you'll see on my graph, the two sort algorithms that are combined each have different performance characteristics, and so the overall time for a sort is a bit more complex).
That said, as a rough guide, for every time you double the number of elements, if you multiply the expected time by 2.2, you won't be far out. (It doesn't make much sense really to do this for very small lists of a few elements, though.)
