We have a test exercise where you need to find out whether a given number N is a perfect square (i.e. the square of another integer), with the smallest possible time complexity.
I wrote:
public static boolean what2(int n) {
    double newN = (double) n;
    double x = Math.sqrt(newN);
    int y = (int) x;
    // n is a perfect square exactly when the truncated root squares back to n
    return y * y == n;
}
I looked online and specifically on SO to try and find the complexity of sqrt but couldn't find it. This SO post is for C# and says it's O(1), and this Java post says it's O(1) but could potentially iterate over all doubles.
I'm trying to understand the worst time complexity of this method. All other operations are O(1) so this is the only factor.
Would appreciate any feedback!
Using the floating-point conversion is OK because Java's int type is 32 bits and Java's double type is the IEEE 754 64-bit format, which can represent all values of 32-bit integers exactly.
If you were to implement your function for long, you would need to be more careful because many large long values are not represented exactly as doubles, so taking the square root and converting it to an integer type might not yield the actual square root.
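For illustration, a rough sketch of that extra care for long might look like the following (the method name and the ±1 correction window are my own choices, not from the original post):

// Sketch: perfect-square test for a long. Math.sqrt of a large long can be
// slightly off, so we check a small window of candidates around the estimate.
// (For candidates near sqrt(Long.MAX_VALUE), c * c overflows to a negative
// value and simply fails the comparison, so no false positives arise.)
public static boolean isPerfectSquare(long n) {
    if (n < 0) {
        return false;
    }
    long r = (long) Math.sqrt((double) n);   // approximate root, possibly off by one
    for (long c = Math.max(0, r - 1); c <= r + 1; c++) {
        if (c * c == n) {
            return true;
        }
    }
    return false;
}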
All operations in your implementation execute in constant time, so the complexity of your solution is indeed O(1).
If I understood the question correctly, the Java call can be compiled by the JIT to the native fsqrt instruction (though I don't know whether this is actually the case), which, according to this table, takes a bounded number of processor cycles, which means the complexity would be O(1).
Java's Math.sqrt actually delegates to StrictMath.sqrt; one of its source implementations can be found here. Looking at the sqrt function, the complexity appears to be constant time; see the while (r != 0) loop inside.
In the program I am writing, I have used the following method to check whether a number is a perfect square or not.
// Checks whether x is a perfect square
public static boolean issqr(BigInteger x) {
    BigInteger a = x.sqrt();   // integral part of the square root
    return x.equals(a.multiply(a));
}
In the above code, the following methods from the BigInteger class are used:
BigInteger multiply(BigInteger num) : Returns the product of this and num.
boolean equals(Object obj) : Checks for equality between this and obj.
BigInteger sqrt() : Returns the integral part of the square root of this.
I believe that the sqrt() method in Java uses Newton's method, which would model a binary search algorithm. The issqr(BigInteger x) method above must have the same complexity as the sqrt() method in BigInteger class. However, on comparing the run times for different values of x in the issqr(BigInteger x) method, it looks as though the run time is growing exponentially instead.
What is the reason for a binary search algorithm to have exponential run time complexity? Does it have anything to do with memory and the immutability of BigInteger datatype? Is there a more efficient algorithm to check if a number is a perfect square? Thank you in advance.
TL;DR - it is complicated!
According to Emil Jeřábek in https://cstheory.stackexchange.com/a/9709
The square root of an N-digit number can be computed in time O(M(N)) using e.g. Newton's iteration, where M(N) is the time needed to multiply two N-digit integers. The current best bound on M(N) is N log N 2^O(log* N), using Fürer's algorithm.
So the theoretical complexity of the complete check would be O(M(N)) + O(M(N/2)) which reduces to O(M(N)).
In practice, we need to look at how BigInteger is implemented. According to comments in the Java 11 source code
"The implementation [of MutableBigInteger.sqrt()] is based on the material in Henry S. Warren, Jr., Hacker's Delight (2nd ed.) (Addison Wesley, 2013), 279-282.
According to the source code, Java 11 BigInteger.multiply(BigInteger) implementation uses:
a naive "grade school" algorithm for small numbers,
the Karatsuba algorithm, for intermediate numbers, or
an "optimal" 3-way Toom-Cook algorithm for really large numbers.
The latter is described in "Towards Optimal Toom-Cook Multiplication for Univariate and Multivariate Polynomials in Characteristic 2 and 0" by Marco Bodrato, in C. Carlet and B. Sunar, eds., WAIFI'07 proceedings.
I don't have access to the references to check what they say about the complexity of 3-way Toom-Cook or Warren's algorithm respectively. However, Wikipedia says that Karatsuba multiplication for N-digit numbers has an asymptotic bound of Θ(N**log2(3)).
Based on that, we can say that checking if an N-digit number is a perfect square using BigInteger is likely to be O(N**log2(3)) == O(N**~1.585) or better.
I want to efficiently calculate ((X+Y)!/(X!Y!)) % P (where P is something like 10^9+7).
This discussion gives some insights on distributing modulo over division.
My concern is that a modular inverse does not always exist for a number.
Basically, I am looking for a code implementation of solving the problem.
For multiplication it is very straightforward:
public static int mod_mul(int Z, int X, int Y, int P)
{
    // Z=(X+Y) the factorial we need to calculate, P is the prime
    long result = 1;
    while (Z > 1)
    {
        result = (result * Z) % P;
        Z--;
    }
    return (int) result;   // already reduced mod P, so it fits in an int
}
I also realize that many factors can cancel in the division (before taking the modulus), but as the number of divisors increases, I'm finding it difficult to come up with an efficient algorithm to divide (looping over the list of factors(X) + factors(Y) + ... to see which divides the current multiplying factor of the numerator).
Edit: I don't want to use BigInt solutions.
Is there any Java/Python-based solution, or any standard algorithm/library, for cancelling factors (if the inverse option is not foolproof) or otherwise approaching this type of problem?
((X+Y)!/(X!Y!)) is a low-level way of spelling a binomial coefficient ((X+Y)-choose-X). And while you didn't say so in your question, a comment in your code implies that P is prime. Put those two together, and Lucas's theorem applies directly: http://en.wikipedia.org/wiki/Lucas%27_theorem.
That gives a very simple algorithm based on the base-P representations of X+Y and X. Whether BigInts are required is impossible to guess because you didn't give any bounds on your arguments, beyond that they're ints. Note that your sample mod_mul code may not work at all if, e.g., P is greater than the square root of the maximum int (because result * Z may overflow then).
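For concreteness, a hedged sketch of that base-P walk might look like the following (all names here are mine; it is only practical when P is small enough that each per-digit binomial is cheap, and when P > X+Y it degenerates to a single ordinary binomial mod P):

// Illustrative Lucas's-theorem sketch: C(n, k) mod p for prime p, obtained by
// multiplying the binomials of the base-p digits of n and k.
static long lucasBinomMod(long n, long k, long p) {
    long result = 1;
    while (n > 0 || k > 0) {
        long ni = n % p, ki = k % p;
        if (ki > ni) {
            return 0;                           // a digit of k exceeds the digit of n
        }
        result = result * digitBinomMod(ni, ki, p) % p;
        n /= p;
        k /= p;
    }
    return result;
}

// C(n, k) mod p for 0 <= k <= n < p, via the multiplicative formula and a
// Fermat inverse of the denominator (valid because p is prime).
static long digitBinomMod(long n, long k, long p) {
    long num = 1, den = 1;
    for (long i = 0; i < k; i++) {
        num = num * ((n - i) % p) % p;
        den = den * ((i + 1) % p) % p;
    }
    return num * powMod(den, p - 2, p) % p;
}

// Binary exponentiation: base^exp mod mod.
static long powMod(long base, long exp, long mod) {
    long r = 1;
    base %= mod;
    while (exp > 0) {
        if ((exp & 1) == 1) {
            r = r * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return r;
}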
It's a binomial coefficient: C(x+y, x).
You can calculate it differently via C(n,m) = C(n-1,m) + C(n-1,m-1).
If you are OK with time complexity O(x*y), the code will be much simpler.
http://en.wikipedia.org/wiki/Combination
What you need here is a way to do it efficiently:
C(n,k) = C(n-1,k) + C(n-1,k-1)
Use dynamic programming to calculate it efficiently with a bottom-up approach:
C(n,k)%P = ((C(n-1,k))%P + (C(n-1,k-1))%P)%P
Therefore F(n,k) = (F(n-1,k)+F(n-1,k-1))%P
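A minimal sketch of that bottom-up recurrence, using a rolling row so the memory stays at O(k) (the method name is illustrative):

// Pascal's-triangle DP taken mod P: F(n, k) = (F(n-1, k) + F(n-1, k-1)) % P.
// O(n*k) time, O(k) space.
static long binomModDP(int n, int k, long P) {
    if (k < 0 || k > n) {
        return 0;
    }
    long[] row = new long[k + 1];
    row[0] = 1 % P;
    for (int i = 1; i <= n; i++) {
        // Walk right to left so row[j - 1] still holds the previous row's value.
        for (int j = Math.min(i, k); j >= 1; j--) {
            row[j] = (row[j] + row[j - 1]) % P;
        }
    }
    return row[k];
}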
Another, faster approach:
C(n,k) = C(n-1,k-1)*n/k
F(n,k) = ((F(n-1,k-1)*n)%P*inv(k)%P)%P
inv(k)%P means modular inverse of k.
Note: evaluate C(n, n-k) instead if n-k < k, because C(n, n-k) = C(n, k).
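A sketch of that inverse-based route, assuming P is prime so the inverse always exists by Fermat's little theorem (the helper names are mine, and it requires n < P; otherwise the factorials vanish mod P and Lucas's theorem, as in the answer above, is needed):

// C(n, k) mod P via precomputed factorials and a Fermat inverse (P prime).
// fact[i] stays below P, so every intermediate product fits in a long.
static long binomMod(int n, int k, long P) {
    if (k < 0 || k > n) {
        return 0;
    }
    long[] fact = new long[n + 1];
    fact[0] = 1 % P;
    for (int i = 1; i <= n; i++) {
        fact[i] = fact[i - 1] * i % P;
    }
    long denominator = fact[k] * fact[n - k] % P;
    return fact[n] * modPow(denominator, P - 2, P) % P;
}

// Binary exponentiation: base^exp mod mod in O(log exp) multiplications.
static long modPow(long base, long exp, long mod) {
    long result = 1;
    base %= mod;
    while (exp > 0) {
        if ((exp & 1) == 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

For the original question this would be binomMod(X + Y, X, 1_000_000_007L), provided X + Y < P.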
I have given run-time functions for two algorithms solving the same problem. Let's say -
For First algorithm : T(n) = an + b (Linear in n)
For second Algorithm: T(n) = xn^2 + yn + z (Quadratic in n)
Every book says linear time is better than quadratic, and of course it is for big enough n (but how big?). I feel the definition of "big" changes based on the constants a, b, x, y and z.
Could you please let me know how to find the threshold for n when we should switch to algo1 from algo2 and vice-versa (is it found only through experiments?). I would be grateful if someone can explain how it is done in professional software development organizations.
I hope I have explained my question clearly; if not, please let me know.
Thanks in advance for your help.
P.S. - The implementation would be in Java and expected to run on various platforms. I find it extremely hard to estimate the constants a, b, x, y and z mathematically. How do we solve this dilemma in professional software development?
I would always use the O(n) one; for smaller n it might be slower, but then n is small anyway. The added complexity in your code will make it harder to debug and maintain if it tries to choose the optimal algorithm for each dataset.
It is impossible to estimate the fixed factors in all cases of practical interest. Even if you could, it would not help unless you could also predict how the size of the input is going to evolve in the future.
The linear algorithm should always be preferred unless other factors come into play as well (e.g. memory consumption). If the practical performance is not acceptable you can then look for alternatives.
Experiment. I also encountered a situation in which we had code to find a particular instance in a list of instances. The original code did a simple loop, which worked well for several years.
Once, one of our customers logged a performance problem. In his case the list contained several thousands of instances and the lookup was really slow.
The solution of my fellow developer was to add hashing to the list, which indeed solved the customer's problem. However, now other customers started to complain because they suddenly had a performance problem. It seemed that in most cases, the list only contained a few (around 10) entries, and the hashing was much slower than just looping over the list.
The final solution was to measure the time of both alternatives (looping vs. hashing) and determine the point at which looping becomes slower than hashing. In our case this was about 70. So we changed the algorithm:
If the list contains fewer than 70 items we loop
If the list contains more than 70 items we hash
The solution will probably be similar in your case.
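A hedged sketch of that kind of size-based dispatch could look like this (the class and field names are hypothetical, and 70 is just the crossover this particular team measured; yours may differ):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative size-based dispatch: loop for small lists, hash for large ones.
class ThresholdLookup<T> {
    private static final int THRESHOLD = 70;   // measured crossover, not universal
    private final List<T> items;
    private final Set<T> index;                // only built when the list is large

    ThresholdLookup(List<T> items) {
        this.items = items;
        this.index = items.size() < THRESHOLD ? null : new HashSet<>(items);
    }

    boolean contains(T wanted) {
        if (index != null) {
            return index.contains(wanted);     // large list: O(1) expected
        }
        for (T item : items) {                 // small list: cheap linear scan
            if (item.equals(wanted)) {
                return true;
            }
        }
        return false;
    }
}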
You are asking a maths question, not a programming one.
NB I am going to assume x is positive...
You need to know when
an+b < xn^2 + yn + z
ie
0 < xn^2 + (y-a)n + (z-b)
You can plug this into the standard equation for solving quadratics http://en.wikipedia.org/wiki/Quadratic_equation#Quadratic_formula
Take the larger root; then you know that for all values of n greater than it (since x is positive) the quadratic is bigger.
You end up with a horrible equation involving x, y, a, z, and b that I very much doubt is any use to you.
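For what it's worth, the calculation itself is short; a sketch, under the same assumption that x > 0:

// Smallest n beyond which a*n + b beats x*n^2 + y*n + z, i.e. the larger root
// of x*n^2 + (y - a)*n + (z - b) = 0. Assumes x > 0.
static double crossover(double a, double b, double x, double y, double z) {
    double B = y - a;
    double C = z - b;
    double disc = B * B - 4 * x * C;
    if (disc < 0) {
        return 0;   // the quadratic cost already exceeds the linear cost for every n >= 0
    }
    return (-B + Math.sqrt(disc)) / (2 * x);
}

For n above the returned value the linear algorithm wins; whether that value matters in practice still depends on actually knowing a, b, x, y and z.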
Just profile the code with the expected input sizes; it's even better if you also add a worst-case input. Don't waste your time solving the equation, which might be impossible to derive in the first place.
Generally, you can expect an O(n^2) algorithm to be significantly slower than an O(n) one from around n = 10000. Significantly slower here means that any human can notice it is slower. Depending on the complexity of the algorithm, you might notice the difference at a smaller n.
The point is: judging an algorithm by its time complexity lets us discard algorithms that are clearly too slow at the largest input size. However, depending on the domain of the input data, an algorithm with higher complexity can in practice outperform another with lower time complexity.
When we write an algorithm for a large-scale purpose, we want it to perform well for large n. In your case, depending upon a, b, x, y and z, the second algorithm may perform better even though it is quadratic. But no matter what the values of a, b, x, y and z are, there is some lower limit of n (say n0) beyond which the first algorithm (the linear one) will always be faster than the second.
If f(n) = O(g(n))
then there exist constants c1 > 0 and n0 such that
f(n) <= c1*g(n) for all n >= n0
So
if g(n) = n,
then f(n) = O(n)
So choose the algorithm depending on your expected range of n.
Consider the case where you want to test every possible input value. Creating a case where you can iterate over all the possible ints is fairly easy, as you can just increment the value by 1 and repeat.
How would you go about doing this same idea for all the possible double values?
You can iterate over all possible long values and then use Double.longBitsToDouble() to get a double for each possible 64-bit combination.
Note however that this will take a while. If you require 100 nanoseconds of processing for each double value, it will take roughly 2^64 * 1e-7 seconds (not all bit combinations are distinct double numbers, e.g. the NaNs), which is more than 16e11 / 86400 / 365 ≈ 50,700 years on a single CPU. Unless you have a datacenter to do the computation, it is a better idea to cover the range of possible input values by sampling the interval at a configurable number of points.
The analogous feat for float is still difficult but doable: assuming you need 10 milliseconds of processing for each input value, you need roughly 2^32 * 1e-2 / 86400 ≈ 497.1 days on a single CPU. You would use Float.intBitsToFloat() in this case.
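A minimal sketch of the float version (the per-value work is left as a placeholder, since it depends on what you are testing; the double version with Double.longBitsToDouble() is analogous, but the loop counter then needs care because it must itself cover every long value):

// Enumerate every 32-bit pattern and reinterpret it as a float.
// Note: this includes both infinities and the many NaN bit patterns.
public static void forEachFloatBitPattern() {
    for (long bits = Integer.MIN_VALUE; bits <= Integer.MAX_VALUE; bits++) {
        float value = Float.intBitsToFloat((int) bits);
        // ... test 'value' here ...
    }
}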
Java's Double class lets you construct and take apart Double values into its constituent pieces. This, and an understanding of double representation, will allow you at least conceptually to enumerate all possible doubles. You will likely find that there are too many though.
do a loop like:
for (double v = Double.MIN_VALUE; v <= Double.MAX_VALUE; v = Math.nextUp(v)) {
// ...
}
but as already explained in Adam's answer, it will take a long time to run.
(This will create neither NaN nor Infinity. Note that it only covers the positive doubles: Double.MIN_VALUE is the smallest positive value, not the most negative one.)
This question already has answers here: Computational complexity of Fibonacci Sequence (12 answers; closed as a duplicate).
So, I've got a recursive method in Java for getting the n-th Fibonacci number. The only question I have is: what's the time complexity? I think it's O(2^n), but I may be mistaken. (I know that iterative is way better, but it's an exercise.)
public int fibonacciRecursive(int n)
{
    if (n == 1 || n == 2) return 1;
    else return fibonacciRecursive(n - 2) + fibonacciRecursive(n - 1);
}
Your recursive code has exponential runtime, but I don't think the base is 2; it is more likely the golden ratio (about 1.618). Of course, O(1.618^n) is automatically O(2^n) too.
The runtime can be calculated recursively:
t(1)=1
t(2)=1
t(n)=t(n-1)+t(n-2)+1
This is very similar to the recursive definition of the Fibonacci numbers themselves. The +1 in the recursive equation is probably irrelevant for large n, so I believe that it grows approximately as fast as the Fibonacci numbers, and those grow exponentially with the golden ratio as the base.
You can speed it up using memoization, i.e. caching already calculated results. Then it has O(n) runtime just like the iterative version.
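A small sketch of that memoized variant, keeping the question's convention that fib(1) = fib(2) = 1 (the array cache and method name are my own choices):

// Memoized Fibonacci: each fib(i) is computed once, so the runtime is O(n).
// cache[i] == 0 means "not computed yet"; long overflows past F(92).
static long fibonacciMemo(int n, long[] cache) {
    if (n == 1 || n == 2) {
        return 1;
    }
    if (cache[n] != 0) {
        return cache[n];
    }
    cache[n] = fibonacciMemo(n - 1, cache) + fibonacciMemo(n - 2, cache);
    return cache[n];
}

// Usage: long f40 = fibonacciMemo(40, new long[41]);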
Your iterative code has a runtime of O(n)
You have a simple loop with O(n) steps and constant time for each iteration.
You can use this to calculate F(n) in O(log n).
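The identity referred to above is not reproduced here; one standard way to reach the O(log n) bound is fast doubling, which is equivalent to squaring the matrix [[1,1],[1,0]]. A sketch with plain longs (so it is only valid up to F(92)):

// Fast doubling: F(2k) = F(k) * (2*F(k+1) - F(k)), F(2k+1) = F(k)^2 + F(k+1)^2.
// Returns {F(n), F(n+1)} using 0-based Fibonacci numbers (F(0) = 0, F(1) = 1).
static long[] fibDoubling(int n) {
    if (n == 0) {
        return new long[] {0, 1};
    }
    long[] half = fibDoubling(n / 2);       // {F(k), F(k+1)} with k = n/2
    long a = half[0], b = half[1];
    long c = a * (2 * b - a);               // F(2k)
    long d = a * a + b * b;                 // F(2k+1)
    return (n % 2 == 0) ? new long[] {c, d} : new long[] {d, c + d};
}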
Each function call does exactly one addition, or returns 1. The base cases only return the value one, so the total number of additions is fib(n)-1. The total number of function calls is therefore 2*fib(n)-1, so the time complexity is Θ(fib(n)) = Θ(phi^n), which is bounded by O(2^n).
O(2^n)? I see only O(n) here.
I wonder why you'd continue to calculate and re-calculate these? Wouldn't caching the ones you have be a good idea, as long as the memory requirements didn't become too odious?
Since they aren't changing, I'd generate a table and do lookups if speed mattered to me.
It's easy to see (and to prove by induction) that the number of base-case calls to fibonacciRecursive is exactly equal to the final value returned (and the total number of calls is about twice that). That is indeed exponential in the input number.