I have one sorted ArrayList A and one unsorted ArrayList B, and I want to merge the items of B into A such that A remains sorted.
Now I can think of only two ways to do this.
The first is to sort ArrayList B and then keep two index positions, one for ArrayList A and the other for ArrayList B, moving them forward one by one to insert B's items into A.
Let us assume the size of ArrayList A is n and the size of ArrayList B is m.
The order of complexity will be O(m log m) (for sorting ArrayList B) + O(n + m) (for the merge).
The second approach is to keep an index into ArrayList B and use binary search to place each of its items into A.
The order of complexity will be O(m log n).
Now can anybody please tell me which approach I should opt for? Also, if you can think of any approach better than these two, please mention it.
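For concreteness, a minimal Java sketch of the first approach (sort B, then a single merge pass into a fresh list); the method name and the Integer element type are illustrative assumptions:

import java.util.ArrayList;
import java.util.Collections;

static ArrayList<Integer> sortAndMerge(ArrayList<Integer> a, ArrayList<Integer> b) {
    Collections.sort(b);                                   // O(m log m)
    ArrayList<Integer> out = new ArrayList<>(a.size() + b.size());
    int i = 0, j = 0;
    while (i < a.size() && j < b.size()) {                 // O(n + m) merge
        if (a.get(i) <= b.get(j)) out.add(a.get(i++));
        else out.add(b.get(j++));
    }
    while (i < a.size()) out.add(a.get(i++));              // drain the leftovers
    while (j < b.size()) out.add(b.get(j++));
    return out;
}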
It depends on the relative size of n and m.
When n > m*log(m), the run time of the first algorithm, with complexity O(m log m + max(n, m)), would be dominated by the linear term in n (notice that in this scenario max(n, m) = n, since n > m log m). In this case the second algorithm, with complexity O(m log n), would be better.
The exact practical cutoff point would depend on the constant factors of each particular implementation, but in principle the second algorithm becomes better as n grows in relation to m, and eventually becomes the better option. In other words, for every possible value of m there exists a big enough value of n for which the second algorithm is better.
EDIT: THE ABOVE IS PARTLY WRONG
I answered assuming the given complexities for both algorithms, but now I'm not sure the complexity of the second one is correct. You propose inserting each number from the unsorted list into the sorted list using binary search, but how exactly would you do this? If you have a linked list you cannot do binary search. If you have an array you need to shift part of the array on each insert, and that is a linear overhead on each insert. I'm not sure whether there is a way to achieve this with a more complex data structure, but you cannot do it with either a linked list or an array.
To clarify, if you had two algorithms with those time complexities, then my original answer holds, but your second algorithm doesn't have the O(m log(n)) complexity we assumed.
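To illustrate that point, a hedged Java sketch of the proposed second approach: Collections.binarySearch finds the insertion point in O(log n), but ArrayList.add(index, element) still shifts the tail of the backing array, so each insert costs O(n) on top of the search.

import java.util.ArrayList;
import java.util.Collections;

static void insertAllSorted(ArrayList<Integer> a, ArrayList<Integer> b) {
    for (int x : b) {
        int pos = Collections.binarySearch(a, x);  // O(log n) search
        if (pos < 0) pos = -pos - 1;               // decode the insertion point
        a.add(pos, x);                             // O(n) element shift
    }
}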
Binary-search insertion: m * log(n) = O(m lg n)
Sort then merge: m * log(m) + n + m = O(m lg m + n)
if (n <= m) {
    use binary-search insertion
} else {
    use sort-then-merge
}
I came across this question recently but didn't get any idea about solving it. Can someone help me with pseudo-code?
Given an array with four integers A, B, C, D, shuffle them in some order. If the integers are unique then there are 24 shuffles. My task is to find the shuffle such that
F(S) = abs(s[0]-s[1]) + abs(s[1]-s[2]) + abs(s[2]-s[3])
is maximum
For example, consider
A=5, B=3, C=-1, D=5
s[0]=5, s[1]=-1, s[2]=5, s[3]=3
gives the maximum sum, which is
F(S) = 14
The required time and space complexity are O(1).
Since your array has a bounded size, any algorithm you use that terminates will have time and space complexity O(1). Therefore, the simple algorithm of "try all permutations and find the best one" will solve the problem in the appropriate time bounds. I don't mean to say that this is by any stretch of the imagination the ideal algorithm, but if all you need is something that works in time/space O(1), then you've got your answer.
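As a hedged sketch of that brute force in Java (enumerating the 24 orderings by recursive swapping; the method names are illustrative):

static int score(int[] s) {
    return Math.abs(s[0]-s[1]) + Math.abs(s[1]-s[2]) + Math.abs(s[2]-s[3]);
}

static int bestScore(int[] s, int k) {
    if (k == s.length) return score(s);           // one full permutation built
    int best = Integer.MIN_VALUE;
    for (int i = k; i < s.length; i++) {
        int t = s[k]; s[k] = s[i]; s[i] = t;      // put element i in slot k
        best = Math.max(best, bestScore(s, k + 1));
        t = s[k]; s[k] = s[i]; s[i] = t;          // undo the swap
    }
    return best;
}

// bestScore(new int[]{5, 3, -1, 5}, 0) returns 14 for the example above.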
Hope this helps!
Algorithm
Consider laying out your points in sorted order:
A B C D
Let x be the distance AB
Let y be the distance BC
Let z be the distance CD
An order which will always give the best score is BDAC with score 2x+3y+2z.
Example
In your example, the sorted points are:
A=-1, B=3, C=5, D=5
x=4, y=2, z=0
So the best order will be BDAC=3->5->-1->5 with score 14.
Hints towards Proof
You can prove this result by simply considering all permutations of the path between the 4 points and computing the score in terms of x, y, z.
e.g.
ABCD -> x+y+z
ACBD -> x+3y+z
ADBC -> x+3y+2z
etc.
In any permutation, the score will use x at most twice (because A is on the end so the route can only go to or from A twice). Similarly, z is used at most twice because D is on the end. y can be used at most three times because there are three things being added.
The permutation BDAC uses x twice, z twice, and y three times so can never be beaten.
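A short Java sketch of this closed-form answer (sort ascending to get A<=B<=C<=D, then take the order B, D, A, C); the method name is illustrative:

import java.util.Arrays;

static int maxShuffleScore(int[] v) {
    int[] p = v.clone();
    Arrays.sort(p);                            // p = {A, B, C, D}
    int[] s = { p[1], p[3], p[0], p[2] };      // order B, D, A, C
    return Math.abs(s[0]-s[1]) + Math.abs(s[1]-s[2]) + Math.abs(s[2]-s[3]);
}

// maxShuffleScore(new int[]{5, 3, -1, 5}) returns 14, matching the example.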
If the array is sorted, this solution also works:
F(S)= 2*abs(s[0]-s[3]) + abs(s[1]-s[2])
where s[0]=A, s[1]=B, s[2]=C and s[3]=D.
I have been given run-time functions for two algorithms solving the same problem. Let's say:
For the first algorithm: T(n) = an + b (linear in n)
For the second algorithm: T(n) = xn^2 + yn + z (quadratic in n)
Every book says linear time is better than quadratic, and of course it is for big enough n (but how big?). I feel the definition of "big" changes based on the constants a, b, x, y and z.
Could you please tell me how to find the threshold value of n at which we should switch from one algorithm to the other (is it found only through experiments)? I would be grateful if someone could explain how this is done in professional software development organizations.
I hope I have been able to explain my question; if not, please let me know.
Thanks in advance for your help.
P.S. - The implementation would be in Java and expected to run on various platforms. I find it extremely hard to estimate the constants a, b, x, y and z mathematically. How do we solve this dilemma in professional software development?
I would always use the O(n) one. For smaller n it might be slower, but n is small anyway. The added complexity in your code will make it harder to debug and maintain if it tries to choose the optimal algorithm for each dataset.
It is impossible to estimate the constant factors in all cases of practical interest. Even if you could, it would not help unless you could also predict how the size of the input is going to evolve in the future.
The linear algorithm should always be preferred unless other factors come into play as well (e.g. memory consumption). If the practical performance is not acceptable you can then look for alternatives.
Experiment. I also encountered a situation in which we had code to find a particular instance in a list of instances. The original code did a simple loop, which worked well for several years.
Once, one of our customers logged a performance problem. In his case the list contained several thousand instances and the lookup was really slow.
The solution of my fellow developer was to add hashing to the list, which indeed solved the customer's problem. However, now other customers started to complain because they suddenly had a performance problem. It seemed that in most cases, the list only contained a few (around 10) entries, and the hashing was much slower than just looping over the list.
The final solution was to measure the time of both alternatives (looping vs. hashing) and determine the point at which looping becomes slower than hashing. In our case this was about 70 items. So we changed the algorithm:
If the list contains fewer than 70 items, we loop.
If the list contains more than 70 items, we hash.
The solution will probably be similar in your case.
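A hedged Java sketch of such a hybrid (the class name, the lazily built hash index, and the threshold of 70 are illustrative; the right cutoff has to be measured on your own workload):

import java.util.*;

class HybridLookup {
    static final int THRESHOLD = 70;     // determined by measurement
    private final List<Integer> items;
    private Set<Integer> index;          // built lazily for large lists

    HybridLookup(List<Integer> items) { this.items = items; }

    boolean contains(int key) {
        if (items.size() < THRESHOLD) return items.contains(key); // linear loop
        if (index == null) index = new HashSet<>(items);          // hash once, reuse
        return index.contains(key);
    }
}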
You are asking a maths question, not a programming one.
NB I am going to assume x is positive...
You need to know when
an+b < xn^2 + yn + z
i.e.
0 < xn^2 + (y-a)n + (z-b)
You can plug this into the standard equation for solving quadratics http://en.wikipedia.org/wiki/Quadratic_equation#Quadratic_formula
Take the larger root; then you know that for all values of n greater than it (as x is positive), the O(n^2) algorithm is slower.
You end up with a horrible expression involving x, y, a, z, and b that I very much doubt is of any use to you.
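That said, evaluating that root numerically is trivial; a hedged Java sketch (the function name is illustrative, and it assumes x > 0 as above):

static double crossover(double a, double b, double x, double y, double z) {
    double p = y - a, q = z - b;             // solve x*n^2 + p*n + q = 0
    double disc = p * p - 4 * x * q;
    if (disc < 0) return 0;                  // no real root: quadratic is slower for all n >= 0
    return (-p + Math.sqrt(disc)) / (2 * x); // larger root: beyond it, O(n^2) loses
}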
Just profile the code with the expected input sizes; it's even better if you also add a worst-case input. Don't waste your time solving the equation, which might be impossible to derive in the first place.
Generally, you can expect O(n^2) to be significantly slower than O(n) from around n = 10000, where "significantly slower" means any human can notice it. Depending on the constants of the algorithms, you might notice the difference at smaller n.
The point is: judging algorithms by time complexity lets us discard those that are clearly too slow at the largest input size. However, depending on the domain of the input data, an algorithm with higher complexity can in practice outperform another with lower time complexity.
When we write an algorithm for a large-scale purpose, we want it to perform well for large n. In your case, depending upon a, b, x, y and z, the second algorithm may perform better even though it is quadratic. But no matter what the values of a, b, x, y and z are, there is some lower limit of n (say n0) beyond which the first (linear) algorithm will always be faster than the second.
If f(n) = O(g(n)),
then for all n >= n0 (some constant),
f(n) <= c1*g(n).
So if g(n) = n,
then f(n) = O(n).
So choose the algorithm depending upon your usage of n.
Question: Given a sorted array A, find all possible differences of elements from A.
My solution:
for (int i = 0; i < n - 1; ++i) {
    for (int j = i + 1; j < n; ++j) {
        System.out.println(Math.abs(a[i] - a[j]));  // a is the sorted array
    }
}
Sure, it's O(n^2), but I don't overcount anything. I looked online and found this: http://www.careercup.com/question?id=9111881. It says you can't do better, but at an interview I was told you can do O(n). Which is right?
A first thought is that you aren't using the fact that the array is sorted. Let's assume it's in increasing order (decreasing can be handled analogously).
We can also use the fact that the differences telescope (i>j):
a_i - a_j = (a_i - a_(i-1)) + (a_(i-1) - a_(i-2)) + ... + (a_(j+1) - a_j)
Now build a new sequence, call it s, that holds the consecutive differences, s_i = a_i - a_(i-1). This takes only one pass (O(n)), and you may as well skip over repeats, meaning skip a_i if a_i = a_(i+1).
All possible differences a_i - a_j with i > j are of the form s_(j+1) + s_(j+2) + ... + s_i. So maybe if you count that as having found them, then you did it in O(n) time. To print them, however, may take as many as n(n-1)/2 lines, and that's definitely O(n^2).
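A small Java sketch of that one pass (the method name is illustrative; repeats are skipped as described):

import java.util.Arrays;

static int[] consecutiveDiffs(int[] a) {
    int[] s = new int[a.length - 1];
    int k = 0;
    for (int i = 1; i < a.length; i++) {
        if (a[i] != a[i - 1]) s[k++] = a[i] - a[i - 1];  // skip repeats
    }
    return Arrays.copyOf(s, k);   // trim to the number of kept differences
}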
For example, for an array with the elements {2^1, 2^2, ..., 2^n} there are n(n-1)/2 possible differences, and no two of them are equal. So there are O(n^2) distinct differences.
Since you have to enumerate all of them, you also need at least O(n^2) time.
Sorted or unsorted doesn't matter: if you have to calculate each difference, there is no way to do it in less than n^2.
Either the question was asked wrong, or you just do O(n) work and then print 42 the other n times :D
You can get another counter-example by assuming the array contents are random integers before sorting. Then the chance that two differences, Ai - Aj vs Ak - Al, or even Ai - Aj vs Aj - Ak, are the same is too small for there to be only O(n) distinct differences Ai - Aj.
Given that, the question to your interviewer is to explain the special circumstances that allow an O(n) solution. One possibility is that the array values are all numbers in the range 0..n, because in this case the maximum absolute difference is only n.
I can do this in O(n lg n), but not O(n). Represent the array contents by a 0/1 array of size n+1 with element i set to 1 wherever the value i occurs in the array. Then use an FFT to convolve that array with a reversed copy of itself (a correlation): there is a difference Ai - Aj = k exactly where the entry at offset k of the correlation is non-zero.
If the interviewer is fond of theoretical games, perhaps he was thinking of using a table of inputs and results? Any problem with a limit on the size of the input, and that has a known solution, can be solved by table lookup, given that you have first created and stored that table, which might be large.
So if the array size is limited, the problem can be solved by table lookup, which (given some assumptions) can even be done in constant time. Granted, even for a maximum array size of two (assuming 32-bit integers) the table will not fit in a normal computer's memory, or on the disks. For larger max sizes of the array, you're into "won't fit in the known universe" size. But, theoretically, it can be done.
(But in reality, I think that Jens Gustedt's comment is more likely.)
Yes, you can surely do that; it's a slightly tricky method.
To find the differences in O(n) bitset operations you will need to use bitset (C++) or a similar data structure in the respective language.
Initialize two bitsets, say A and B.
Then, for each iteration through the array:
1. Store the consecutive difference in bitset A.
2. Left-shift B by that difference.
3. Store the consecutive difference in bitset B.
4. Take A = A | B.
For example, I have given code below; here N is the size of the (sorted) array arr:
const int MAXDIFF = 1 << 20;         // assumed upper bound on any difference
std::bitset<MAXDIFF> A, B;
for (int i = 1; i < N; i++) {
    int diff = arr[i] - arr[i-1];    // consecutive difference
    A[diff] = 1;                     // step 1
    B <<= diff;                      // step 2: previous differences grow by diff
    B[diff] = 1;                     // step 3
    A |= B;                          // step 4: B now holds arr[i]-arr[j] for all j < i
}
The bits in A that are 1 are exactly the differences.
First of all, the array needs to be sorted.
Let's take a sorted array ar = {1,2,3,4}.
This is what we were doing in the O(n^2) solution:
for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++) sum += abs(ar[i] - ar[j]);
If we write out the operations elaborately, they look like this:
when i = 0 | sum = sum + {(2-1)+(3-1)+(4-1)}
when i = 1 | sum = sum + {(3-2)+(4-2)}
when i = 2 | sum = sum + {(4-3)}
If we write them all out:
sum = (-1-1-1) + (2-2-2) + (3+3-3) + (4+4+4)
We can see that:
the number at index 0 is added to the sum 0 times and subtracted from it 3 times,
the number at index 1 is added 1 time and subtracted 2 times,
the number at index 2 is added 2 times and subtracted 1 time,
the number at index 3 is added 3 times and subtracted 0 times.
So we can say that the number at index i will be added to the sum i times
and subtracted from it (n-i)-1 times.
Then the generalized expression for each element is:
sum = sum + i*a[i] - ((n-i)-1)*a[i];
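Putting the derivation together, a minimal Java sketch of the O(n) computation (the method name is illustrative; the input must be sorted):

static long sumOfAllDifferences(int[] a) {
    int n = a.length;
    long sum = 0;
    for (int i = 0; i < n; i++) {
        sum += (long) i * a[i] - (long) (n - i - 1) * a[i];  // added i times, subtracted n-i-1 times
    }
    return sum;   // e.g. {1,2,3,4} gives 10, matching the O(n^2) loop
}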
I am required to implement the NTRU Public Key Cryptosystem as part of my final year university project. I'm trying to implement an algorithm that multiplies long polynomials via recursion, but I'm quite bogged down trying to understand the pseudo-code.
Algorithm PolyMult(c, b, a, n, N)
Require: N, n, and the polynomial operands, b and c.
PolyMult returns the product polynomial a through the argument list
PolyMult(a,b,c,n,N)
{
1. if(...)
2. {
3. ...
4. ...
5. ...
6. ...
7. }
8. else
9. {
10. n1 = n/2;
11. n2 = n-n1;
12. b = b1+b2*X^(n1);
13. c = c1+c2*X^(n1);
14. B = b1+b2;
15. C = c1+c2;
16. PolyMult(a1,b1,c1,n1,N);// a1 = b1*c1
17. PolyMult(a2,b2,c2,n2,N);// a2=b2*c2
18. PolyMult(a3,B,C,n2,N);// a3 = B*C=(b1+b2)*(c1+c2)
19. a = a1 + (a3-a1-a2)*X^(n1) + a2*X^(2*n1);
20.}
}
Note that N, n, n1 and n2 are all of type int. a, a1, a2, b, b1, b2, c, c1, c2, B and C are all polynomials, represented as arrays.
On lines 16, 17 and 18, the function PolyMult is called with arguments a1,b1,c1,n1,N, then a2,b2,c2,n2,N, and finally a3,B,C,n2,N respectively. I have initialised the arrays a1, b1 and c1 before line 16; then I pass those into PolyMult itself (the recursion starts here!), get an answer back, and store it in some temporary array. For example, I might implement line 16 as follows:
int z[] = PolyMult(a1,b1,c1,n1,N);
Now my question is: when will the polynomial stored in array z[] be used again in the program? I see no indication from the pseudo-code that it will be used again, but if it is not, what is the point of line 16 and of the recursion altogether? How should I implement lines 16-18?
So to repeat: when and how will the polynomial stored in array z be used again in the program? And how should I go about implementing lines 16-18?
For more insight a full description of the pseudo-code can be found on page 3 of this article: http://www.ntru.com/cryptolab/pdf/NTRUTech010.pdf.
In the pseudo-code, the result is "returned" by storing it into the a[] array, which is given as a parameter. PolyMult(a1, b1, c1, n1, N) stores its result in a1[]; that result is then consumed on line 19, where a1, a2 and a3 are recombined into a.
That multiplication technique is simply Karatsuba multiplication, applied to polynomials (which makes it even easier, because there are no carries with polynomials). See this Wikipedia article for pointers.
Personally, I think that it is easier to understand from the math alone, rather than by following pseudo-code.
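For illustration, a hedged Java sketch of that Karatsuba recursion on plain coefficient arrays. It assumes both operands share the same power-of-two length n (the paper's pseudo-code handles odd n via n1 = n/2, n2 = n - n1, and NTRU additionally reduces modulo X^N - 1; both are omitted here), and it returns the product instead of writing through an argument:

import java.util.Arrays;

static int[] polyMult(int[] b, int[] c) {
    int n = b.length;
    int[] a = new int[2 * n - 1];
    if (n <= 4) {                                  // schoolbook base case
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                a[i + j] += b[i] * c[j];
        return a;
    }
    int n1 = n / 2;
    int[] b1 = Arrays.copyOfRange(b, 0, n1), b2 = Arrays.copyOfRange(b, n1, n);
    int[] c1 = Arrays.copyOfRange(c, 0, n1), c2 = Arrays.copyOfRange(c, n1, n);
    int[] B = new int[n1], C = new int[n1];
    for (int i = 0; i < n1; i++) { B[i] = b1[i] + b2[i]; C[i] = c1[i] + c2[i]; }
    int[] a1 = polyMult(b1, c1);                   // line 16: b1*c1
    int[] a2 = polyMult(b2, c2);                   // line 17: b2*c2
    int[] a3 = polyMult(B, C);                     // line 18: (b1+b2)*(c1+c2)
    for (int i = 0; i < a1.length; i++) {          // line 19: recombine
        a[i] += a1[i];                             // a1
        a[i + n1] += a3[i] - a1[i] - a2[i];        // (a3 - a1 - a2)*X^n1
        a[i + 2 * n1] += a2[i];                    // a2*X^(2*n1)
    }
    return a;
}

Notice that the three recursive results a1, a2 and a3 are all consumed in the final recombination loop, which is exactly what line 19 of the pseudo-code does; that is where the result of line 16 gets used.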
I work for NTRU, so I'm pleased to see this interest.
I'm not sure what parameter set you're using, but for a lot of NTRU parameter sets we find that the overhead involved in implementing Karatsuba isn't worth it. Say you're multiplying A and B. For NTRUEncrypt convolution operations, one of the polynomials involved is always binary or trinary. Say that's A. Then each coefficient in the result is a sum of a subset of
the coefficients of B. If you store A as the array of indices of coefficients that are non-zero, rather than storing it as an array of 1s and 0s, and if A is not too dense, then it's quicker to work through the array of indices than to do Karatsuba. The code is smaller and simpler too.
May I ask what university you're studying at?