Extracting rightmost N bits of an integer - java

In yesterday's Code Jam Qualification Round http://code.google.com/codejam/contest/dashboard?c=433101#s=a&a=0 there was a problem called Snapper Chain. From the contest analysis I learned that the problem requires bit twiddling: extracting the rightmost N bits of an integer and checking whether they are all 1. I saw a contestant's (Eireksten's) code which performed the said operation like below:
(((K&(1<<N)-1))==(1<<N)-1)
I couldn't understand how this works. What is the use of -1 there in the comparison? If somebody can explain this, it would be very useful for us rookies. Also, any tips on identifying this sort of problem would be much appreciated. I used a naive algorithm to solve this problem and ended up solving only the smaller data set. (It took a heck of a long time to process the larger data set, which is required to be submitted within 8 minutes.) Thanks in advance.

Let's use N=3 as an example. In binary, 1<<3 == 0b1000, so (1<<3) - 1 == 0b111. (The parentheses matter: in Java, - binds tighter than <<, so 1<<3 - 1 would actually parse as 1<<(3-1).)
In general, (1<<N) - 1 creates a number with N ones in binary form.
Let R = (1<<N) - 1. Then the expression becomes (K&R) == R. The K&R extracts the last N bits, for example:
  101001010
& 000000111
———————————
  000000010
(Recall that bitwise AND returns 1 in a digit if and only if both operands have a 1 in that digit.)
The equality holds if and only if the last N bits are all 1. Thus the expression checks if K ends with N ones.

For example: N=3, K=101010
1. (1<<N) = 001000 (left shift)
2. (1<<N)-1 = 000111 (http://en.wikipedia.org/wiki/Two's_complement)
3. K&((1<<N)-1) = 000010 (bitmask)
4. K&((1<<N)-1) == (1<<N)-1 → (000010 == 000111) → FALSE
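To make the steps above concrete, here is a small self-contained Java sketch of the same check (the class and method names are mine):

public class RightmostBits {
    static boolean lastNBitsSet(int k, int n) {
        int mask = (1 << n) - 1;   // n ones in binary, e.g. n=3 -> 0b111
        return (k & mask) == mask; // true iff the rightmost n bits of k are all 1
    }

    public static void main(String[] args) {
        System.out.println(lastNBitsSet(0b101010, 3)); // false: ends in 010
        System.out.println(lastNBitsSet(0b100111, 3)); // true:  ends in 111
    }
}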

I was working through the Snapper Chain problem and came here looking for an explanation on how the bit twiddling algorithm I came across in the solutions worked. I found some good info but it still took me a good while to figure it out for myself, being a bitwise noob.
Here's my attempt at explaining the algorithm and how to come up with it. If we enumerate all the possible power and ON/OFF states for each snapper in a chain, we see a pattern. Given the test case N=3, K=7 (3 snappers, 7 snaps), we show the power and ON/OFF states for each snapper for every kth snap:
            1    2    3
0b:001  1   1.1  1.0  0.0  -> ON for n=1
0b:010  2   1.0  0.1  0.0
0b:011  3   1.1  1.1  1.0  -> ON for n=1, n=2
0b:100  4   1.0  0.0  1.0
0b:101  5   1.1  1.0  1.0  -> ON for n=1
0b:110  6   1.0  0.1  0.1
0b:111  7   1.1  1.1  1.1  -> ON for n=1, n=2, n=3
The light bulb is on when all snappers are ON and receiving power, i.e. when the kth snap results in n 1s. Even more simply, the bulb is on when all of the snappers are ON, since they must all be receiving power to be ON (and hence the bulb). This means for every k snaps, we need n 1s.
Further, you can note that k is all binary 1s not only for k=7, which satisfies n=3, but also for k=3, which satisfies n=2, and for k=1, which satisfies n=1. Moreover, for n = 1 or 2 we see that for every number of snaps that turns the bulb on, the last n digits of k are always 1. We can generalize that any k that satisfies n snappers will be a binary number ending in n digits of 1.
We can use the expression noted by an earlier poster, that (1 << n) - 1 always gives us n binary digits of 1, or in this case, (1 << 3) - 1 = 0b111. If we treat our chain of n snappers as a binary number where each digit represents on/off, and we want n digits of one, this gives us our representation.
Now we want to find the cases where k ends in n binary digits of 1, which we do by performing a bitwise AND, k & ((1 << n) - 1), to get the last n digits of k, and then comparing the result to (1 << n) - 1.
I suppose this type of thinking comes more naturally after working with these types of problems a lot, but it's still intimidating to me and I doubt I would ever have come up with such a solution by myself!
Here's my solution in Perl (note the parentheses around the shift: in Perl, as in C and Java, - binds tighter than <<):
$tests = <>;
for (1..$tests) {
    ($n, $k) = split / /, <>;
    $m = (1 << $n) - 1;   # a mask of $n one-bits
    printf "Case #%d: %s\n", $_, (($k & $m) == $m) ? 'ON' : 'OFF';
}

I think we can recognize this kind of problem by first calculating the answer by hand for some series of N (for example 1, 2, 3, ...). After that, we will recognize the state changes and can write a function to automate the process (the first function). Run the program for some inputs, and notice the pattern.
Once we spot the pattern, we write a function representing that pattern (the second function) and compare the output of the first function against the second.
For the Code Jam case, we can run both functions against the small dataset and verify the outputs. If they are identical, we have a high probability that the second function can solve the large dataset in time.

Related

Competitive Coding - Clearing all levels with minimum cost: Not passing all test cases

I was solving problems on a competitive coding website when I came across this. The problem states that:
In this game there are N levels and M types of available weapons. The levels are numbered from 0 to N-1 and the weapons are numbered from 0 to M-1. You can clear these levels in any order. In each level, some subset of these M weapons is required to clear the level. If, in a particular level, you need to buy x new weapons, you will pay x^2 coins for it. Also note that you can carry all the weapons you currently have to the next level. Initially, you have no weapons. Can you find out the minimum coins required to clear all the levels?
Input Format
The first line of input contains 2 space separated integers:
N = the number of levels in the game
M = the number of types of weapons
N lines follow. The ith of these lines contains a binary string of length M. If the jth character of
this string is 1, it means we need a weapon of type j to clear the ith level.
Constraints
1 <= N <= 20
1 <= M <= 20
Output Format
Print a single integer which is the answer to the problem.
Sample TestCase 1
Input
1 4
0101
Output
4
Explanation
There is only one level in this game. We need 2 types of weapons - 1 and 3. Since Ben initially
has no weapons, he will have to buy these, which will cost him 2^2 = 4 coins.
Sample TestCase 2
Input
3 3
111
001
010
Output
3
Explanation
There are 3 levels in this game. The 0th level (111) requires all 3 types of weapons. The 1st level (001) requires only a weapon of type 2. The 2nd level requires only a weapon of type 1. If we clear the levels in the given order (0-1-2), the total cost = 3^2 + 0^2 + 0^2 = 9 coins. If we clear the levels in the order 1-2-0, it costs 1^2 + 1^2 + 1^2 = 3 coins, which is the optimal way.
Approach
I was able to figure out that we can calculate the minimum cost by traversing the binary strings in an order that purchases the minimum possible number of new weapons at each level.
One possible way could be traversing the array of binary strings and calculating the cost for each level once the array is arranged in the correct order. The correct order would be when the strings are sorted, i.e. 001, 010, 111 in the above test case. Traversing the array in this order and summing up the cost at each level gives the correct answer there.
Also, the sort method in Java works fine to sort these binary strings before running a loop over the array to sum up the cost of each level.
Arrays.sort(weapons);
This approach works fine for some of the test cases; however, more than half of the test cases are still failing and I can't understand what's wrong with my logic. I am using bitwise operators to calculate the number of weapons needed at each level and returning their square.
Unfortunately, I cannot see the test cases that are failing. Any help is greatly appreciated.
This can be solved by dynamic programming.
The state will be the bit mask of weapons we currently own.
The transitions will be to try clearing each of the n possible levels in turn from the current state, acquiring the additional weapons we need and paying for them.
In each of the n resulting states, we take the minimum cost of the current way to achieve it and all previously observed ways.
When we already have some weapons, some levels will actually require no additional weapons to be bought; such transitions are automatically harmless: we arrive at the same state having paid the same cost, so the minimum does not change.
We start at the state of m zeroes, having paid 0.
The end state is the bitwise OR of all the given levels, and the minimum cost to get there is the answer.
In pseudocode:
let mask[1], mask[2], ..., mask[n] be the given bit masks of the n levels
p2m = 2 to the power of m
f[0] = 0
all f[1], f[2], ..., f[p2m-1] = infinity
for state = 0, 1, 2, ..., p2m-1:
    current_cost = f[state]
    current_ones = popcount(state) // popcount is the number of 1 bits
    for level = 1, 2, ..., n:
        new_state = state | mask[level] // the operation is bitwise OR
        new_cost = current_cost + square (popcount(new_state) - current_ones)
        f[new_state] = min (f[new_state], new_cost)
mask_total = mask[1] | mask[2] | ... | mask[n]
the answer is f[mask_total]
The complexity is O(2^m * n) time and O(2^m) memory, which should be fine for m <= 20 and n <= 20 in most online judges.
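Here is a minimal Java sketch of that pseudocode (class and method names are mine; it assumes the level masks have already been parsed from the binary strings):

import java.util.Arrays;

public class MinCoins {
    static int minCoins(int[] mask, int m) {
        int p2m = 1 << m;
        int[] f = new int[p2m];
        Arrays.fill(f, Integer.MAX_VALUE);
        f[0] = 0;
        for (int state = 0; state < p2m; state++) {
            if (f[state] == Integer.MAX_VALUE) continue; // unreachable state
            int ones = Integer.bitCount(state);
            for (int lv : mask) {
                int next = state | lv;                      // bitwise OR, as above
                int bought = Integer.bitCount(next) - ones; // new weapons to buy
                f[next] = Math.min(f[next], f[state] + bought * bought);
            }
        }
        int total = 0;
        for (int lv : mask) total |= lv;
        return f[total];
    }

    public static void main(String[] args) {
        // Sample test case 2 from the question: levels 111, 001, 010 -> expected 3.
        System.out.println(minCoins(new int[]{0b111, 0b001, 0b010}, 3));
    }
}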
The dynamic programming idea by @Gassa could be extended using A*, estimating the minimum and maximum of the remaining cost, where
minRemaining(s) = bitCount(maxState - s)
maxRemaining(s) = bitCount(maxState - s)^2
(buying the missing weapons one at a time costs bitCount coins in total; buying them all at once costs bitCount^2).
Start with a priority queue - based on cost + minRemaining - containing just the empty state, and then keep replacing a state from this queue that has not reached maxState with up to n new states based on the n levels:
keep track of bound = min(cost(s) + maxRemaining(s)) over the queue,
and initialize all costs with bitCount(maxState)^2 + 1
extract the state with the lowest cost
if state != maxState
    remove state from queue
    for j in 1..n
        if (state|level[j] != state)
            cost(state|level[j]) = min(cost(state|level[j]),
                                       cost(state) + bitCount((state|level[j]) - state)^2)
            if cost(state|level[j]) + minRemaining(state|level[j]) <= bound
                add/replace state|level[j] in queue
            else break
The idea is to skip dead ends. So consider an example from a comment:
11100 cost 9 min 2 max 4
11110 cost 16 min 1 max 1
11111 cost 25 min 0 max 0
00011 cost 4 min 3 max 9
bound 13
remove 00011 and replace with 11111 (skipping 00011 since no change)
11111 cost 13 min 0 max 0
11100 cost 9 min 2 max 4
11110 cost 16 min 1 max 1
remove 11100 and replace with 11110 11111 (skipping 11100 since no change):
11111 cost 13 min 0 max 0
11110 cost 10 min 1 max 1
bound 11
remove 11110 and replace with 11111 (skipping 11110 since no change)
11111 cost 11 min 0 max 0
bound 11
The number of operations should be similar to the dynamic programming solution in the worst case, but in many cases it will be better - and I don't know whether the worst case can actually occur.
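For illustration, here is a rough Java sketch of the A* idea (names are mine). It simplifies the pseudocode above: instead of tracking the bound and replacing queue entries in place, it re-inserts states into a java.util.PriorityQueue and skips stale entries, but the ordering key is the same cost + minRemaining:

import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class AStarWeapons {
    static int solve(int[] level) {
        int ms = 0;
        for (int lv : level) ms |= lv;
        final int maxState = ms;
        Map<Integer, Integer> cost = new HashMap<>();
        cost.put(0, 0);
        // Entries are {state, costSoFar}, ordered by costSoFar + minRemaining(state),
        // where minRemaining(s) = bitCount of the weapons still missing from s.
        PriorityQueue<int[]> pq = new PriorityQueue<>(
                (x, y) -> Integer.compare(
                        x[1] + Integer.bitCount(maxState & ~x[0]),
                        y[1] + Integer.bitCount(maxState & ~y[0])));
        pq.add(new int[]{0, 0});
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int state = cur[0], c = cur[1];
            if (c > cost.getOrDefault(state, Integer.MAX_VALUE)) continue; // stale entry
            if (state == maxState) return c; // minRemaining is admissible, so this is optimal
            for (int lv : level) {
                int next = state | lv;
                if (next == state) continue; // this level adds nothing from here
                int bought = Integer.bitCount(next) - Integer.bitCount(state);
                int nc = c + bought * bought;
                if (nc < cost.getOrDefault(next, Integer.MAX_VALUE)) {
                    cost.put(next, nc);
                    pq.add(new int[]{next, nc});
                }
            }
        }
        return 0; // reached only if there are no levels at all
    }

    public static void main(String[] args) {
        // Sample test case 2 from the question: expected answer 3.
        System.out.println(solve(new int[]{0b111, 0b001, 0b010}));
    }
}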
The logic behind this approach is that at each step you find the level whose binary string needs the minimum number of new set bits relative to the weapons acquired so far.
For example, we have the data:
4 3
101 - 2 bits
010 - 1 bit
110 - 2 bits
101 - 2 bits
Now, as 010 has the minimum number of bits, we compute its cost first and then update the current pattern (using bitwise OR), so the current pattern is 010.
Next we find the next minimum number of new set bits with respect to the current pattern.
I did this by first XORing the current pattern with the given number and then ANDing with the given number: (A^B)&A, where A is the given number and B is the current pattern.
So the bits become like this after the operation:
(101^010)&101 -> 101 - 2 bits
(110^010)&110 -> 100 - 1 bit
Now we know 110 needs the minimum number of new bits, so we pick it, compute the cost, update the pattern, and so on.
This method returns the cost of a string with respect to the current pattern:
private static int computeCost(String currPattern, String costString) {
    int a = currPattern.isEmpty() ? 0 : Integer.parseInt(currPattern, 2);
    int b = Integer.parseInt(costString, 2);
    int c = (a ^ b) & b;              // the bits of b not already present in the pattern a
    int count = Integer.bitCount(c);  // stands in for the original countSetBits helper
    return count * count;             // the cost is the square of the new weapons bought
}

Bitwise alternative modulo gives different results

I'm looking for an alternative to modulo in Java. The reason is performance.
I have to run a piece of code which loops many times and performs a modulo calculation on each iteration. Now I've read on quite a few websites that there is a bitwise alternative for this, but it gives a different result in the case of 1 % 3.
1 % 3; // results in 1
1 & (3-1); // results in 0
Can somebody explain this? Most calculations went fine, but this is one combination I found which does not give equal results.
For positive integers, i & (n-1) is equivalent to i % n only if n is a power of 2. It doesn't work for all numbers; otherwise we'd all be doing it the fast way all of the time.
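A quick sketch to see the rule in action (assuming non-negative values): i & (n-1) agrees with i % n for n = 4, a power of two, but not for n = 3:

public class ModuloCheck {
    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            // n = 3 (not a power of two): the bitwise form disagrees, e.g. at i = 1.
            // n = 4 (a power of two): i & 3 always matches i % 4.
            System.out.printf("%d %% 3 = %d but %d & 2 = %d;  %d %% 4 = %d and %d & 3 = %d%n",
                    i, i % 3, i, i & (3 - 1), i, i % 4, i, i & (4 - 1));
        }
    }
}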

Optimal merging of triplets

I'm trying to come up with an algorithm for the following problem :
I've got a collection of triplets of integers - let's call these integers A, B, C. The values stored inside can be big, so generally it's impossible to create an array of size A, B, or C. The goal is to minimize the size of the collection. To do this, we're provided a simple rule that allows us to merge triplets:
For two triplets (A, B, C) and (A', B', C'), remove the original triplets and place the triplet (A | A', B, C) if B == B' and C == C', where | is bitwise OR. Similar rules hold for B and C also.
In other words, if two values of two triplets are equal, remove these two triplets, bitwise OR the third values, and place the result into the collection.
The greedy approach is usually misleading in similar cases, and so it is for this problem, but I can't find a simple approach that'd lead to a correct solution. For a list of 250 items where the correct solution is 14, the average size computed by greedy merging is about 30 (it varies from 20 to 70). The sub-optimal overhead gets bigger as the list size increases.
I've also tried playing around with set bit counts, but I've found no meaningful results. Just the obvious fact that if the records are unique (which is safe to assume), the set bit count always increases.
Here's the stupid greedy implementation (it's just a conceptual thing, please don't regard the code style) :
import java.util.ArrayList;
import java.util.List;

public class Record {
    long A;
    long B;
    long C;

    public static void main(String[] args) {
        List<Record> data = new ArrayList<>();
        // Fill it with some data

        boolean found;
        do {
            found = false;
            outer:
            for (int i = 0; i < data.size(); ++i) {
                for (int j = i + 1; j < data.size(); ++j) {
                    try {
                        Record r = merge(data.get(i), data.get(j));
                        found = true;
                        data.remove(j);
                        data.remove(i);
                        data.add(r);
                        break outer;
                    } catch (IllegalArgumentException ignored) {
                    }
                }
            }
        } while (found);
    }

    public static Record merge(Record r1, Record r2) {
        if (r1.A == r2.A && r1.B == r2.B) {
            Record r = new Record();
            r.A = r1.A;
            r.B = r1.B;
            r.C = r1.C | r2.C;
            return r;
        }
        if (r1.A == r2.A && r1.C == r2.C) {
            Record r = new Record();
            r.A = r1.A;
            r.B = r1.B | r2.B;
            r.C = r1.C;
            return r;
        }
        if (r1.B == r2.B && r1.C == r2.C) {
            Record r = new Record();
            r.A = r1.A | r2.A;
            r.B = r1.B;
            r.C = r1.C;
            return r;
        }
        throw new IllegalArgumentException("Unable to merge these two records!");
    }
}
Do you have any idea how to solve this problem?
This is going to be a very long answer, sadly without an optimal solution (sorry). It is however a serious attempt at applying greedy problem solving to your problem, so it may be useful in principle. I didn't implement the last approach discussed, perhaps that approach can yield the optimal solution -- I can't guarantee that though.
Level 0: Not really greedy
By definition, a greedy algorithm has a heuristic for choosing the next step in a way that is locally optimal, i.e. optimal right now, hoping to reach the global optimum which may or may not be possible always.
Your algorithm chooses any mergable pair and merges them and then moves on. It does no evaluation of what this merge implies and whether there is a better local solution. Because of this I wouldn't call your approach greedy at all. It is just a solution, an approach. I will call it the blind algorithm just so that I can succinctly refer to it in my answer. I will also use a slightly modified version of your algorithm, which, instead of removing two triplets and appending the merged triplet, removes only the second triplet and replaces the first one with the merged one. The order of the resulting triplets is different and thus the final result possibly too. Let me run this modified algorithm over a representative data set, marking to-be-merged triplets with a *:
0: 3 2 3    3 2 3    3 2 3
1: 0 1 0*   0 1 2    0 1 2
2: 1 2 0    1 2 0*   1 2 1
3: 0 1 2*
4: 1 2 1    1 2 1*
5: 0 2 0    0 2 0    0 2 0
Result: 4
Level 1: Greedy
To have a greedy algorithm, you need to formulate the merging decision in a way that allows for comparison of options, when multiple are available. For me, the intuitive formulation of the merging decision was:
If I merge these two triplets, will the resulting set have the maximum possible number of mergable triplets, when compared to the result of merging any other two triplets from the current set?
I repeat, this is intuitive for me. I have no proof that this leads to the globally optimal solution, nor even that it will lead to a better-or-equal solution than the blind algorithm -- but it fits the definition of greedy (and is very easy to implement). Let's try it on the above data set, showing, between steps, the possible merges (as indices of triplet pairs) and the resulting number of mergables for each possible merge:
            mergables
0: 3 2 3    (1,3) -> 2
1: 0 1 0    (1,5) -> 1
2: 1 2 0    (2,4) -> 2
3: 0 1 2    (2,5) -> 2
4: 1 2 1
5: 0 2 0
Any choice except merging triplets 1 and 5 is fine, if we take the first pair, we get the same interim set as with the blind algorithm (I will this time collapse indices to remove gaps):
            mergables
0: 3 2 3    (2,3) -> 0
1: 0 1 2    (2,4) -> 1
2: 1 2 0
3: 1 2 1
4: 0 2 0
This is where this algorithm gets it differently: it chooses the triplets 2 and 4 because there is still one merge possible after merging them in contrast to the choice made by the blind algorithm:
            mergables
0: 3 2 3    (2,3) -> 0    3 2 3
1: 0 1 2                  0 1 2
2: 1 2 0                  1 2 1
3: 1 2 1
Result: 3
Level 2: Very greedy
Now, a second step from this intuitive heuristic is to look ahead one merge further and to ask the heuristic question then. Generalized, you would look ahead k merges further and apply the above heuristic, backtrack and decide the best option. This gets very verbose by now, so to exemplify, I will only perform one step of this new heuristic with lookahead 1:
            mergables
0: 3 2 3    (1,3) -> (2,3) -> 0
1: 0 1 0             (2,4) -> 1*
2: 1 2 0    (1,5) -> (2,4) -> 0
3: 0 1 2    (2,4) -> (1,3) -> 0
4: 1 2 1             (1,4) -> 0
5: 0 2 0    (2,5) -> (1,3) -> 1*
                     (2,4) -> 1*
Merge sequences marked with an asterisk are the best options when this new heuristic is applied.
In case a verbal explanation is necessary:
Instead of checking how many merges are possible after each possible merge for the starting set; this time we check how many merges are possible after each possible merge for each resulting set after each possible merge for the starting set. And this is for lookahead 1. For lookahead n, you'd be seeing a very long sentence repeating the part after each possible merge for each resulting set n times.
Level 3: Let's cut the greed
If you look closely, the previous approach has disastrous performance for even moderate inputs and lookaheads (*). For inputs beyond 20 triplets, anything beyond 4-merge lookahead takes unreasonably long. The idea here is to cut off merge paths that seem to be worse than an existing solution. If we want to perform lookahead 10, and a specific merge path yields fewer mergables after three merges than another path does after 5 merges, we may just as well cut the current merge path and try another one. This should save a lot of time and allow large lookaheads, which would get us closer to the globally optimal solution, hopefully. I haven't implemented this one for testing, though.
(*): Assuming a large reduction of input sets is possible, the number of merges is proportional to the input size, and the lookahead approximately indicates how much you permute those merges. So you have (|input| choose lookahead) options, a binomial coefficient that for lookahead ≪ |input| can be approximated as O(|input|^lookahead) -- which is also (rightfully) read as you are thoroughly screwed.
Putting it all together
I was intrigued enough by this problem that I sat and coded this down in Python. Sadly, I was able to prove that different lookaheads yield possibly different results, and that even the blind algorithm occasionally gets it better than lookahead 1 or 2. This is a direct proof that the solution is not optimal (at least for lookahead ≪ |input|). See the source code and helper scripts, as well as proof-triplets on github. Be warned that, apart from memoization of merge results, I made no attempt at optimizing the code CPU-cycle-wise.
I don't have the solution, but I have some ideas.
Representation
A helpful visual representation of the problem is to consider the triplets as points in 3D space. You have integers, so the records will be nodes of a grid. And two records are mergeable if and only if the nodes representing them lie on a line parallel to one of the axes (that is, they agree in the other two coordinates).
Counter-example
I found a (minimal) example where a greedy algorithm may fail. Consider the following records:
(1, 1, 1) \
(2, 1, 1) | (3, 1, 1) \
(1, 2, 1) |==> (3, 2, 1) |==> (3, 3, 1)
(2, 2, 1) | (2, 2, 2) / (2, 2, 2)
(2, 2, 2) /
But by choosing the wrong way, it might get stuck at three records:
(1, 1, 1) \
(2, 1, 1) | (3, 1, 1)
(1, 2, 1) |==> (1, 2, 1)
(2, 2, 1) | (2, 2, 3)
(2, 2, 2) /
Intuition
I feel that this problem is somehow similar to finding the maximal matching in a graph. Most of those algorithms find the optimal solution by beginning with an arbitrary, suboptimal solution, and making it 'more optimal' in each iteration by searching for augmenting paths, which have the following properties:
they are easy to find (polynomial time in the number of nodes),
an augmenting path and the current solution can be combined into a new solution which is strictly better than the current one,
if no augmenting path is found, the current solution is optimal.
I think that the optimal solution to your problem can be found in a similar spirit.
Based on your problem description:
I'm given a bunch of events in time that's usually got some pattern. The goal is to find the pattern. Each of the bits in the integer represents "the event occurred in this particular year/month/day". For example, the representation of March 7, 2014 would be [1 << (2014-1970), 1 << 3, 1 << 7]. The pattern described above allows us to compress these events so that we can say 'the event occurred every 1st in years 2000-2010'. – Danstahr Mar 7 at 10:56
I'd like to encourage you with the answers that MicSim has pointed at, specifically:
Based on your problem description, you should check out these SO answers (if you didn't do it already): stackoverflow.com/a/4202095/44522 and stackoverflow.com/a/3251229/44522 – MicSim Mar 7 at 15:31
The description of your goal is much clearer than the approach you are using. I'm afraid you won't get anywhere with the idea of merging: the answer you get depends on the order in which you manipulate your data, and you don't want that.
It seems you need to keep data and summarize. So, you might try counting those bits instead of merging them. Try clustering algorithms, sure, but more specifically try regression analysis. I should think you would get great results using a correlation analysis if you create some auxiliary data. For example, if you create data for "Monday", "Tuesday", "first Monday of the month", "first Tuesday of the month", ... "second Monday of the month", ... "even years", "every four years", "leap years", "years without leap days", ... "years ending in 3", ...
What you have right now is "1st day of the month", "2nd day of the month", ... "1st month of the year", "2nd month of the year", ... These don't sound like sophisticated enough descriptions to find the pattern.
If you feel it is necessary to continue the approach you have started, then you might treat it more as a search than a merge. What I mean is that you're going to need a criteria/measure for success. You can do the merge on the original data while requiring strictly that A==A'. Then repeat the merge on the original data while requiring B==B'. Likewise C==C'. Finally compare the results (using the criteria/measure). Do you see where this is going? Your idea of bit counting could be used as a measure.
Another point, you could do better at performance. Instead of double-looping through all your data and matching up pairs, I'd encourage you to do single passes through the data and sort it into bins. The HashMap is your friend. Make sure to implement both hashCode() and equals(). Using a Map you can sort data by a key (say where month and day both match) and then accumulate the years in the value. Oh, man, this could be a lot of coding.
Finally, if the execution time isn't an issue and you don't need performance, then here's something to try. Your algorithm is dependent on the ordering of the data; you get different answers from different sortings. Your criterion for success is the answer with the smallest size after merging. So, repeatedly loop through this algorithm: shuffle the original data, do your merge, and save the result. Every time through the loop, keep the result that is the smallest so far. Whenever you get a result smaller than the previous minimum, print out the number of iterations and the size. This is a very simplistic algorithm, but given enough time it will find small solutions. Based on your data size, it might take too long ...
Kind Regards,
-JohnStosh

Strange Numbers

Here are the properties of "strange numbers" in the problem I'm doing:
1) They have an even number of decimal digits (no leading zeros).
2) Define left half to be the number represented by the most significant half of digits of the original number, and right half to be the one represented by the least significant half. The right half may have leading zeros. The strange number is the square of the sum of its halves: 81 = (8 + 1)^2
Here are some other examples: 998001 = (998 + 001)^2, 3025 = (30 + 25)^2
How can I write a program that lists all the strange numbers in increasing order that have no more than 18 decimal digits?
I understand how to do this by looking at all the possibilities (numbers with 2 digits, 4 digits, 6 digits, ... , 18 digits), but that would take days to run. Are there any patterns to this, so I can output all the strange numbers in a matter of seconds? I would prefer answers in Java, but pseudo code is okay also.
All these 'strange' numbers are perfect squares. So you can start by going through all the numbers and squaring them (until the square has more than 18 digits). And for each square, check to see if it is 'strange'.
Edit
I'll also add that the reason this speeds things up so much is that it changes the solution from O(n) to O(√n): instead of testing every candidate number, you only enumerate its possible square roots.
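A minimal Java sketch of this approach (the class name is mine): enumerate i, square it, and print the squares that are 'strange'. Since i is enumerated in increasing order, the output is already sorted, and long suffices because 18-digit numbers fit in 64 bits:

public class StrangeNumbers {
    public static void main(String[] args) {
        // Powers of ten, to split a 2d-digit square into two d-digit halves.
        long[] pow10 = new long[19];
        pow10[0] = 1;
        for (int i = 1; i <= 18; i++) pow10[i] = pow10[i - 1] * 10;

        // i*i must stay below 10^18, so i < 10^9.
        for (long i = 1; i < 1_000_000_000L; i++) {
            long s = i * i;
            int digits = Long.toString(s).length();
            if (digits % 2 != 0) continue;          // need an even number of digits
            long half = pow10[digits / 2];
            long left = s / half, right = s % half;
            if (left + right == i) {                // then (left + right)^2 == s
                System.out.println(s);
            }
        }
    }
}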
Besides @spatulamania's speed-up, you can use modular arithmetic to speed up the checks further.
To check every perfect square, you'd have to split the number into its two parts, add them, square the sum and compare it with the original number (I'll name this the "full-check").
Instead, you can first check only the last digits of the two parts (and square their sum). For example, for the number 99980001, take the digits 8 and 1, compute (8+1)^2 = 9^2 = 81 and test that its last digit (1 in this case) is the same as the last digit of 99980001 (I'll name this the "small-check"). If yes, then proceed with the full-check.
Since there are only 10x10 = 100 such combinations, this needs to be done only once. You'll create an array of the acceptable combinations, which you can then use:
0 0
0 1
8 1
4 4
8 4
0 5
0 6
8 6
4 9
8 9
Using this, you'll need to do only the "small-check" for about 82% of the perfect squares (those that fail the small-check) and both checks for the remaining 18% (those that pass the small-check, so the "full-check" is needed too). Therefore, if the "small-check" can be done fast enough, you'll gain some speed.
You may find it even faster to expand this table to the last 2 digits of the two parts and use that (when n is large enough).
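For what it's worth, the table above can be generated with a short loop, where a is the last digit of the left half and b is the last digit of the right half (which is also the last digit of the whole number):

public class LastDigitTable {
    public static void main(String[] args) {
        for (int b = 0; b <= 9; b++) {
            for (int a = 0; a <= 9; a++) {
                // (a+b)^2 must end in the same digit as the original number,
                // and that number's last digit is b (the right half's last digit).
                if (((a + b) * (a + b)) % 10 == b) {
                    System.out.println(a + " " + b);
                }
            }
        }
    }
}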
class strange_number
{
    // Returns 1 if n is "strange", 0 otherwise. Uses long so that 18-digit
    // inputs don't overflow, splits with integer arithmetic instead of
    // Math.pow, and rejects numbers with an odd digit count.
    int number(long n)
    {
        String a = Long.toString(n);
        int d = a.length();
        if (d % 2 != 0)
            return 0;
        long half = 1;
        for (int i = 0; i < d / 2; i++)
            half *= 10;
        long sum = n % half + n / half;
        return (sum * sum == n) ? 1 : 0;
    }
}
You can try it this way; it may help.

Problems with prime numbers

I am trying to write a program to find the largest prime factor of a very large number, and have tried several methods with varying success. All of the ones I have found so far have been unbelievably slow. I had a thought, and am wondering if this is a valid approach:
long number = input;
while (notPrime(number))
{
    number = number / getLowestDivisiblePrimeNumber();
}
return number;
This approach would take an input, and would do the following:
200 -> 100 -> 50 -> 25 -> 5 (return)
90 -> 45 -> 15 -> 5 (return)
It divides currentNum repeatedly by its smallest prime divisor (most often 2 or 3) until currentNum itself is prime (i.e. it has no prime divisor less than or equal to the square root of currentNum), and assumes this is the largest prime factor of the original input.
Will this always work? If not, can someone give me a counterexample?
EDIT: By very large, I mean about 2^40, or 10^11.
The method will work, but will be slow. "How big are your numbers?" determines the method to use:
Less than 2^16 or so: Lookup table.
Less than 2^70 or so: Sieve of Atkin. This is an optimized version of the more well known Sieve of Eratosthenes. Edit: Richard Brent's modification of Pollard's rho algorithm may be better in this case.
Less than 10^50: Lenstra elliptic curve factorization
Less than 10^100: Quadratic Sieve
More than 10^100: General Number Field Sieve
This will always work because of the Unique Prime Factorization Theorem.
Certainly it will work (see Mark Byers' answer), but for "very large" inputs it may take far too long. You should note that your call to getLowestDivisiblePrimeNumber() conceals another loop, so this runs at O(N^2), and that depending on what you mean by "very large" it may have to work on BigNums which will be slow.
You could speed it up a little, by noting that your algorithm need never check factors smaller than the last one found.
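As a sketch of that speed-up (names are mine): strip each factor out completely before moving on, never revisit smaller factors, and stop at the square root; whatever is left over at the end is itself prime:

public class LargestPrimeFactor {
    static long largestPrimeFactor(long n) {
        long largest = 1;
        for (long f = 2; f * f <= n; f++) {
            while (n % f == 0) {   // strip this factor completely
                largest = f;
                n /= f;
            }
        }
        // Whatever remains above 1 is prime, and larger than any factor found so far.
        return (n > 1) ? n : largest;
    }

    public static void main(String[] args) {
        System.out.println(largestPrimeFactor(600851475143L)); // prints 6857
    }
}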
You are trying to find the prime factors of a number. What you are proposing will work, but will still be slow for large numbers.... you should be thankful for this, since most modern security is predicated on this being a difficult problem.
From a quick search I just did, the fastest known way to factor a number is by using the Elliptic Curve Method.
You could try throwing your number at this demo: http://www.alpertron.com.ar/ECM.HTM .
If that convinces you, you could try either stealing the code (that's no fun, they provide a link to it!) or reading up on the theory of it elsewhere. There's a Wikipedia article about it here: http://en.wikipedia.org/wiki/Lenstra_elliptic_curve_factorization but I'm too stupid to understand it. Thankfully, it's your problem, not mine! :)
The thing with Project Euler is that there is usually an obvious brute-force method to do the problem, which will take just about forever. As the questions become more difficult, you will need to implement clever solutions.
One way you can solve this problem is to use a loop that always finds the smallest (positive integer) factor of a number. When the smallest factor of a number is that number, then you've found the greatest prime factor!
Detailed Algorithm description:
You can do this by keeping three variables:
The number you are trying to factor (A)
A current divisor store (B)
A largest divisor store (C)
Initially, let (A) be the number you are interested in - in this case, it is 600851475143. Then let (B) be 2. Have a conditional that checks whether (A) is divisible by (B). If it is divisible, divide (A) by (B), record (B) in (C) if it is the largest divisor seen so far, reset (B) to 2, and go back to checking whether (A) is divisible by (B). Else, if (A) is not divisible by (B), increment (B) by +1 and then check again whether (A) is divisible by (B). Run the loop until (A) is 1. The (C) you return will be the largest prime divisor of 600851475143.
There are numerous ways you could make this more effective - instead of incrementing to the next integer, you could increment to the next necessarily prime integer, and instead of keeping a largest divisor store, you could just return the current number when its only divisor is itself. However, the algorithm I described above will run in seconds regardless.
The implementation in Python (version 2) is as follows:
def lpf(x):
    lpf = 2
    while x > lpf:
        if x % lpf == 0:
            x = x / lpf
            lpf = 2
        else:
            lpf += 1
    print("Largest Prime Factor: %d" % lpf)

def main():
    x = long(raw_input("Input long int:"))
    lpf(x)
    return 0

if __name__ == '__main__':
    main()
Example: Let's find the largest prime factor of 105 using the method described above.
Let (A) = 105. (B) = 2 (we always start with 2), and we don't have a value for (C) yet.
Is (A) divisible by (B)? No. Increment (B) by +1: (B) = 3. Is (A) divisible by (B)? Yes. (105/3 = 35). The largest divisor found so far is 3. Let (C) = 3. Update (A) = 35. Reset (B) = 2.
Now, is (A) divisible by (B)? No. Increment (B) by +1: (B) = 3. Is (A) divisible by (B)? No. Increment (B) by +1: (B) = 4. Is (A) divisible by (B)? No. Increment (B) by +1: (B) = 5. Is (A) divisible by (B)? Yes. (35/5 = 7). The largest divisor we found previously is stored in (C). (C) is currently 3. 5 is larger than 3, so we update (C) = 5. We update (A)=7. We reset (B)=2.
Then we repeat the process for (A), but we will just keep incrementing (B) until (B)=(A), because 7 is prime and has no divisors other than itself and 1. (We could already stop when (B)>((A)/2), as you cannot have integer divisors greater than half of a number - the smallest possible divisor (other than 1) of any number is 2!)
So at that point we return (A) = 7.
Try doing a few of these by hand, and you'll get the hang of the idea.
