Improved solution of problem 4 from Project Euler - Java

I have come across many solutions on Stack Overflow for problem 4 from Project Euler. My question is not about asking for yet another solution; instead, it is about an improved solution posted by ROMANIA_engineer.
I have two questions about that improved solution, which I have reposted below for reference.
public static void main(String[] args) {
    int max = -1;
    for (int i = 999; i >= 100; i--) {
        if (max >= i * 999) {
            break;
        }
        for (int j = 999; j >= i; j--) {
            int p = i * j;
            if (max < p && isPalindrome(p)) {
                max = p;
            }
        }
    }
    System.out.println(max > -1 ? max : "No palindrome found");
}
Questions
Why is there a condition (max >= i * 999)? What do we gain by adding this (admittedly inexpensive) check?
From the below snippet,
for (int j = 999; j >= i; j--) {
    int p = i * j;
    if (max < p && isPalindrome(p)) {
        max = p;
    }
}
Instead of j >= 100, the loop uses j >= i. I can see that a lot of time is saved, but I wanted to understand the intention behind it.

To answer question 1: the check (max >= i * 999) exists because you may already have found a palindrome product that is at least as large as anything the remaining iterations can produce. Since the outer loop decrements i from 999, the largest product still reachable in iteration i is i * 999. Suppose, hypothetically, that 926 * 998 turned out to be a palindrome and became the new max when i was 926 (this is just a hypothetical; I have no idea whether that product is actually a palindrome). On the next outer iteration i is 925, and 925 * 999 is less than 926 * 998, so no product from that point on can beat the current max; the search can stop immediately. That is exactly what the break accomplishes.
To answer question 2: the condition j >= i avoids recomputing products. Suppose the condition were j >= 100 instead. When i is 999, the inner loop would examine 999 * 999, 999 * 998, and so on all the way down to 999 * 100. Later, if the outer loop ever reached i = 100, it would compute 100 * 999 again, repeating a product that was already considered (the loop may of course break before it ever gets that far). Restricting j to j >= i means each unordered pair (i, j) is considered only once.
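For completeness: the posted solution calls an isPalindrome helper that the question does not show. A minimal sketch of such a helper (my own, based on reversing the decimal digits; not part of the original post) could look like this:

// Hypothetical helper, not from the original post: reverse the decimal
// digits of value and compare the result with the original number.
private static boolean isPalindrome(int value) {
    int reversed = 0;
    for (int remaining = value; remaining > 0; remaining /= 10) {
        reversed = reversed * 10 + remaining % 10;
    }
    return reversed == value;
}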

Related

Copying books using dynamic programming

I am implementing the dynamic programming solution for the copying books problem. The idea for the solution is taken from here and here.
Problem statement:
Before the invention of book-printing, it was very hard to make a copy
of a book. All the contents had to be re-written by hand by so called
scribers. The scriber had been given a book and after several months
he finished its copy. One of the most famous scribers lived in the
15th century and his name was Xaverius Endricus Remius Ontius
Xendrianus (Xerox). Anyway, the work was very annoying and boring. And
the only way to speed it up was to hire more scribers.
Once upon a time, there was a theater ensemble that wanted to play
famous Antique Tragedies. The scripts of these plays were divided into
many books and actors needed more copies of them, of course. So they
hired many scribers to make copies of these books. Imagine you have m
books (numbered 1, 2, ...., m) that may have different number of pages
( p_1, p_2, ..., p_m) and you want to make one copy of each of them.
Your task is to divide these books among k scribes, k <= m. Each book
can be assigned to a single scriber only, and every scriber must get a
continuous sequence of books. That means, there exists an increasing
succession of numbers 0 = b_0 < b_1 < b_2 < ... < b_{k-1} <= b_k = m
such that the i-th scriber gets the sequence of books with numbers between
b_{i-1}+1 and b_i. The time needed to make a copy of all the books is
determined by the scriber who was assigned the most work. Therefore,
our goal is to minimize the maximum number of pages assigned to a
single scriber. Your task is to find the optimal assignment.
I am able to compute the optimal value for the problem described iteratively, but I am unable to use that result to produce the required output for the problem, that is:
Sample input:
2
9 3
100 200 300 400 500 600 700 800 900
5 4
100 100 100 100 100
Sample Output
100 200 300 400 500 / 600 700 / 800 900
100 / 100 / 100 / 100 100
Where 2 is the number of datasets, 9 is the number of books and 3 is the number of scribes to assign the books to.
Here is my output, for the respective inputs:
100 100 100
300 300 300
600 600 600
1000 700 700
1500 900 900
2100 1100 1100
2800 1300 1300
3600 1500 1500
4500 1700 1700
100 100 100 100
200 200 200 200
300 300 300 300
400 300 300 300
500 300 300 300
For the first data set, I can use 1700 as the optimal number of pages per scriber and keep assigning book pages until the current scriber's page sum reaches 1700. However, the second solution does not seem to follow any such pattern.
Here is my code to generate the solution:
private void processScribes() {
    int[][] bookScribe = new int[numOfBooks][numOfScribes];
    // set first row to b1 page number
    for (int j = 0; j < numOfScribes; ++j)
        bookScribe[0][j] = bookPages[0];
    // set first column to sum of book page numbers
    for (int row = 1; row < numOfBooks; ++row)
        bookScribe[row][0] = bookScribe[row - 1][0] + bookPages[row];
    // calculate the kth scribe using dp
    for (int i = 1; i < numOfBooks; ++i) {
        for (int j = 1; j < numOfScribes; ++j) {
            // calculate minimum of maximum page numbers
            // from k = l + 1 to i
            // calculate sum
            int minValue = 1000000;
            for (int k = 0; k < i - 1; ++k) {
                int prevValue = bookScribe[i - k][j - 1];
                int max = 0;
                int sumOflVals = 0;
                for (int l = k + 1; l <= i; ++l) {
                    sumOflVals = sumOflVals + bookPages[l];
                }
                if (prevValue > sumOflVals) {
                    max = prevValue;
                } else {
                    max = sumOflVals;
                }
                if (max < minValue)
                    minValue = max;
            }
            if (minValue == 1000000)
                minValue = bookScribe[i][0];
            // store minvalue at [i][j]
            bookScribe[i][j] = minValue;
        }
    }
    // print bookScribes
    for (int i = 0; i < numOfBooks; ++i) {
        for (int j = 0; j < numOfScribes; ++j)
            System.out.print(bookScribe[i][j] + " ");
        System.out.println();
    }
    System.out.println();
}
Any pointers here? Is it the interpretation of solution or something is wrong with how I am translating the recurrence in my code?
Not sure about your solution, but here is an intuitive recursive approach with memoization. Let there be n books, with the ith book having pages[i] pages, and let there be m scribers. Let dp[i][j] be the answer to the problem if we were given only books i, i+1, ..., n and only j scribers to do the job. Following is recursive pseudocode with memoization:
// dp[][] is memset to -1 from main
// Assuming books are numbered 1 to n
// change value of MAX based on your constraints
int MAX = 1000000000;

int rec(int position, int sub)
{
    // These two are the base cases
    if (position > n)
    {
        if (sub == 0) return 0;
        return MAX;
    }
    if (sub == 0)
    {
        if (position > n) return 0;
        return MAX;
    }
    // If answer is already computed for this state return it
    if (dp[position][sub] != -1) return dp[position][sub];
    int ans = MAX, i, sum = 0;
    for (i = position; i <= n; i++)
    {
        sum += pages[i];
        // taking the best of all possible solutions
        ans = min(ans, max(sum, rec(i + 1, sub - 1)));
    }
    dp[position][sub] = ans;
    return ans;
}
//from main call rec(1,m) which is your answer
You can convert it to an iterative solution by dynamic programming; it will have the same complexity in time and space. Space is O(n·m) and time is O(n²·m).
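As an illustration of that conversion, here is a bottom-up sketch in Java (my own translation of the recursion above, with hypothetical names; not the answerer's code):

// A bottom-up version of rec(position, sub). pages is 1-indexed (pages[1..n]),
// m is the number of scribers, and dp[p][s] is the best answer for books p..n
// copied by s scribers.
static int minMaxPages(int[] pages, int n, int m) {
    final int MAX = 1000000000;
    int[][] dp = new int[n + 2][m + 1];
    for (int p = 1; p <= n + 1; p++) {
        for (int s = 0; s <= m; s++) {
            dp[p][s] = MAX;
        }
    }
    dp[n + 1][0] = 0; // no books left and no scribers left: cost 0
    for (int p = n; p >= 1; p--) {
        for (int s = 1; s <= m; s++) {
            int sum = 0;
            for (int i = p; i <= n; i++) { // one scriber copies books p..i
                sum += pages[i];
                int candidate = Math.max(sum, dp[i + 1][s - 1]);
                if (candidate < dp[p][s]) {
                    dp[p][s] = candidate;
                }
            }
        }
    }
    return dp[1][m];
}

For the first sample data set (9 books, 3 scribers) this should return 1700, which matches the sample partition 100..500 / 600 700 / 800 900.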
EDIT
Here, have a look at a running version of the code on your test cases: Book Copying Code. It not only finds the optimal answer but also prints the optimal assignment with it (which I have not included in the pseudocode above). (Click the fork button in the top right corner and it will run on your test cases; the input format is the same as yours.) The output is the optimal answer followed by the optimal assignment. Do comment if you have doubts regarding the code.

How to determine how often a statement in a nested loop is executed?

I am working through a section of a text on determining complexity of nested loops using recurrence relations. In this particular example I am trying to determine how many times the count variable will be incremented as a function of n.
This is the loop I am analyzing:
for (int i = 1; i <= n; i++) {
    int j = n;
    while (j > 0) {
        count++;
        j = j / 2;
    }
}
I think I understand that the outer loop contributes a simple factor of n, since it runs once for each value of i from 1 to n, but it's the rest of it that I'm having trouble with. I think the answer would be something like n(n/2), except that this example uses integer division, so I'm not sure how to represent that mathematically.
I've run through the loop by hand a few times on paper so I know that the count variable should equal 1, 4, 6, 12, 15, and 18 for n values of 1-6. I just can't seem to come up with the formula... Any help would be greatly appreciated!
The outer loop runs for i in the range [1, n]. The inner loop starts j at n and divides it by 2 each time, so it executes floor(log2(n)) + 1 times, where log2 is the binary log function. That count is the same for every value of i, so the total is n * (floor(log2(n)) + 1).
The inner j loop adds the bit length of n (the position of its highest set bit) to count.
Each divide by 2 is the same as a right shift, repeated until all the bits are zero.
So 2, which is 10 in binary, contributes 2 to count per outer iteration,
and 4, which is 100 in binary, contributes 3.
The outer loop then simply multiplies that bit length by the number n itself.
Here is an example with n = 13.
13 in binary is 1101, so the highest set bit is at position 4.
4 * 13 = 52, and 52 is the final answer.
for (int i = 1; i <= n; i++) {
This loop at the top goes through the loop n times.
int j = n;
while (j > 0) {
    count++;
    j = j / 2;
}
This inner loop executes floor(log2(n)) + 1 times, noting that it is a base-2 log since you are dividing by 2 each time.
Hence, the total number of counts is n * (floor(log2(n)) + 1).
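To make the closed form concrete, here is a small verification sketch (my own, not from either answer) that compares the brute-force count with n * (floor(log2(n)) + 1) for the first few values of n:

// Compares the actual loop count with n * (floor(log2 n) + 1) for small n.
public class CountCheck {
    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            int count = 0;
            for (int i = 1; i <= n; i++) {
                int j = n;
                while (j > 0) {
                    count++;
                    j = j / 2;
                }
            }
            // 31 - numberOfLeadingZeros(n) is floor(log2 n) for n >= 1
            int innerIterations = 31 - Integer.numberOfLeadingZeros(n) + 1;
            System.out.println(n + ": loop = " + count + ", formula = " + (n * innerIterations));
        }
    }
}

For n = 1 through 6 both columns print 1, 4, 6, 12, 15, 18, the same values the question reports.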

Determining an arrays value depending on another array

I'm currently working on one of my assignments and am looking for some help with the logic for one of my functions.
First off, I have an array of numbers to be categorized (array1) and an interval size; this interval determines which position of array2 each number from array1 is counted in.
i.e.
int interval = 2;
for (int i = 0; i < array1.length; i++) {
    if ((array1[i] > 0) && (array1[i] < interval)) {
        array2[0]++;
    }
}
However, if the number from array1 is 3, I would then need another if statement like so:
...
} else if ((array1[i] > 2) && (array1[i] < interval * 2)) {
    array2[1]++;
} else if ((array1[i] >
As you can start to see, the problem with this is that I would need to keep going for an infinite range of numbers. So my question is: what is an easier way of achieving this goal? Or is there already a library I can use to do so?
I'm sorry if I didn't make this clear enough; also, I would prefer if code wasn't given to me. I would appreciate it if someone could tell me a more effective way of going about this. Thanks in advance!
EDIT:
Assuming that the interval is set to 2 and the numbers from array1 are between 0 and 10, I would need code that does the following:
0 < numFromArray1 < 2  --> array2[0]++
2 < numFromArray1 < 4  --> array2[1]++
4 < numFromArray1 < 6  --> array2[2]++
6 < numFromArray1 < 8  --> array2[3]++
8 < numFromArray1 < 10 --> array2[4]++
However, the numbers from array1 can be positive or negative, whole or decimal.
Use a nested loop. Obviously there are not infinitely many possibilities for the interval, because array2 has a fixed size. So loop through all the cells in array2, and then do some math to figure out what your conditions need to be. I won't give complete code (you asked me not to), but it would look something like:
for ( ... ) {
    for ( ... ) {
        if (array1[i] > /* do some math here */ && ... ) {
            array2[/* figure out what this should be too */]++;
        }
    }
}
Hopefully you can figure it out from this.
By the way, if you aren't required to use an array for array2, consider learning about LinkedList<?> for a data structure that can grow in size as you need it to.
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/LinkedList.html
http://www.dreamincode.net/forums/topic/143089-linked-list-tutorial/
Assuming I understood the question correctly, and the interval were 3, then occurrences of 0, 1 and 2 would increase array2[0], occurrences of 3, 4 and 5 would increase array2[1], and so on. This would be a solution:
EDIT: sorry, you did not want to see code. I can repost it if you want. Think about a really easy way to determine which category a number will be in. I'll try to give a hint.
Interval = 3;
0,1,2 -> category 0
3,4,5 -> category 1
6,7,8 -> category 2
Once you know the category, it is easy to increment the desired number in array2.
It would look something like that:
for (int i = 0; i < array1.length; i++) {
    int category = // determine category here
    // increase correct position of array2
}
After some discussion, here is my code:
for (int i = 0; i < array1.length; i++) {
    int category = array1[i] / interval;
    array2[category]++;
}
My solution won't work for negative numbers; it is also not specified how they should be handled.
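If negative or decimal values really do need to be handled (the answer above does not cover them), one possible approach, sketched by me under the assumption that array1 is a double[] and array2 has been sized with enough bins, is to floor-divide and shift everything so that the smallest value lands in bin 0:

// Sketch only: bins of width `interval`, shifted so the minimum value maps
// to bin 0. Works for negative and fractional values alike.
double interval = 2.0;
double min = array1[0];
for (double v : array1) {
    if (v < min) {
        min = v;
    }
}
int minBin = (int) Math.floor(min / interval);
for (double v : array1) {
    int bin = (int) Math.floor(v / interval) - minBin; // always >= 0
    array2[bin]++;
}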
Here's what you can do to consider all cases:
First find out the maximum value in your array array1.
Your range should be 0 to maxValueInArray1.
Then, inside your outer for loop, you can have another loop that runs from 0 to (maximum value) / 2, because with an interval of 2 you do not need to check beyond that point.
Then, for each value, check which range it falls into; if it is in range j, use array2[j].
For example:
for (...) { // Outer loop
    for (int j = 0; j <= maximumValueinArray1 / 2; j++) {
        // Make a range for each `j`
        // use the `array2[j]` to put the value in the appropriate range.
    }
}
In your inner loop, you might check for this condition, based on the following reasoning:
For interval = 2, and say maximumValueinArray1 is max, your ranges look like:
0 * interval ----- (1 * interval) --> in `array2[0]` (0 to 2)
1 * interval ----- (2 * interval) --> in `array2[1]` (2 to 4)
2 * interval ----- (3 * interval) --> in `array2[2]` (4 to 6)
and so on.
((max / 2) - 1) * interval ----- (max / 2) * interval (`max - 2` to max)
So, try relating these conditions, with the inner loop I posted, and your problem will be solved.
I'm not sure what exactly you're trying to do, but from your code snippets, I can come up with this inner for loop:
//OUTDATED CODE - please see code block in EDIT below
//for(int i = 0; i < array1.length; i++) {
// for (int j = 0; j < 100000; j++) { //or Integer.MAXVALUE or whatever
// if ((array1[i] > (j*2)) && (array1[i] < interval * ((j*2)==0?2:(j*2)) )) {
// array2[j]++;
// }
// }
//}
EDIT: Owing to your recent edit, this is more suitable, and you don't have to run an inner loop at all:
Loop through array1.
For each element of array1, find the array2 index by taking the floor of element / interval.
Add 1 to the array2 element at the found index.
DON'T LOOK AT THE CODE BELOW =)
for (int i = 0; i < array1.length; i++) {
    int index = (int) Math.floor(array1[i] / interval);
    array2[index]++;
    // the rest is actually not necessary, as you just need to get the index
    // and the element will be within range, left inclusive (lower <= value < upper)
    //int lower_range = (int) Math.floor(array1[i] / interval) * interval;
    //    // or: int lower_range = index * interval;
    //int upper_range = (int) Math.ceil(array1[i] / interval) * interval;
    //if ((array1[i] > lower_range) && (array1[i] < upper_range)) {
    //    array2[index]++;
    //}
}
The relationships and pattern are hard to figure out. My attempt at interpreting what you want:
How about something like:
if (array1[i] < interval * (interval - 2)) {
    array2[interval - 2]++;
}

Is it normal for quicksort to take 5 hours for a 100,000,000 element array?

Implementing the basic algorithm with the last element as the pivot in Java, is it normal for it to take 5 hours to sort a 100,000,000-element array of random numbers?
My system Specs:
Mac OS X Lion 10.7.2 (2011)
Intel Core i5 2.3 GHz
8GB ram
Update 2: I think I am doing something wrong in my other methods, since Narendra was able to run the quicksort. Here is the full code I am trying to run.
import java.util.Random;

public class QuickSort {
    public static int comparisons = 0;

    public static void main(String[] args) {
        int size = 100000000;
        int[] smallSampleArray = createArrayOfSize(size);
        System.out.println("Starting QS1...");
        long startTime = System.currentTimeMillis();
        quickSort(smallSampleArray, 0, size - 1);
        System.out.println("Finished QS1 in " + (System.currentTimeMillis() - startTime) + " seconds");
        System.out.println("Number of comparisons for QS1: " + comparisons);
    }

    public static int[] createArrayOfSize(int arraySize) {
        int[] anArray = new int[arraySize];
        Random random = new Random();
        for (int x = 0; x < anArray.length; x++) {
            anArray[x] = random.nextInt(1000) + 1;
        }
        return anArray;
    }

    public static void quickSort(int anArray[], int position, int pivot) {
        if (position < pivot) {
            int q = partition(anArray, position, pivot);
            quickSort(anArray, position, q - 1);
            quickSort(anArray, q + 1, pivot);
        }
    }

    public static int partition(int anArray[], int position, int pivot) {
        int x = anArray[pivot];
        int i = position - 1;
        for (int j = position; j < (pivot - 1); j++) {
            comparisons++;
            if (anArray[j] <= x) {
                i = i + 1;
                int temp = anArray[i];
                anArray[i] = anArray[j];
                anArray[j] = temp;
            }
        }
        int temp = anArray[i + 1];
        anArray[i + 1] = anArray[pivot];
        anArray[pivot] = temp;
        return i + 1;
    }
}
I've moved the old, now irrelevant answer to the end.
Edit x2
Aha! I think I've found the cause of your horrible performance. You told us you were using randomized data. That is true. But what you didn't tell us is that you were using such a small range of possible random values.
For me, your code is very performant if you change this line:
anArray[x] = random.nextInt(1000) + 1;
to this:
anArray[x] = random.nextInt();
That goes against expectations, right? It should be cheaper to sort a smaller range of values, since there should be fewer swaps to do, right? So why does this happen? It happens because you have so many elements with the same value (on average, 100 thousand of each). So why does this lead to such horrible performance? Well, say at each point you chose a perfect pivot value: exactly halfway. Here's what it would look like:
1000 - Pivot: 500
- 500+ - Pivot: 750
- 750+ - Pivot: 875
- 750- - Pivot: 625
- 500- - Pivot: 250
And so on. However (and here's the critical part) you would eventually get to a partition operation where every single value is equal to the partition value. In other words, there will be a big (100 thousand element) block of numbers with the same value that you will try to recursively sort. And how will that go? It will recurse 100 thousand times, only removing the single pivot value at each level. In other words, it will partition everything to the left or everything to the right.
Expanding on the breakdown above, it would look kind of like this (I've used 8--a power of 2--for simplicity, and forgive the bad graphical representation)
Depth  Min  Max  Pvt  NumElements
0      0    7    4    100 000 000
1      0    3    2     50 000 000
2      0    1    1     25 000 000
3      0    0    0     12 500 000  < at this point, you're
4      0    0    0     12 499 999  < no longer dividing and
5      0    0    0     12 499 998  < conquering effectively.
3      1    1    1     12 500 000
4      1    1    1     12 499 999
5      1    1    1     12 499 998
2      2    3    3     25 000 000
3      ...
3      ...
1      4    7    6     50 000 000
2      4    5    5     25 000 000
3      ...
3      ...
2      6    7    7     25 000 000
3      ...
3      ...
If you want to counter this, you need to optimize your code to reduce the effects of this. More on that to come (I hope)...
...and continued. An easy way to solve your problem is to check if the array is already sorted at each step.
public static void quickSort(int anArray[], int position, int pivot) {
    if (isSorted(anArray, position, pivot + 1)) {
        return;
    }
    //...
}

private static boolean isSorted(int[] a, int start, int end) {
    for (int i = start + 1; i < end; i++) {
        if (a[i] < a[i - 1]) {
            return false;
        }
    }
    return true;
}
Add that and you won't recurse unnecessarily and you should be golden. In fact, you get better performance than you do with values randomized over all 32 bits of the integer.
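A different remedy that is commonly used for inputs with many duplicate keys is three-way ("Dutch national flag") partitioning, which keeps every element equal to the pivot in the middle and never recurses on that block. This is not part of the answer above, just a sketch of the alternative:

// Three-way quicksort sketch: elements equal to the pivot end up between
// lt and gt and are excluded from both recursive calls.
public static void quickSort3Way(int[] a, int lo, int hi) {
    if (lo >= hi) {
        return;
    }
    int pivot = a[lo];
    int lt = lo, gt = hi, i = lo + 1;
    while (i <= gt) {
        if (a[i] < pivot) {
            swap(a, lt++, i++);
        } else if (a[i] > pivot) {
            swap(a, i, gt--);
        } else {
            i++; // equal keys stay in place
        }
    }
    quickSort3Way(a, lo, lt - 1);
    quickSort3Way(a, gt + 1, hi);
}

private static void swap(int[] a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
}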
Old answer (for posterity only)
Your partitioning logic looks really suspect to me. Let's extract and ignore the swap logic. Here's what you have:
int i = position - 1;
for (int j = position; j < pivot; j++) {
    if (anArray[j] <= x) {
        i = i + 1;
        swap(anArray, i, j);
    }
}
I fail to see how this would work at all. For example, if the very first value were less than the pivot value, it would be swapped with itself?
I think you want something like this (just a rough sketch):
for (int i = 0, j = pivot - 1; i < j; i++) {
    if (anArray[i] > pivotValue) {
        // i now represents the earliest index that is greater than the pivotValue,
        // so find the latest index that is less than the pivotValue
        while (anArray[j] > pivotValue) {
            // if j reaches i then that means that *all*
            // indexes before i/j are less than pivot and all after are greater
            // and so we should break out here
            j--;
        }
        swap(anArray, i, j);
    }
}
// swap pivot into correct position
swap(anArray, pivot, j + 1);
Edit
I think I understand the original partitioning logic now (I had confused the if-block to be looking at elements greater than the pivot). I'll leave my answer up on the off chance that it delivers better performance but I doubt it would make a significant difference.
Being a C# guy, I just pasted the above code into an empty C# project.
It took 35 seconds to complete for an array of 100,000,000 integers.
There seems to be nothing wrong with the code, so there must be something else in your environment. Is the Java process allowed to allocate ~800 MB of RAM?
What happens if you lower the array size to 10,000,000? Do you get close to ~3 seconds then?
Is there a certain array size where the sort suddenly gets slow?
Edit
I'm almost certain that you don't have a random array; you have probably made a mistake in your random initialization.
If you create a new Random object for each element, you will typically get the same value for every element, since each initialization of Random seeds the generator with the current time in milliseconds. If the whole array gets initialized within the same millisecond, all elements get the same value.
In c# I initialize like this
Random r = new Random();
var intArr = (from i in Enumerable.Range(0, 10000)
              select r.Next()).ToArray();
var sw = System.Diagnostics.Stopwatch.StartNew();
quickSort(intArr, 0, intArr.Length - 1);
sw.Stop();
This takes 2 milliseconds to sort.
If I reinitialize my Random object for each element,
var intArr = (from i in Enumerable.Range(0, 10000)
              select (new Random()).Next()).ToArray();
it takes 300 milliseconds to sort, because all the elements in the array get the same value.
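For reference, here is the Java shape of the same pitfall (my illustration, not the answerer's code; whether a fresh Random per element actually repeats values depends on how the particular JDK seeds its generator, so treat this as a pattern to check for rather than a guaranteed reproduction):

import java.util.Random;

public class RandomInitDemo {
    public static void main(String[] args) {
        // Recommended: one shared generator, reused for every element.
        Random shared = new Random();
        int[] good = new int[10000];
        for (int x = 0; x < good.length; x++) {
            good[x] = shared.nextInt();
        }

        // Suspect pattern: a fresh generator per element. On runtimes that
        // seed purely from the clock this can produce long runs of identical
        // values, which is the effect described above for C#.
        int[] suspect = new int[10000];
        for (int x = 0; x < suspect.length; x++) {
            suspect[x] = new Random().nextInt();
        }
    }
}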

Sum of numbers under 10,000 that are multiples of 3, 5 or 7 in Java

I know how to get the program to add up the sums of the multiples for each of 3, 5 and 7, but I'm not sure how I'd get the program to only use each number once. For example, I can get the program to find all of the multiples of 3 and add them up, and then do the same for 5, but then a number like 15 would be counted in the final total twice. I'm not sure exactly how I'd get it to count each number only once. Thanks for any help.
While the generate-and-test approach is simple to understand, it is also not very efficient if you want to run this on larger numbers. Instead, we can use the inclusion-exclusion principle.
The idea is to first sum up too many numbers by looking at the multiples of 3, 5 and 7 separately. Then we subtract the ones we counted twice, i.e. multiples of 3*5, 3*7 and 5*7. But now we subtracted too much and need to add back the multiples of 3*5*7 again.
We start by finding the sum of all integers 1..n which are multiples of k. First, we find out how many there are, m = n / k, rounded down thanks to integer division. Now we just need to sum up the sequence k + 2*k + 3*k + ... + m*k. We factor out the k and get k * (1 + 2 + ... + m).
This is a well-known arithmetic series, which we know sums to k * m * (m + 1)/2 (See triangle number).
private long n = 9999;

private long multiples(long k) {
    long m = n / k;
    return k * m * (m + 1) / 2;
}
Now we just use inclusion-exclusion to get the final sum:
long sum = multiples(3) + multiples(5) + multiples(7)
         - multiples(3*5) - multiples(3*7) - multiples(5*7)
         + multiples(3*5*7);
This will scale much better to larger n than just looping over all the values, but beware of overflow and change to BigIntegers if necessary.
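Putting the two snippets together, a self-contained sketch (class and method names are mine) might look like this:

// Inclusion-exclusion sum of all multiples of 3, 5 or 7 below 10,000.
public class MultiplesSum {
    static final long N = 9999;

    static long multiples(long k) {
        long m = N / k;                 // how many multiples of k are <= N
        return k * m * (m + 1) / 2;     // k * (1 + 2 + ... + m)
    }

    public static void main(String[] args) {
        long sum = multiples(3) + multiples(5) + multiples(7)
                 - multiples(3 * 5) - multiples(3 * 7) - multiples(5 * 7)
                 + multiples(3 * 5 * 7);
        System.out.println(sum);
    }
}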
The easiest approach would be to use a for loop thus:
int sum = 0;
for (int i = 1; i < 10000; i++) {
    if (i % 3 == 0 || i % 5 == 0 || i % 7 == 0)
        sum += i;
}
Use a Set to store the unique multiples, and then sum the values of the Set.
I would use a Set. This way you are guaranteed that you won't get any duplicates if they are your main problem.
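As a sketch of the Set idea (mine, not the answerer's code): collect the multiples of each factor separately and let the Set drop values such as 15, 21 or 35 that appear under more than one factor:

import java.util.HashSet;
import java.util.Set;

public class UniqueMultiples {
    public static void main(String[] args) {
        // Each number is stored once, even if it is a multiple of several factors.
        Set<Integer> unique = new HashSet<>();
        for (int k : new int[] {3, 5, 7}) {
            for (int m = k; m < 10000; m += k) {
                unique.add(m);
            }
        }
        long total = 0;
        for (int v : unique) {
            total += v;
        }
        System.out.println(total);
    }
}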
One simple solution would be to add each number that's a multiple of 3, 5, or 7 to an answer list, and then, as you work through each number, make sure it's not already in the answer list.
(pseudo-code)
List<int> AnswerList;
List<int> MultiplesOfFive;
List<int> MultiplesOfSeven;
List<int> MultiplesOfThree;

for (int i = 0; i < 10000; i++)
{
    if (i % 3 == 0 && AnswerList.Contains(i) == false)
    {
        MultiplesOfThree.Add(i);
        AnswerList.Add(i);
    }
    if (i % 5 == 0 && AnswerList.Contains(i) == false)
    {
        MultiplesOfFive.Add(i);
        AnswerList.Add(i);
    }
    if (i % 7 == 0 && AnswerList.Contains(i) == false)
    {
        MultiplesOfSeven.Add(i);
        AnswerList.Add(i);
    }
}
For the solution that loops from 1 to 10,000: if 10,000 itself is supposed to be included (it is a multiple of 5), use i <= 10000, otherwise the loop will skip it. Apologies, for some reason I can't post this as a comment.
