Why adding a condition in a for loop significantly improves the speed - java

I'm writing Shellsort in Java and found that adding a condition to a for loop significantly improves the speed. Can someone explain why?
This is the fast version, which sorts 10K Doubles in 80 ms.
public static void sort(Comparable[] a) {
    if (a.length <= 1) {
        return;
    }
    // Using 3K+1 starting from < N/3 as in the book
    int magic = 1;
    while (magic < a.length / 3) {
        magic = 3 * magic + 1;
    }
    while (magic >= 1) {
        for (int i = magic; i < a.length; i += 1) {
            if (less(a[i - magic], a[i])) {
                // Already in good order
                continue;
            }
            for (int j = i; j >= magic && less(a[j], a[j - magic]); j -= magic) {
                // if (less(a[j], a[j - magic])) {
                exch(a, j, j - magic);
                // }
            }
            /*
            for (int j = 0; j < i; j += 1) {
                if (less(a[i], a[j])) {
                    // j is the right place
                    // Use a series of exchanges to avoid creating new arrays
                    for (int k = i; k > j; k -= 1) {
                        exch(a, k - 1, k);
                    }
                    break;
                }
            }
            */
        }
        magic /= 3;
    }
}
The slow version (I'll just put the inner for loop here), which takes around 43,000 ms:
for (int j = i; j >= magic; j -= magic) {
    if (less(a[j], a[j - magic])) {
        exch(a, j, j - magic);
    }
}
Please note that the less function simply checks whether a[j] is smaller than a[j - magic].
From what I understand, the fast code still evaluates the less condition on every iteration, and when it isn't satisfied we don't enter the loop body. In the slow version, we enter every iteration, and when less isn't satisfied we simply skip the exchange. What I don't understand is why the fast code is SO much faster. And is it the same for C++? (I could test the C++ part myself.)

Let's consider the slow version of the loop:
for (int j = i; j >= magic; j -= magic) {
    if (less(a[j], a[j - magic])) {
        exch(a, j, j - magic);
    }
}
In this loop, we call less for each value of j, and for the values for which less returns true, we also call exch. This continues as long as j >= magic.
Now, let's look at the faster version:
for (int j = i; j >= magic && less(a[j], a[j - magic]); j -= magic) {
    exch(a, j, j - magic);
}
In this version, we also call less for each value of j, but at the first value of j for which it returns false, execution exits the loop. So when the loop exits, it is not necessarily the case that j < magic. As a result, many calls to less and exch are saved, which accounts for the speedup.
This will be the same in every language, including C++.
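The difference is easy to see by counting how often less is actually called. Here is a small, self-contained sketch (the class, method, and counter names are mine, not from the original code) that runs both inner-loop forms for a single gap over an already sorted array:

```java
import java.util.Arrays;

public class ShellsortCompareCount {
    static long calls;

    // Instrumented comparison: counts every call.
    static boolean less(double[] a, int i, int j) {
        calls++;
        return a[i] < a[j];
    }

    static void exch(double[] a, int i, int j) {
        double t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Slow form: scans the whole chain even when nothing needs to move.
    static void passSlow(double[] a, int magic) {
        for (int i = magic; i < a.length; i++) {
            for (int j = i; j >= magic; j -= magic) {
                if (less(a, j, j - magic)) {
                    exch(a, j, j - magic);
                }
            }
        }
    }

    // Fast form: stops the chain scan at the first ordered pair.
    static void passFast(double[] a, int magic) {
        for (int i = magic; i < a.length; i++) {
            for (int j = i; j >= magic && less(a, j, j - magic); j -= magic) {
                exch(a, j, j - magic);
            }
        }
    }

    public static void main(String[] args) {
        double[] sorted = new double[1000];
        for (int k = 0; k < sorted.length; k++) sorted[k] = k;

        calls = 0;
        passSlow(Arrays.copyOf(sorted, sorted.length), 1);
        long slow = calls;

        calls = 0;
        passFast(Arrays.copyOf(sorted, sorted.length), 1);
        long fast = calls;

        System.out.println(slow + " vs " + fast);  // prints 499500 vs 999
    }
}
```

On sorted input, the slow form walks every chain all the way down (quadratic comparisons), while the fast form makes exactly one comparison per i. That asymmetry is exactly the difference the question observed.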

Related

Understanding Insertion Sort algorithm

I am learning the mechanics of insertion sort. This is the original insertion sort algorithm as described on several sites:
for (int i = 0; i < a.length; i++) {
    int key = a[i];
    int j = i;
    while (j > 0 && a[j-1] > key) {
        swap(a, j, j - 1);
        j--;
    }
    a[j] = key;
}
The issue is that, while learning how it works, I realized that the following code, without using a key, does exactly the same thing:
for (int i = 0; i < a.length; i++) {
    int j = i;
    while (j > 0 && a[j-1] > a[j]) {
        swap(a, j, j - 1);
        j--;
    }
}
My question is, why is the key required in the original algorithm? Is it needed for some edge cases that the second algorithm doesn't take into consideration? If so, which edge cases would they be?
After some testing, both algorithms perform the same number of swaps, so I can't figure out what the difference of having to use a key variable is.
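There is a real difference hiding here: in the textbook shift-based form, the key lets each displaced element be moved with a single array write, whereas a swap costs two writes per step. A small sketch counting writes (the class and counter names are illustrative, not from the question):

```java
public class WriteCount {
    static long shiftWrites, swapWrites;

    // Shift-based insertion sort (the "key" version): one write per shift,
    // plus one final write to drop the key into place.
    static void sortWithKey(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i;
            while (j > 0 && a[j - 1] > key) {
                a[j] = a[j - 1]; shiftWrites++;
                j--;
            }
            a[j] = key; shiftWrites++;
        }
    }

    // Swap-based version: every swap is two writes.
    static void sortWithSwaps(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int j = i;
            while (j > 0 && a[j - 1] > a[j]) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t; swapWrites += 2;
                j--;
            }
        }
    }

    public static void main(String[] args) {
        sortWithKey(new int[] {5, 4, 3, 2, 1});
        sortWithSwaps(new int[] {5, 4, 3, 2, 1});
        System.out.println(shiftWrites + " vs " + swapWrites);  // prints 14 vs 20
    }
}
```

Both variants produce the same sorted array and perform the same comparisons, but the swap version does roughly twice as many writes for any element that travels more than one position, which is why the textbook form keeps the key.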

What are these Java while loops doing in merge sort?

Step one and step two (and step three) seem to me like they run repeatedly doing the same work. Why is it programmed like this?
int i = 0, j = 0;
int k = l;
while (i < n1 && j < n2) { // ----step one
    if (L[i] <= R[j]) {
        arr[k] = L[i];
        i++;
    } else {
        arr[k] = R[j];
        j++;
    }
    k++;
}
while (i < n1) { // ----step two
    arr[k] = L[i];
    i++;
    k++;
}
while (j < n2) { // ----step three
    arr[k] = R[j];
    j++;
    k++;
}
}
"Step one" does the work of merging from two source arrays into the destination. When L or R is exhausted there may still be unmerged elements remaining in the other source array. "Step two" exists to copy any remaining elements from L to the destination. "Step three" serves the same purpose for R.
You can skip the separate copy loops and use a single for loop, if that is easier for you:
for (int i = 0; i < arr.length; i++) {
    if (r >= right.length || (l < left.length && left[l] < right[r])) {
        arr[i] = left[l++];
    } else {
        arr[i] = right[r++];
    }
}
Here arr.length = n1 + n2 in your implementation. Or even this:
int len = n1 + n2;
int i = 0;
while (len-- > 0) {
    if (r >= right.length || (l < left.length && left[l] < right[r])) {
        arr[i++] = left[l++];
    } else {
        arr[i++] = right[r++];
    }
}
I personally find this way more readable and easier, but to each their own! (As written, ties go to the right array first, so this merge is unstable; it can be made stable, but I leave that part for the reader to figure out.) Note that since this is Java, the countdown has to be written while (len-- > 0), because the loop condition must be a boolean.
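For reference, here is the single-loop merge as a complete, runnable Java sketch (class and method names are mine, not from the question):

```java
public class MergeDemo {
    // Merge two sorted arrays into one, using a single loop: take from left
    // unless left is exhausted or right's head is smaller or equal.
    static int[] merge(int[] left, int[] right) {
        int[] arr = new int[left.length + right.length];
        int l = 0, r = 0;
        for (int i = 0; i < arr.length; i++) {
            if (r >= right.length || (l < left.length && left[l] < right[r])) {
                arr[i] = left[l++];
            } else {
                arr[i] = right[r++];
            }
        }
        return arr;
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[] {1, 4, 6}, new int[] {2, 3, 5});
        System.out.println(java.util.Arrays.toString(merged));  // prints [1, 2, 3, 4, 5, 6]
    }
}
```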

Counting the number of comparisons and moves made during sorting

I am doing insertion sort and was wondering whether the number of comparisons and the number of moves are calculated properly. Comparisons are the number of times two values were compared, and moves are the number of elements moved, so a swap between two numbers counts as 2 moves.
public static int[] InsertionSort(int[] a) {
    int j;
    for (int i = 1; i < a.length; i++) {
        int tmp = a[i];
        for (j = i; j > 0 && (tmp < a[j-1]); j--) {
            numCompares++;
            a[j] = a[j-1];
            numMoves++;
        }
        a[j] = tmp;
        numMoves++;
    }
    return a;
}
The only problem here is that in the inner loop condition j > 0 && (tmp < a[j-1]), the actual comparison tmp < a[j-1] may evaluate to false, ending the for loop, so the numCompares++ inside the loop is skipped for that last comparison. To count comparisons precisely, a small rewrite is required:
for (j = i; j > 0; j--) {
    numCompares++;
    if (tmp >= a[j - 1])
        break;
    a[j] = a[j - 1];
    numMoves++;
}
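Putting the corrected loop into a complete, runnable example (the field names follow the question's code; the class name and test input are illustrative):

```java
public class InsertionSortCount {
    static int numCompares, numMoves;

    static int[] insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int tmp = a[i];
            int j;
            for (j = i; j > 0; j--) {
                numCompares++;            // counted even when the comparison fails
                if (tmp >= a[j - 1]) break;
                a[j] = a[j - 1];          // shift one element right: one move
                numMoves++;
            }
            a[j] = tmp;                   // final placement is also a move
            numMoves++;
        }
        return a;
    }

    public static void main(String[] args) {
        insertionSort(new int[] {3, 1, 2});
        // i=1: one compare, one shift + one placement;
        // i=2: two compares (second one fails), one shift + one placement.
        System.out.println(numCompares + " comparisons, " + numMoves + " moves");
        // prints: 3 comparisons, 4 moves
    }
}
```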

Infinite loop when printing an N x N table

Consider the following Java program:
public class RelativelyPrime {
    public static void main(String[] args) {
        int N = Integer.parseInt(args[0]); // Dimensions of grid
        int i, j;
        int r; // Remainder when i is divided by j
        for (i = 1; i <= N; i++) {
            for (j = 1; j <= N; j++) {
                do { // Using Euclidean algorithm
                    r = i % j;
                    i = j;
                    j = r;
                } while (r > 0);
                if (i == 1) System.out.print("*");
                else System.out.print(" ");
            }
            System.out.println();
        }
    }
}
This program prints an N x N table (or matrix, if you like), where N is a command-line argument.
The (i, j) entry is a * if i and j are relatively prime, or a single space if they are not. When I run the program with, for instance, java RelativelyPrime 3, it endlessly prints *. Why is this happening?
You changed i and j inside the do-while loop. Use separate variables instead:
for (i = 1; i <= N; i++) {
    for (j = 1; j <= N; j++) {
        int ii = i, jj = j;
        do { // Using Euclidean algorithm
            r = ii % jj;
            ii = jj;
            jj = r;
        } while (r > 0);
        if (ii == 1) System.out.print("*");
        else System.out.print(" ");
    }
    System.out.println();
}
This is where using the debugger would have helped you solve the problem.
Inside your loops, you alter both i and j, which means they never reach N, and thus you have an infinite loop.
I suggest you not alter these variables, but instead use two new variables, ideally with meaningful names.
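One way to follow that advice is to extract the Euclidean algorithm into a helper method, so the loop indices are never touched (the gcd helper and class name are my own sketch, not from the question):

```java
public class RelativelyPrimeGcd {
    // Euclidean algorithm in its own method: the caller's variables
    // are passed by value, so i and j can never be clobbered.
    static int gcd(int a, int b) {
        while (b > 0) {
            int r = a % b;
            a = b;
            b = r;
        }
        return a;
    }

    public static void main(String[] args) {
        int N = 3; // would come from Integer.parseInt(args[0]) in the original
        for (int i = 1; i <= N; i++) {
            for (int j = 1; j <= N; j++) {
                System.out.print(gcd(i, j) == 1 ? "*" : " ");
            }
            System.out.println();
        }
        // prints:
        // ***
        // * *
        // **
    }
}
```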

How can I remove a modulo out of my math operation?

My prof does not like the use of modulo since it's not efficient enough, but I'm not sure how else I can get the same answer using a logical operator or something. Can someone help me out with how I can do this?
j = (j + 1) % a.length;
This should do the trick:
int k = a.length;
int d = (j + 1) / k;
j = (j + 1) - d * k;
The only way I can see of doing this without a modulo is still not great:
j = (++j < a.length)? j : (j - a.length);
Alternately, for more readability:
j++;
j = (j < a.length)? j : (j - a.length);
or
j++;
if (j >= a.length) {
    j -= a.length;
}
Also, I'm not entirely sure how Java does with branch prediction, but at least in C, the following would be slightly better for speed, if less readable, since the general assumption is that the condition of the if statement will be true, and j < a.length holds more often than not (unless a.length <= 2, which seems unlikely):
j++;
if (j < a.length) {
} else {
    j -= a.length;
}
If the initial value of j can be outside the range 0 to a.length (inclusive-exclusive), then the only solutions either use a modulus or a division (which, being essentially the same operation, are the same speed) or a loop of repeated subtraction. The subtraction loop essentially reimplements modulus the way a very old processor would, and it is slower than the built-in modulus operation on any current processor I know of.
You could do this:
j = j + 1;
if (j >= a.length) {
    j = j - a.length; // assumes j was less than length before increment
}
@ajp suggests another solution that would actually work fine:
j = j + 1;
if (j >= a.length) { // assumes j was less than length before increment
    j = 0;
}
If I were writing the code, I'd write it this way, just in case. It has very little additional overhead and removes the "assumes":
j = j + 1;
while (j >= a.length) {
    j = j - a.length;
}
Of course, the % would be a good way to do it too, unless your professor objects.
This could be faster or slower than a divide/modulo depending on the cost of a jump (and any effect that has on the instruction pipeline/lookahead) and the efficiency of the integer division instructions.
Old processors would likely do better with the jump. More modern ones with the divide.
Think of what you are doing here. You are essentially saying:
if j + 1 is smaller than a.length, set j to j + 1
otherwise, we set j to a value smaller than a.length
This pseudocode should give you a very clear hint to the solution.
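Spelling that pseudocode out in Java (the helper and class names are illustrative):

```java
public class WrapIndex {
    // Advance a circular index without %: if j + 1 is still in range,
    // use it; otherwise wrap around to 0. Assumes 0 <= j < length.
    static int next(int j, int length) {
        return (j + 1 < length) ? j + 1 : 0;
    }

    public static void main(String[] args) {
        int length = 3;
        int j = 0;
        for (int step = 0; step < 5; step++) {
            j = next(j, length);
            System.out.print(j + " ");
        }
        System.out.println();  // prints: 1 2 0 1 2
    }
}
```

This matches (j + 1) % length exactly as long as j starts inside the valid range, which is the assumption noted above.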
