I am doing an algorithms class project in which we must modify an implementation of QuickSort with suggested improvements. One of these suggestions is as follows: "Do not single out the pivot in the array and avoid the last swap of the partition method."
I'm having trouble understanding exactly what he means by this. Without a pivot, how is it even still QuickSort anymore? Any insight into what this could imply would be appreciated. This is the Java code to be modified.
public void quickSort() {
recQuickSort(0, nElems - 1);
}
public void recQuickSort(int left, int right) {
if (left >= right)
return;
long pivot = a[right];
int mid = partition(left, right, pivot);
recQuickSort(left, mid - 1);
recQuickSort(mid + 1, right);
} // end recQuickSort()
public void swap(int dex1, int dex2) { // swap two elements
long temp = a[dex1]; // A into temp
a[dex1] = a[dex2]; // B into A
a[dex2] = temp; // temp into B
} // end swap()
public int partition(int left, int right, long pivot) {
// assuming pivot == a[right]
int leftPtr = left - 1; // left of the first element
int rightPtr = right; // position of pivot
while (true) {
while (a[++leftPtr] < pivot)
; // find bigger
while (leftPtr < rightPtr && a[--rightPtr] >= pivot)
; // find smaller
if (leftPtr >= rightPtr) // if pointers cross,
break; // partition done
else
// not crossed, so
swap(leftPtr, rightPtr); // swap elements
} // end while(true)
swap(leftPtr, right); // restore pivot
return leftPtr; // return pivot location
} // end partition()
I'm not going to try to implement it for you, but my interpretation of this suggested "improvement" is that he wants you to still choose a pivot value and partition the array into sections according to which side of that value they're on, but not treat the array entry containing that value specially.
It shouldn't be hard to do, but it won't improve performance of the algorithm at all, and I doubt that it's much of an improvement in any other sense.
It looks like he doesn't want you to pivot on an actual element in the array. The point of the last swap in the partition() method is that he moves the pivot into the correct spot in the array (note that he never moves a pivot element after a partition() call).
Edit: If you are confused about "not using a pivot", you are still using a pivot; it's just not a pivot stored in the array. Imagine doing the quicksort by hand: you can pick any arbitrary value to pivot on.
The issue is that that swap doesn't really negatively impact performance at all. On the other hand, this should be a quick change...
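One possible reading of the suggestion, sketched below, is a Hoare-style partition: the pivot is only a value copied out of the array, no slot is reserved for it, and there is no final "restore pivot" swap. This is an interpretation rather than the instructor's confirmed intent; the class name `HoareQuickSort`, the explicit array parameter, and the choice of the middle element as pivot value are assumptions made for the example.

```java
import java.util.Arrays;

public class HoareQuickSort {
    // Sorts a[left..right] (both inclusive) in ascending order.
    public static void recQuickSort(long[] a, int left, int right) {
        if (left >= right) {
            return;
        }
        int split = partition(a, left, right);
        // No element is singled out as "the pivot", so the two halves
        // together cover the whole range: [left..split] and [split+1..right].
        recQuickSort(a, left, split);
        recQuickSort(a, split + 1, right);
    }

    // Returns an index j with left <= j < right such that every element of
    // a[left..j] is <= every element of a[j+1..right].
    public static int partition(long[] a, int left, int right) {
        long pivot = a[left + (right - left) / 2]; // a value, not a reserved slot
        int i = left - 1;
        int j = right + 1;
        while (true) {
            do { i++; } while (a[i] < pivot); // find an element >= pivot
            do { j--; } while (a[j] > pivot); // find an element <= pivot
            if (i >= j) {
                return j; // pointers met or crossed: done, no final swap
            }
            long tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
    }

    public static void main(String[] args) {
        long[] data = {5, 3, 8, 3, 9, 1, 5, 2};
        recQuickSort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```

Because the pivot value may end up anywhere in either half, the recursion uses `split` and `split + 1` rather than `mid - 1` and `mid + 1`.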
Here is my code. It works fine unless the array has duplicate elements. The quickSortRec method keeps halving the array around a pivot. My partition method puts a pivot element in its sorted position. I believe my problem is in the partition method, but if I change my while conditions to >= or <=, I get errors. What should I do?
public static void quickSortRec(int[] arr, int start, int end){
if (start<end){ // Recursive case: the subarray has more than one element
int idx = partition(arr,start,end); // it returns the index of pivot which should be in its sorted position
System.out.println(arr[idx] );
System.out.println(Arrays.toString(arr));
quickSortRec(arr,start,idx-1); // quick sorts the half left of the pivot
quickSortRec(arr,idx+1,end); // quick sorts the half right of the pivot
}
}
public static int partition(int[] arr, int start, int end){
int pivot = arr[end]; // I choose the pivot to be the last element
int i = end;
while (start < end){ // we will try to make everything smaller than the pivot left of it, and bigger right of it
while (arr[start] < pivot ){ // if an element left of the pivot is already smaller than it
start++; // we just move on
}
while(arr[end] > pivot){ // if an element right of the pivot is already bigger than it
end--; // we move on
}
if (start < end){ // we come here only when there is an element on the wrong side of the pivot
int temp = arr[start]; // swap the pivot and that element on the wrong side
arr[start] = arr[end];
arr[end] = temp;
//start++;
//end++;
//end--; // we move on and we don't move the start because that is the pivot index
}
}
arr[end] = pivot;
return end; // return pivot index which should be sorted
}
You almost had it right with changing them to >= and <=.
The only problem with making both comparisons inclusive is that an element equal to the pivot no longer clearly belongs above or below it.
So the answer is simply to change only one of them, so that the duplicates all end up on one side.
For example, like this:
while (arr[start] <= pivot ){ // if an element left of the pivot is already smaller than it
start++; // we just move on
}
while(arr[end] > pivot){ // if an element right of the pivot is already bigger than it
end--; // we move on
}
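One caveat worth checking by hand: with `<=` in the left scan, the scan can step past the slot that holds the pivot, and the final `arr[end] = pivot` write can then overwrite another value (tracing the modified partition on {3, 1, 2} gives {2, 2, 3}). As a hedged alternative, not the asker's or the answerer's exact code, a single-scan (Lomuto-style) partition keeps the pivot parked at `arr[end]` until one final swap, so duplicates cannot stall or corrupt it. The class name `DuplicateSafeQuickSort` and the helper `swap` are made up for the example.

```java
import java.util.Arrays;

public class DuplicateSafeQuickSort {
    public static void quickSortRec(int[] arr, int start, int end) {
        if (start < end) {
            int idx = partition(arr, start, end); // pivot is now at its sorted position
            quickSortRec(arr, start, idx - 1);    // left of the pivot
            quickSortRec(arr, idx + 1, end);      // right of the pivot
        }
    }

    // Single left-to-right scan with the pivot kept at arr[end] until the end.
    public static int partition(int[] arr, int start, int end) {
        int pivot = arr[end];
        int boundary = start; // arr[start..boundary-1] holds elements < pivot
        for (int i = start; i < end; i++) {
            if (arr[i] < pivot) { // elements equal to the pivot stay on the right
                swap(arr, i, boundary);
                boundary++;
            }
        }
        swap(arr, boundary, end); // move the pivot between the two groups
        return boundary;
    }

    private static void swap(int[] arr, int i, int j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 2, 3, 1, 2, 2};
        quickSortRec(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```

This version always terminates on duplicates, although an array consisting almost entirely of one value still partitions very unevenly.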
I tried to implement an efficient sorting algorithm in Java. For this reason, I also implemented quicksort and use the following code:
public class Sorting {
private static Random prng;
private static Random getPrng() {
if (prng == null) {
prng = new Random();
}
return prng;
}
public static void sort(int[] array) {
sortInternal(array, 0, array.length - 1);
}
public static void sortInternal(int[] array, int start, int end) {
if (end - start < 50) {
insertionSortInternal(array, start, end);
} else {
quickSortInternal(array, start, end);
}
}
private static void insertionSortInternal(int[] array, int start, int end) {
for (int i=start; i<end - 1; ++i) {
for (int ptr=i; ptr>0 && array[ptr - 1] < array[ptr]; ptr--) {
ArrayUtilities.swap(array, ptr, ptr - 1);
}
}
}
private static void quickSortInternal(int[] array, int start, int end) {
int pivotPos = getPrng().nextInt(end - start);
int pivot = array[start + pivotPos];
ArrayUtilities.swap(array, start + pivotPos, end - 1);
int left = start;
int right = end - 2;
while (left < right) {
while (array[left] <= pivot && left < right) {
++left;
}
if (left == right) break;
while (array[right] >= pivot && left < right) {
right--;
}
if (left == right) break;
ArrayUtilities.swap(array, left, right);
}
ArrayUtilities.swap(array, left, end - 1);
sortInternal(array, start, left);
sortInternal(array, left + 1, end);
}
}
ArrayUtilities.swap just swaps the two given elements in the array. From this code, I expect O(n log(n)) runtime behaviour. But, some different lengths of arrays to sort gave the following results:
10000 elements: 32ms
20000 elements: 128ms
30000 elements: 296ms
The test ran 100 times in each case, and then the arithmetic mean of the running times was calculated. But clearly, as opposed to the expected behaviour, the runtime is O(n^2). What's wrong with my algorithm?
In your insertion-sort implementation the array is sorted in descending order, while your quick-sort sorts in ascending order. So, to make the insertion sort ascending as well, replace:
for (int ptr=i; ptr>0 && array[ptr - 1] < array[ptr]; ptr--)
with
for (int ptr=i; ptr>0 && array[ptr - 1] > array[ptr]; ptr--)
It also seems like your indexing is not correct.
Try to replace:
sortInternal(array, 0, array.length - 1);
with:
sortInternal(array, 0, array.length);
And in the first for loop of the insertion sort you don't need end - 1, i.e. use:
for (int i=start; i<end; ++i)
Finally, add if (start >= end) return; at the beginning of the quick-sort method.
And as @ljeabmreosn mentioned, 50 is a little bit too large; I would have chosen something between 5 and 20.
Hope that helps!
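Putting the suggestions from this answer together gives roughly the following sketch. It treats `end` as exclusive throughout and uses a cutoff of 16, which is just an assumed value inside the suggested range. Two details go beyond what the answer states and are my own assumptions: the insertion sort stops at `start` rather than at index 0, and the two-pointer partition checks the element where the pointers meet before placing the pivot. The class name `HybridSortSketch` and the local `swap` (standing in for `ArrayUtilities.swap`) are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

public class HybridSortSketch {
    private static final Random PRNG = new Random();
    private static final int CUTOFF = 16; // assumed; the answer suggests 5 to 20

    public static void sort(int[] array) {
        sortInternal(array, 0, array.length); // end is exclusive
    }

    static void sortInternal(int[] array, int start, int end) {
        if (end - start < CUTOFF) {
            insertionSortInternal(array, start, end);
        } else {
            quickSortInternal(array, start, end);
        }
    }

    static void insertionSortInternal(int[] array, int start, int end) {
        for (int i = start + 1; i < end; ++i) {
            // '>' sorts ascending; 'ptr > start' keeps the scan inside the range
            for (int ptr = i; ptr > start && array[ptr - 1] > array[ptr]; ptr--) {
                swap(array, ptr, ptr - 1);
            }
        }
    }

    static void quickSortInternal(int[] array, int start, int end) {
        if (end - start < 2) return;               // guard suggested in the answer
        int pivotPos = start + PRNG.nextInt(end - start);
        int pivot = array[pivotPos];
        swap(array, pivotPos, end - 1);            // park the pivot in the last slot
        int left = start;
        int right = end - 2;
        while (left < right) {
            while (left < right && array[left] <= pivot) ++left;
            while (left < right && array[right] >= pivot) right--;
            if (left < right) swap(array, left, right);
        }
        // left == right: that element has not been compared conclusively yet
        if (array[left] < pivot) ++left;
        swap(array, left, end - 1);                // pivot to its final position
        sortInternal(array, start, left);
        sortInternal(array, left + 1, end);
    }

    private static void swap(int[] array, int i, int j) {
        int tmp = array[i];
        array[i] = array[j];
        array[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = new Random(7).ints(40, 0, 100).toArray();
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```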
The QuickSort "optimized" with Insertion Sort for subarrays of fewer than 50 elements seems to be the problem.
Imagine an array of size 65 whose pivot happens to be the median of that array. Your code would then run Insertion Sort on the two 32-element subarrays to the left and right of the pivot. This would result in ~O(2*(n/2)^2 + n) = ~O(n^2) average case. Using quick sort with a pivot-picking strategy for the first pivot, the average case would be ~O((nlog(n)) + n) = ~O(n(log(n) + 1)) = ~O(n*log(n)). Insertion Sort is mainly worthwhile when the array is almost sorted. If you are using Insertion Sort solely because sorting small arrays with it might be faster in practice than the standard recursive quick sort (deep recursion), you can instead use a non-recursive quick sort, which avoids the recursion overhead.
Maybe change the "50" to "20" and observe the results.
I am trying to understand the code in an application of quicksort to find the kth smallest element.
Here is the code that the author wrote
public class KthSmallest {
public static void main(String[] args) {
int[] test = {2,3,1,5,7,6,9};
System.out.println("4th smallest is " + quick_select(test, 4, 0, test.length - 1));
}
private static int quick_select(int[] a, int k, int left, int right) {
int pivot=findpivot(a,left,right);
if(pivot==k-1){
return a[pivot];
}
if(k-1<pivot){
return quick_select(a, k, left, pivot-1);
}
else {
return quick_select(a, k, pivot+1, right);
}
}
private static int findpivot(int[] a, int left, int right) {
int pivot = a[(left+right)/2];
while(left<right){
while(a[left]<pivot){
left++;
}
while(a[right]>pivot){
right--;
}
if(left<=right){
swap(a,left,right);
left++;
right--;
}
}
return left;
}
private static void swap(int[] a, int i, int j) {
int temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}
I am trying to understand the significance of this code segment in findpivot:
if(left<=right){
swap(a,left,right);
left++;
right--;
}
Here is what you know: all the elements to the left of left are smaller than pivot, and all the elements to the right of right are greater than pivot. Can anyone explain, with that intuition, why it is necessary to swap if right >= left?
The first loop moves left to the right until it finds an element that is not smaller than the pivot. The second loop moves right to the left until it finds an element that is not greater than the pivot. At this point a[left] belongs after the pivot and a[right] belongs before it, and the if takes care of that.
"Can anyone explain with that intuition why it is necessary to swap if right>= left?"
right and left are indexes, not the items themselves (which is probably the source of your confusion). If a number to the right of the pivot is smaller than the pivot, it should be moved to the left of the pivot.
Likewise, if a number to the left of the pivot is bigger than the pivot, it should be moved to the right of the pivot.
Once we run into two such items, we use the if condition to make sure that we didn't go "too far" in the previous loops and that the numbers that were found really should be swapped. The check tests that the indexes satisfy left <= right; otherwise there is no reason to do the swap (or to continue running).
Note: The explanation in the last paragraph shows that we don't really have to check if(left<=right); it's good enough to check if(left<right).
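To see the swap in context, here is a minimal quickselect sketch built on the same two-pointer idea. It is not the author's code: the class name `QuickSelectSketch`, the iterative outer loop, and the way the search range is narrowed are assumptions for illustration. Note that the inner partition loop here runs while `i <= j`, so in this particular variant the `if (i <= j)` form is what lets the pointers move past each other when they meet.

```java
import java.util.Arrays;

public class QuickSelectSketch {
    // Returns the k-th smallest element of a (k is 1-based). Reorders a.
    public static int quickSelect(int[] a, int k) {
        int target = k - 1;
        int left = 0;
        int right = a.length - 1;
        while (left < right) {
            int pivot = a[left + (right - left) / 2];
            int i = left;
            int j = right;
            while (i <= j) {
                while (a[i] < pivot) i++; // a[i] is now >= pivot: wrong side
                while (a[j] > pivot) j--; // a[j] is now <= pivot: wrong side
                if (i <= j) {
                    // Both elements sit on the wrong side, so exchange them
                    // and step past the pair.
                    int tmp = a[i];
                    a[i] = a[j];
                    a[j] = tmp;
                    i++;
                    j--;
                }
            }
            // Here j < i, a[left..j] <= pivot and a[i..right] >= pivot.
            if (target <= j) {
                right = j;        // answer lies in the left part
            } else if (target >= i) {
                left = i;         // answer lies in the right part
            } else {
                return a[target]; // target sits between j and i: equals pivot
            }
        }
        return a[target];
    }

    public static void main(String[] args) {
        int[] test = {2, 3, 1, 5, 7, 6, 9};
        System.out.println("4th smallest is " + quickSelect(test, 4));
    }
}
```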
I have written a recursive method for a partition sort that sorts the array. However, when I use an array of more than 10-20 elements, the program takes a really long time to complete. (On my computer a bubble sort of a 100,000 int array takes about 15-20 seconds, but with an array of only 30 ints my partition sort takes around 45 seconds.)
Here is the code.
public static int[] partitionSortRecursive(int[] array, int beginning, int end)
{
if (end < beginning)
return array;
int pivot = (array[beginning] + array[end]) / 2;
int firstUnknown = beginning;
int lastS1 = beginning - 1;
int firstS3 = end + 1;
while (firstUnknown < firstS3)
{
if (array[firstUnknown] == pivot)
{
firstUnknown++;
}
else if (array[firstUnknown] > pivot)
{
firstS3--;
int temp = array[firstUnknown];
array[firstUnknown] = array[firstS3];
array[firstS3] = temp;
}
else
{
lastS1++;
int temp = array[firstUnknown];
array[firstUnknown] = array[lastS1];
array[lastS1] = temp;
firstUnknown++;
}
}
partitionSortRecursive(array, 0, lastS1);
partitionSortRecursive(array, firstS3, end);
return array;
}
You do not use a correct pivot element. You calculate the average of the left and right values, but you have to take a sample value from the subarray being partitioned instead.
You may take the rightmost, the center or any other element. So the line that picks the pivot should look like this:
int pivot = array[(beginning + end) / 2];
// or
int pivot = array[end];
You could also take any other element (e.g. random)
EDIT: This does not solve the performance issue.
To my understanding, quick sort will divide an array into two sub arrays A and B where all elements in A are smaller than any element in B and then perform the same operation onto the two sub arrays.
So the basic call structure should be like this
void DoSort (array, i, j)
{
pivot = Partition (array, i, j)
DoSort (array, i,pivot)
DoSort (array, pivot + 1, j)
}
But your implementation is basically
void DoSort (array, i, j)
{
pivot = Partition (array, i, j)
DoSort (array, 0, pivot) // <<<<<< notice the '0' instead of 'i'
DoSort (array, pivot + 1, j)
}
So every recursive call starts from the very beginning of the original array again and re-sorts a prefix that was already handled, which will most likely take a while.
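Applied to the method from the question, the fix amounts to passing `beginning` instead of `0` to the first recursive call. The sketch below keeps the question's three-region scheme and variable names; the class name `PartitionSortSketch`, the `void` return type, and taking the pivot from the middle of the range (as suggested in the other answer) are assumptions.

```java
import java.util.Arrays;

public class PartitionSortSketch {
    // Sorts array[beginning..end] (both inclusive) in ascending order.
    public static void partitionSortRecursive(int[] array, int beginning, int end) {
        if (end <= beginning) {
            return; // zero or one element: nothing to do
        }
        int pivot = array[beginning + (end - beginning) / 2]; // a value from the range
        int firstUnknown = beginning;
        int lastS1 = beginning - 1; // last index of the "< pivot" region
        int firstS3 = end + 1;      // first index of the "> pivot" region
        while (firstUnknown < firstS3) {
            if (array[firstUnknown] == pivot) {
                firstUnknown++;
            } else if (array[firstUnknown] > pivot) {
                firstS3--;
                swap(array, firstUnknown, firstS3);
            } else {
                lastS1++;
                swap(array, firstUnknown, lastS1);
                firstUnknown++;
            }
        }
        partitionSortRecursive(array, beginning, lastS1); // 'beginning', not 0
        partitionSortRecursive(array, firstS3, end);
    }

    private static void swap(int[] array, int i, int j) {
        int temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }

    public static void main(String[] args) {
        int[] data = {4, 1, 4, 2, 9, 4, 0};
        partitionSortRecursive(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```

Since the pivot value is taken from the range itself, the middle "equal to pivot" region is never empty, so both recursive calls work on strictly smaller ranges.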
Instead of direct recursive calls like this
partitionSortRecursive(array, 0, lastS1);
partitionSortRecursive(array, firstS3, end);
Organize an internal stack in which you save index pairs. While the stack is not empty, take the next pair from the stack and partition that range. At the end of each pass, don't call the function again; instead push two pairs onto the stack: (beginning, lastS1), with beginning rather than 0 as the lower bound, and (firstS3, end).
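A minimal sketch of that idea, assuming an `ArrayDeque` of two-element `int[]` pairs as the internal stack. To keep the example short it uses a simple single-scan partition around the middle element instead of the three-region partition from the question; the class name `IterativeQuickSort` and the helper names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class IterativeQuickSort {
    public static void sort(int[] array) {
        Deque<int[]> pending = new ArrayDeque<>(); // stack of {low, high} pairs
        pending.push(new int[] {0, array.length - 1});
        while (!pending.isEmpty()) {
            int[] range = pending.pop();
            int low = range[0];
            int high = range[1];
            if (low >= high) {
                continue; // zero or one element
            }
            int p = partition(array, low, high);
            // Instead of two recursive calls, remember both subranges.
            pending.push(new int[] {low, p - 1});
            pending.push(new int[] {p + 1, high});
        }
    }

    // Moves the middle element to the end, then does a single scan.
    private static int partition(int[] a, int low, int high) {
        swap(a, low + (high - low) / 2, high);
        int pivot = a[high];
        int store = low; // a[low..store-1] holds elements < pivot
        for (int i = low; i < high; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, high);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2, 2, 6};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The pending ranges live on the heap, so the depth of the Java call stack no longer depends on the input.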
I am working on a project for a class. We are to write a quick-sort that transitions to an insertion sort at a specified subarray size. That's no problem; where I am now having difficulty is figuring out why I am not getting the performance I expect.
One of the requirements is that it must sort an array of 5,000,000 ints in under 1,300 ms (this is on standard machines, so CPU speed is not an issue). First of all, I can't get it to work on 5,000,000 because of a stack overflow error (too many recursive calls...). If I increase the heap size, it is still a lot slower than that.
Below is the code. Any hints anyone?
Thanks in advance
public class MyQuickSort {
public static void sort(int [] toSort, int moveToInsertion)
{
sort(toSort, 0, toSort.length - 1, moveToInsertion);
}
private static void sort(int[] toSort, int first, int last, int moveToInsertion)
{
if (first < last)
{
if ((last - first) < moveToInsertion)
{
insertionSort(toSort, first, last);
}
else
{
int split = quickHelper(toSort, first, last);
sort(toSort, first, split - 1, moveToInsertion);
sort(toSort, split + 1, last, moveToInsertion);
}
}
}
private static int quickHelper(int[] toSort, int first, int last)
{
sortPivot(toSort, first, last);
swap(toSort, first, first + (last - first)/2);
int left = first;
int right = last;
int pivotVal = toSort[first];
do
{
while ( (left < last) && (toSort[left] <= pivotVal))
{
left++;
}
while (toSort[right] > pivotVal)
{
right--;
}
if (left < right)
{
swap(toSort, left, right);
}
} while (left < right);
swap(toSort, first, right);
return right;
}
private static void sortPivot(int[] toSort, int first, int last)
{
int middle = first + (last - first)/2;
if (toSort[middle] < toSort[first]) swap(toSort, first, middle);
if (toSort[last] < toSort[middle]) swap(toSort, middle, last);
if (toSort[middle] < toSort[first]) swap(toSort, first, middle);
}
private static void insertionSort(int [] toSort, int first, int last)
{
for (int nextVal = first + 1; nextVal <= last; nextVal++)
{
int toInsert = toSort[nextVal];
int j = nextVal - 1;
while (j >= 0 && toInsert < toSort[j])
{
toSort[j + 1] = toSort[j];
j--;
}
toSort[j + 1] = toInsert;
}
}
private static void swap(int[] toSort, int i, int j)
{
int temp = toSort[i];
toSort[i] = toSort[j];
toSort[j] = temp;
}
}
I haven't tested this with your algorithm, and I don't know what kind of data set you're running with, but consider choosing a better pivot than the leftmost element. From Wikipedia on Quicksort:
Choice of pivot: In very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use-case. The problem was easily solved by choosing either a random index for the pivot, choosing the middle index of the partition or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot.
Figured it out.
Actually, it was not my sort's fault at all. I was generating numbers in the range 0-100 (for testing, to make sure the result was sorted). This resulted in tons of duplicates, which meant way too many partitions. Changing the range to min_int through max_int made it go a lot quicker.
Thanks for your help though :D
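For reference, the two kinds of test input described above could be generated roughly as follows. This is only a sketch: the class and method names are made up, the seeds are arbitrary, and `Arrays.sort` stands in for the sort under test.

```java
import java.util.Arrays;
import java.util.Random;

public class SortTestData {
    // Values spread over the whole int range: duplicates are rare.
    public static int[] fullRange(int n, long seed) {
        Random rnd = new Random(seed);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = rnd.nextInt();
        }
        return data;
    }

    // Values from 0 to 100 inclusive: a large array is almost all duplicates.
    public static int[] narrowRange(int n, long seed) {
        Random rnd = new Random(seed);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = rnd.nextInt(101);
        }
        return data;
    }

    // True if data is in non-decreasing order.
    public static boolean isSorted(int[] data) {
        for (int i = 1; i < data.length; i++) {
            if (data[i - 1] > data[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] data = fullRange(1_000_000, 42L);
        long startNanos = System.nanoTime();
        Arrays.sort(data); // stand-in: call the sort under test here
        long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;
        System.out.println("sorted=" + isSorted(data) + " in " + elapsedMs + " ms");
    }
}
```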
When the input array is large, it's natural to expect recursive functions to run into stack overflow issues, which is what is happening with the above code. I would recommend writing an iterative Quicksort using your own stack. It should be fast because no stack frames are allocated or deallocated at run time, and you won't run into stack overflow issues either. Performance also depends on the point at which you switch to insertion sort. I don't have a particular input size at which insertion sort starts to perform badly compared to quicksort; I would suggest trying different sizes, and I'm sure you will notice a difference.
You might also want to use binary search in insertion sort to improve performance. I don't know how much it helps on smaller inputs, but it's a nice trick to try.
I don't want to share code because that wouldn't help you learn how to convert a recursive quicksort into an iterative one. If you have problems with the conversion, let me know.
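The binary-search variant of insertion sort mentioned above can be sketched like this. It keeps the `(toSort, first, last)` shape of the question's `insertionSort`, with `last` inclusive; the class name `BinaryInsertionSort` is made up for the example. Binary search reduces the number of comparisons, but the elements still have to be shifted, so any gain on small ranges is best measured rather than assumed.

```java
import java.util.Arrays;

public class BinaryInsertionSort {
    // Sorts toSort[first..last] (both inclusive) in ascending order.
    public static void insertionSort(int[] toSort, int first, int last) {
        for (int nextVal = first + 1; nextVal <= last; nextVal++) {
            int toInsert = toSort[nextVal];
            // Binary search in the sorted prefix [first, nextVal) for the
            // first position whose element is greater than toInsert.
            int low = first;
            int high = nextVal;
            while (low < high) {
                int mid = (low + high) >>> 1;
                if (toSort[mid] <= toInsert) {
                    low = mid + 1;
                } else {
                    high = mid;
                }
            }
            // Shift toSort[low..nextVal-1] one slot to the right, then insert.
            System.arraycopy(toSort, low, toSort, low + 1, nextVal - low);
            toSort[low] = toInsert;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        insertionSort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```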