Merge Sorting In Place - java

I am working on a Merge Sort method which sorts elements in-place without allocating extra memory. However, it isn't working as of now and I was wondering if anyone could help me, as I understand it is a relatively simple operation. Thank you in advance.
My merge sort method:
RegisterClass[] runMergeSort()
{
RegisterClass[] mergeSorted = new RegisterClass[capacity];
for (int counter = 0; counter < size; counter++)
{
mergeSorted[counter] = registerArray[counter];
}
runMergeSortHelper(mergeSorted, 0, size - 1);
return mergeSorted;
}
My helper method, which uses recursion:
private void runMergeSortHelper(RegisterClass[] workingArray,
int lowIndex,
int highIndex)
{
int midIndex = (lowIndex + highIndex) / 2;
if (lowIndex < highIndex)
{
runMergeSortHelper(workingArray, lowIndex, midIndex);
runMergeSortHelper(workingArray, midIndex+1, highIndex);
runMerge(workingArray, lowIndex, midIndex, highIndex);
}
}
And finally, my Merge method, which SHOULD be putting everything into order, however, it only does this partially.
private void runMerge(RegisterClass[] workingArray,
int lowIndex,
int midIndex,
int highIndex)
{
int counterJay = midIndex;
int counterAye = lowIndex;
int counterKay = lowIndex - 1;
while (counterAye < midIndex && counterJay <= highIndex)
{
counterKay++;
if (workingArray[counterAye].compareTo(workingArray[counterJay]) <= -1)
{
counterAye++;
}
else
{
swapValues(workingArray, counterAye, counterJay);
counterJay++;
counterAye++;
}
}
while (counterAye < midIndex)
{
counterKay++;
swapValues(workingArray, counterAye, counterKay);
counterAye++;
}
while (counterJay <= highIndex)
{
counterKay++;
swapValues(workingArray, counterJay, counterKay);
counterJay++;
}
}
Any advice at all would much be appreciated. I've looked online but nothing seems to help. Please do not refer me to a solution which is NOT an in-place solution.

Swapping isn't going to work with the logic used by the merge function. When a swap occurs, the element that is swapped from the left side to the right side is now out of order, and will be less than (or equal to) all of the remaining elements on the left side.
Whenever an element on the right side is found to be less than an element on the left side, a right rotate of that part of the array is needed to put the right side element into place.
Without resorting to a more complicated implementation, a small optimization can be made by scanning the right side for k = number of leading elements less than the current element on the left side, then do a rotate right by k elements. For random data, this won't help much, but for reverse sorted data, it would help quite a bit.
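The rotate-based merge described above can be sketched roughly as follows. This is only an illustration, not the asker's code: it uses a plain int[] instead of RegisterClass, and the names RotationMergeSketch, rotateRight and reverse are made up for the example. The rotation is done with three reversals, so each merge needs no extra array.

```java
class RotationMergeSketch {
    // Sorts a[lo..hi] (inclusive); each merge uses O(1) extra memory.
    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);
        sort(a, mid + 1, hi);
        merge(a, lo, mid, hi);
    }

    // Merges the sorted runs a[lo..mid] and a[mid+1..hi] in place.
    static void merge(int[] a, int lo, int mid, int hi) {
        int left = lo;        // first unmerged element of the left run
        int right = mid + 1;  // first unmerged element of the right run
        while (left < right && right <= hi) {
            if (a[left] <= a[right]) {
                left++;       // a[left] is already in place
                continue;
            }
            // Count the k leading right-run elements smaller than a[left].
            int end = right;
            while (end <= hi && a[end] < a[left]) end++;
            int k = end - right;
            // Rotate a[left..end-1] right by k so those k elements come first.
            rotateRight(a, left, end - 1, k);
            left += k + 1;    // skip the k moved elements and the old a[left]
            right = end;
        }
    }

    // Rotates a[from..to] right by k positions using three reversals.
    static void rotateRight(int[] a, int from, int to, int k) {
        reverse(a, from, to);
        reverse(a, from, from + k - 1);
        reverse(a, from + k, to);
    }

    static void reverse(int[] a, int i, int j) {
        while (i < j) { int t = a[i]; a[i++] = a[j]; a[j--] = t; }
    }
}
```

Call it as RotationMergeSketch.sort(array, 0, array.length - 1). Keep in mind that the rotations make a single merge quadratic in the worst case, so this trades speed for memory.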

Related

How to improve a recursive sorting algorithm that is already O(n)?

I have this recursive sorting algorithm I'm using for an assignment, and my teacher said that there's an easy way to improve the running time of my algorithm... But I can't figure out what it is, at all. Unless I'm mistaken, the complexity for the algorithm is O(n)? I'm not sure since we didn't learn how to calculate the complexity of recursive methods in class. Here's the code:
public static void MyAlgorithm(int[] A, int n){
boolean done = true;
int j = 0;
while (j <= n - 2){
if (A[j] > A[j + 1]) {
swap(A,j,j+1);
done= false;
}
j++;
}
j = n - 1;
while (j >= 1){
if (A[j] < A[j - 1]) {
swap(A,j-1,j);
done=false;
}
j--;
}
if (!done)
MyAlgorithm(A, n);
else
return;
}
The only thing I can think of would be adding a if(done) return; after the first loop but it only saves the program from doing a few other operations. Oh and the swap method is basically just:
public static void swap(int[] arr, int pos1, int pos2){
int temp = arr[pos1];
arr[pos1] = arr[pos2];
arr[pos2] = temp;
}
Thank you in advance.
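To start, no comparison-based sorting algorithm can run in O(n) time in the worst case. Any sort that works by comparing elements needs on the order of n*log(n) comparisons in the worst case; only non-comparison sorts such as counting sort or radix sort can beat that, and only for restricted kinds of keys.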
The sort you appear to be using is something akin to the cocktail shaker sort, or bidirectional bubble sort. It runs in O(n^2) time. You should definitely research the methods you use and consider why you use them, and also learn how to properly classify things in big O notation.
I imagine your teacher means that each recursive call can work on a smaller range, e.g. MyAlgorithm(A, n-1). Notice how your first loop goes through the entire array? This means that the largest element is already in its final position when that loop exits. Similarly, the second loop leaves the smallest element at the front, so you can add a start index and increment it each time. For example, revised code:
public static void MyAlgorithm(int[] A, int start, int n){
boolean done = true;
int j = start;
while (j <= n - 2){
if (A[j] > A[j + 1]) {
swap(A,j,j+1);
done= false;
}
j++;
}
j = n - 1;
while (j >= start+1){
if (A[j] < A[j - 1]) {
swap(A,j-1,j);
done=false;
}
j--;
}
if (!done)
MyAlgorithm(A, start+1, n-1);
else
return;
}
Then you can call this using: MyAlgorithm(my_array, 0, my_array.length)
Keep in mind that this is still not a fantastic sorting algorithm (it remains O(n^2)), and if you ever need to sort large amounts of data, you should consider using something faster.
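Since the recursion only shrinks the range by one element at each end, the same idea can also be written as a loop, which avoids deep recursion on large arrays. A minimal sketch, assuming a plain int[] and using the made-up name CocktailShakerSketch:

```java
class CocktailShakerSketch {
    static void sort(int[] a) {
        int start = 0;           // everything before start is in final position
        int end = a.length - 1;  // everything after end is in final position
        boolean swapped = true;
        while (swapped && start < end) {
            swapped = false;
            // Forward pass: bubbles the largest remaining element to a[end].
            for (int j = start; j < end; j++) {
                if (a[j] > a[j + 1]) { swap(a, j, j + 1); swapped = true; }
            }
            end--;
            if (!swapped) break; // no swaps means the range is already sorted
            swapped = false;
            // Backward pass: bubbles the smallest remaining element to a[start].
            for (int j = end; j > start; j--) {
                if (a[j] < a[j - 1]) { swap(a, j - 1, j); swapped = true; }
            }
            start++;
        }
    }

    static void swap(int[] arr, int pos1, int pos2) {
        int temp = arr[pos1];
        arr[pos1] = arr[pos2];
        arr[pos2] = temp;
    }
}
```

This does the same comparisons as the revised recursive version, so it is still quadratic; it just keeps the bookkeeping in two local variables instead of the call stack.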

Creating quicksort without recursion and stack

I have a task to write a quicksort algorithm (for positive numbers only) in Java (I can't use any imports except Scanner), but without recursion and without a stack.
I have two questions about it:
1. I understand the iterative quicksort with a stack and the recursive version, but I cannot imagine how to do it without either. I have heard about an 'in place' implementation, but I don't really get it. Is it a solution for my problem? I would appreciate it if anyone could show me a way to do it (please don't post an implementation if you can avoid it; I just want to understand it, not copy someone's code) or recommend a book where I can find it (or a similar problem).
2. Is using insertion sort for small arrays a good idea? If so, how big should N be in this code:
if (arraySize < N)
insertionSort
else
quickSort
fi
Apparently my task was to sort only positive numbers. Here is my solution:
public static void quickSort(final int size) {
int l = 0;
int r = size - 1;
int q, i = 0;
int tmpr = r;
while (true) {
i--;
while (l < tmpr) {
q = partition(l, tmpr);
arr[tmpr] = -arr[tmpr];
tmpr = q - 1;
++i;
}
if (i < 0)
break;
l++;
tmpr = findNextR(l, size);
arr[tmpr] = -arr[tmpr];
}
}
private static int findNextR(final int l, final int size) {
for (int i = l; i < size; ++i) {
if (arr[i] < 0)
return i;
}
return size - 1;
}
private static int partition(int l, int r) {
long pivot = arr[(l + r) / 2];
while (l <= r) {
while (arr[r] > pivot)
r--;
while (arr[l] < pivot)
l++;
if (l <= r) {
long tmp = arr[r];
arr[r] = arr[l];
arr[l] = tmp;
l++;
r--;
}
}
return l;
}
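The array to sort, arr, is a static array in my class.
The method works by temporarily negating elements to mark the right ends of partitions that still need sorting, which is why the input must contain only positive numbers; these markers take the place of the stack.
The pivot is the middle element of the range, but a median also works well (it depends on the array).
I hope someone will find this useful.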
Just as a reference, the Java 8 implementation of Arrays.sort(int[]) uses a threshold of 47: any range shorter than that is sorted using insertion sort. Their quicksort implementation is, however, very complex with some initial overhead, so look upon 47 as an upper limit.
A Google of "non-recursive quicksort" produced a slew of answers ... including this one: Non recursive QuickSort "Your language may vary," but the basic principle won't.
I personally think that, if you're going to sort something, you might as well use Quicksort in all cases . . .
Unless, of course, you can simply use a sort() function in your favorite target-language and leave it to the language implementors to have chosen a clever algorithm (uhhhh, it's probably Quicksort...) for you. If you don't have to specify an algorithm to do such a common task, "don't!" :-)

heapsort working 99%

I was wondering if I could get some quick help with a heapsort implementation. I have it working and sorting fine, but in the output everything is sorted except the first number. It's probably just a check somewhere, but I have gone over my code and tried changing values and nothing produced the results I needed. Any advice on where I went wrong?
here is my source code:
code removed, problem was solved!
thanks guys!
private static void movedown(double [] a, int k, int c) {
while (2*k <= c-1) {
int j = 2*k+1;
if (j <= c-1 && less(a[j], a[j+1])) j++;
if (!less(a[k], a[j])) break;
exch(a, k, j);
k = j;
}
}
public static void heapsort(double [] a, int count) {
for (int k = count/2; k >= 0; k--)
movedown(a, k, count);
while (count >= 1) {
exch(a, 0, count--);
movedown(a, 0, count);
}
}
I have fixed your bug and tested it on my machine. It should work. There are just a couple of minor changes in these two methods.
To summarize what you didn't get right:
In the heapsort method, the count you pass in is a zero-based index (the index of the last element). However, when you built the heap you only looped down to k = 1, i.e., there was one more iteration to go (k = 0).
In the movedown method, with zero-based indexing the left child index is 2*k+1 and the right child index is 2*k+2.
I guess the bug came from mixing the two indexing conventions (0-based vs. 1-based).
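For comparison, here is a sketch that sticks to 0-based indexing throughout. Note that it differs from the code above in one deliberate way: it works with the number of elements in the heap rather than the index of the last element. The names HeapSortSketch and siftDown are only for illustration.

```java
class HeapSortSketch {
    static void heapsort(double[] a) {
        int n = a.length;
        // Build a max-heap; the last node that has a child is at n/2 - 1.
        for (int k = n / 2 - 1; k >= 0; k--) {
            siftDown(a, k, n);
        }
        // Repeatedly move the maximum to the end and shrink the heap by one.
        for (int end = n - 1; end > 0; end--) {
            exch(a, 0, end);
            siftDown(a, 0, end); // the heap now occupies a[0..end-1]
        }
    }

    // Restores the heap property below k; the heap occupies a[0..size-1].
    static void siftDown(double[] a, int k, int size) {
        while (2 * k + 1 < size) {                     // left child exists
            int j = 2 * k + 1;                         // left child
            if (j + 1 < size && a[j] < a[j + 1]) j++;  // pick the larger child
            if (!(a[k] < a[j])) break;                 // parent already largest
            exch(a, k, j);
            k = j;
        }
    }

    static void exch(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

With a size instead of a last index, every bound check reads "child index < size", which makes it harder to be off by one at the root or at the last leaf.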

QuickSort vs MergeSort, what am I doing wrong?

I am trying to implement several sorting algorithms in Java, to compare the performances. From what I've read, I was expecting quickSort to be faster than mergeSort, but on my code it is not, so I assume there must be a problem with my quickSort algorithm:
public class quickSortExample{
public static void main(String[] args){
Random gen = new Random();
int n = 1000000;
int max = 1500000;
ArrayList<Integer> d = new ArrayList<Integer>();
for(int i = 0; i < n; i++){
d.add(gen.nextInt(max));
}
ArrayList<Integer> r;
long start, end;
start = System.currentTimeMillis();
r = quickSort(d);
end = System.currentTimeMillis();
System.out.println("QuickSort:");
System.out.println("Time: " + (end-start));
//System.out.println(display(d));
//System.out.println(display(r));
}
public static ArrayList<Integer> quickSort(ArrayList<Integer> data){
if(data.size() > 1){
int pivotIndex = getPivotIndex(data);
int pivot = data.get(pivotIndex);
data.remove(pivotIndex);
ArrayList<Integer> smallers = new ArrayList<Integer>();
ArrayList<Integer> largers = new ArrayList<Integer>();
for(int i = 0; i < data.size(); i++){
if(data.get(i) <= pivot){
smallers.add(data.get(i));
}else{
largers.add(data.get(i));
}
}
smallers = quickSort(smallers);
largers = quickSort(largers);
return concat(smallers, pivot, largers);
}else{
return data;
}
}
public static int getPivotIndex(ArrayList<Integer> d){
return (int)Math.floor(d.size()/2.0);
}
public static ArrayList<Integer> concat(ArrayList<Integer> s, int p, ArrayList<Integer> l){
ArrayList<Integer> arr = new ArrayList<Integer>(s);
arr.add(p);
arr.addAll(l);
return arr;
}
public static String display(ArrayList<Integer> data){
String s = "[";
for(int i=0; i < data.size(); i++){
s += data.get(i) + ", ";
}
return (s+"]");
}
}
Results (on 1 million integer between 0 and 1500000):
mergeSort (implemented with arrayList too): 1.3sec (average) (0.7sec with int[] instead)
quickSort: 3sec (average)
Is it just my choice of pivot that is bad, or are there flaws in the algorithm too?
Also, is there a faster way to code it with int[] instead of ArrayList()? (How do you declare the size of the array for largers/smallers arrays?)
PS: I know it is possible to implement it in an in-place manner so it uses less memory, but this is not the point of this.
EDIT 1: I gained 1 sec by changing the concat method.
Thanks!
PS: I know it is possible to implement it in an in-place manner so it uses less memory, but this is not the point of this.
It's not just to use less memory. All that extra work you do in the "concat" routine instead of doing a proper in-place QuickSort is almost certainly what's costing so much. If you can use extra space anyway, you should always code up a merge sort because it'll tend to do fewer comparisons than a QuickSort will.
Think about it: in "concat()" you inevitably have to make another pass over the sub-lists, copying every element again at every level of the recursion. If you did the interchange in-place, all in a single array, then once you've made the decision to interchange two places, you don't make the decision again.
I think the major problem with your quicksort, like you say, is that it's not done in place.
The two main culprits are smallers and largers. The default capacity of an ArrayList is 10. In the initial call to quickSort a good pivot will mean that smallers and largers each grow to about 500,000 elements. Since an ArrayList only grows when it reaches capacity (by a factor of about 1.5 in OpenJDK, not by doubling), each will have to be resized around 27 times.
Since you are making a new smallers and largers with each level of recursion, you're going to be performing approximately 2*(27+26+...+2+1) resizes. That's around 750 resizes the ArrayList objects have to perform before they are even concatenated. The concatenation process will probably perform a similar number of resizes.
All in all, this is a lot of extra work.
Oops, just noticed data.remove(pivotIndex). The chosen pivot index (the middle of the array) is also going to cause additional memory operations (even though the middle is usually a better choice than the beginning or end of the array). That is, the ArrayList will copy the entire block of memory to the 'right' of the pivot one step to the left in the backing array.
A quick note on the chosen pivot: since the integers you are sorting are evenly distributed between 0 and max (if Random lives up to its name), you can use this to choose good pivots. That is, the first level of quick sort should choose max*0.5 as its pivot. The second level with smallers should choose max*0.25 and the second level with largers should choose max*0.75 (and so on).
I think that your algorithm is quite inefficient because you're using intermediate arrays, which means more memory plus more time for allocation and copying. Here is the code in C++, but the idea is the same: you have to swap the items in place, not copy them to other arrays. (Note that N here is the index of the last element, not the element count.)
template<class T> void quickSortR(T* a, long N) {
long i = 0, j = N;
T temp, p;
p = a[ N/2 ];
do {
while ( a[i] < p ) i++;
while ( a[j] > p ) j--;
if (i <= j) {
temp = a[i]; a[i] = a[j]; a[j] = temp;
i++; j--;
}
} while ( i<=j );
if ( j > 0 ) quickSortR(a, j);
if ( N > i ) quickSortR(a+i, N-i);
}
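Since the question is about Java and asks how to do this with int[], here is roughly what the same partition scheme could look like there. This is a sketch rather than a drop-in replacement for the ArrayList version; the class name InPlaceQuickSortSketch is made up, and lo and hi are inclusive indices.

```java
class InPlaceQuickSortSketch {
    static void quickSort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    // Sorts a[lo..hi] (inclusive) without allocating any extra arrays.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = a[lo + (hi - lo) / 2]; // middle element as the pivot value
        int i = lo;
        int j = hi;
        while (i <= j) {
            while (a[i] < p) i++;      // find an element that belongs on the right
            while (a[j] > p) j--;      // find an element that belongs on the left
            if (i <= j) {
                int t = a[i];
                a[i] = a[j];
                a[j] = t;
                i++;
                j--;
            }
        }
        // Now a[lo..j] <= p and a[i..hi] >= p; sort both parts.
        if (lo < j) quickSort(a, lo, j);
        if (i < hi) quickSort(a, i, hi);
    }
}
```

No sizes have to be declared for smallers and largers at all, because both parts live inside the original array and are identified only by their index ranges.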
Fundamentals of OOP and data structures in Java By Richard Wiener, Lewis J. Pinson lists quicksort as following, which may or may not be faster (I suspect it is) than your implementation...
public static void quickSort (Comparable[] data, int low, int high) {
int partitionIndex;
if (high - low > 0) {
partitionIndex = partition(data, low, high);
quickSort(data, low, partitionIndex - 1);
quickSort(data, partitionIndex + 1, high);
}
}
private static int partition (Comparable[] data, int low, int high) {
int k, j;
Comparable temp, p;
p = data[low]; // Partition element
// Find partition index(j).
k = low;
j = high + 1;
do {
k++;
} while (data[k].compareTo(p) <= 0 && k < high);
do {
j--;
} while (data[j].compareTo(p) > 0);
while (k < j) {
temp = data[k];
data[k] = data[j];
data[j] = temp;
do {
k++;
} while (data[k].compareTo(p) <= 0);
do {
j--;
} while (data[j].compareTo(p) > 0);
}
// Move partition element(p) to partition index(j).
if (low != j) {
temp = data[low];
data[low] = data[j];
data[j] = temp;
}
return j; // Partition index
}
I agree that the reason is unnecessary copying. Some more notes follow.
The choice of pivot index is bad, but it's not an issue here, because your numbers are random.
(int)Math.floor(d.size()/2.0) is equivalent to d.size()/2.
data.remove(pivotIndex); is unnecessary copying of n/2 elements. Instead, you should check in the following loop whether i == pivotIndex and skip this element. (Well, what you really need to do is inplace sort, but I'm just suggesting straightforward improvements.)
Putting all elements that are equal to pivot in the same ('smaller') part is a bad idea. Imagine what happens when all elements of the array are equal. (Again, not an issue in this case.)
for(i = 0; i < s.size(); i++){
arr.add(s.get(i));
}
is equivalent to arr.addAll(s). And of course, unnecessary copying here again. You could just add all elements from the right part to the left one instead of creating new list.
(How do you declare the size of the array for largers/smallers arrays?)
I'm not sure if I got you right, but do you want array.length?
So, I think that even without implementing in-place sort you can significantly improve performance.
Technically, Mergesort has a better time-behavior ( Θ(nlogn) worst and average cases ) than Quicksort ( Θ(n^2) worst case, Θ(nlogn) average case). So it is quite possible to find inputs for which Mergesort outperforms Quicksort. Depending on how you pick your pivots, you can make the worst-case rare. But for a simple version of Quicksort, the "worst case" will be sorted (or nearly sorted) data, which can be a rather common input.
Here's what Wikipedia says about the two:
On typical modern architectures, efficient quicksort implementations generally outperform mergesort for sorting RAM-based arrays. On the other hand, merge sort is a stable sort, parallelizes better, and is more efficient at handling slow-to-access sequential media.[citation needed] Merge sort is often the best choice for sorting a linked list: in this situation it is relatively easy to implement a merge sort in such a way that it requires only Θ(1) extra space, and the slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.

Quicksort with insertion Sort finish - where am I going wrong?

I am working on a project for a class. We are to write a quicksort that transitions to an insertion sort at a specified partition size. That's no problem; where I am now having difficulty is figuring out why I am not getting the performance I expect.
One of the requirements is that it must sort an array of 5,000,000 ints in under 1,300 ms (this is on standard machines, so CPU speed is not an issue). First of all, I can't get it to work on 5,000,000 because of a stack overflow error (too many recursive calls...). Even if I increase the heap size, it is still a lot slower than that.
Below is the code. Any hints anyone?
Thanks in advance
public class MyQuickSort {
public static void sort(int [] toSort, int moveToInsertion)
{
sort(toSort, 0, toSort.length - 1, moveToInsertion);
}
private static void sort(int[] toSort, int first, int last, int moveToInsertion)
{
if (first < last)
{
if ((last - first) < moveToInsertion)
{
insertionSort(toSort, first, last);
}
else
{
int split = quickHelper(toSort, first, last);
sort(toSort, first, split - 1, moveToInsertion);
sort(toSort, split + 1, last, moveToInsertion);
}
}
}
private static int quickHelper(int[] toSort, int first, int last)
{
sortPivot(toSort, first, last);
swap(toSort, first, first + (last - first)/2);
int left = first;
int right = last;
int pivotVal = toSort[first];
do
{
while ( (left < last) && (toSort[left] <= pivotVal))
{
left++;
}
while (toSort[right] > pivotVal)
{
right--;
}
if (left < right)
{
swap(toSort, left, right);
}
} while (left < right);
swap(toSort, first, right);
return right;
}
private static void sortPivot(int[] toSort, int first, int last)
{
int middle = first + (last - first)/2;
if (toSort[middle] < toSort[first]) swap(toSort, first, middle);
if (toSort[last] < toSort[middle]) swap(toSort, middle, last);
if (toSort[middle] < toSort[first]) swap(toSort, first, middle);
}
private static void insertionSort(int [] toSort, int first, int last)
{
for (int nextVal = first + 1; nextVal <= last; nextVal++)
{
int toInsert = toSort[nextVal];
int j = nextVal - 1;
while (j >= 0 && toInsert < toSort[j])
{
toSort[j + 1] = toSort[j];
j--;
}
toSort[j + 1] = toInsert;
}
}
private static void swap(int[] toSort, int i, int j)
{
int temp = toSort[i];
toSort[i] = toSort[j];
toSort[j] = temp;
}
}
I haven't tested this with your algorithm, and I don't know what kind of data set you're running with, but consider choosing a better pivot than the leftmost element. From Wikipedia on Quicksort:
Choice of pivot: In very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use-case. The problem was easily solved by choosing either a random index for the pivot, choosing the middle index of the partition or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot.
Figured it out.
Actually, it was not my sort's fault at all. I was generating numbers in the range 0-100 (for testing, to make sure the output was sorted). This resulted in tons of duplicates, which meant way too many partitions. Changing the range to min_int and max_int made it go a lot quicker.
Thanks for your help though :D
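For anyone who does have to sort data with many duplicates: one common remedy, which is not what the code above does, is a three-way partition that groups all elements equal to the pivot in the middle and never recurses into them. A minimal sketch, with the made-up name ThreeWayQuickSortSketch and no insertion sort cutoff:

```java
class ThreeWayQuickSortSketch {
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sorts a[lo..hi] (inclusive) using a three-way partition.
    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo + (hi - lo) / 2];
        int lt = lo; // a[lo..lt-1] is less than pivot
        int i = lo;  // a[lt..i-1] is equal to pivot
        int gt = hi; // a[gt+1..hi] is greater than pivot
        while (i <= gt) {
            if (a[i] < pivot) {
                swap(a, lt++, i++);
            } else if (a[i] > pivot) {
                swap(a, i, gt--); // do not advance i: the new a[i] is unexamined
            } else {
                i++;
            }
        }
        sort(a, lo, lt - 1);  // strictly smaller elements
        sort(a, gt + 1, hi);  // strictly larger elements
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

With values limited to 0-100 there are at most 101 distinct pivots, so the recursion stays shallow no matter how long the array is.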
When the input array is large, it's natural to expect that recursive functions can run into stack overflow issues, which is what is happening here when you try the above code. I would recommend writing an iterative quicksort using your own stack. It should be fast because no stack frames are allocated and deallocated at run time, and you won't run into stack overflow issues either. Performance also depends on the point at which you switch to insertion sort. I don't have a particular input size at which insertion sort starts to perform badly compared to quicksort. I would suggest trying different sizes, and I'm sure you will notice a difference.
You might also want to use binary search in insertion sort to improve performance (it reduces the number of comparisons, though not the number of element moves). I don't know how much it helps on smaller inputs, but it's a nice trick to play.
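The binary search variant of insertion sort could look something like this. It is a sketch on a plain int[] range, with the made-up name BinaryInsertionSortSketch, and first and last are inclusive indices as in the question's insertionSort.

```java
class BinaryInsertionSortSketch {
    // Sorts a[first..last] (inclusive).
    static void sort(int[] a, int first, int last) {
        for (int next = first + 1; next <= last; next++) {
            int toInsert = a[next];
            // Binary search in the sorted prefix a[first..next-1] for the
            // first position whose element is greater than toInsert.
            int lo = first;
            int hi = next;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (a[mid] <= toInsert) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            // Shift a[lo..next-1] one slot to the right and insert the value.
            System.arraycopy(a, lo, a, lo + 1, next - lo);
            a[lo] = toInsert;
        }
    }
}
```

Searching for the first element strictly greater than toInsert keeps equal elements in their original order, which matters if this is later adapted to objects.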
I don't want to share code because that doesn't make you learn how to convert recursive quicksort to iterative one. If you have problems in converting to iterative one let me know.
