Heapify method is not working properly for max heap - java

I am preparing for a data structures and algorithms final exam. I am trying to work through all of the data structures we have learnt this semester and program them myself to help prepare for the final. I am working on the max heap right now, which includes an insert (with a heapify) and a retrieve-max. I am stuck on the heapifying/swapping of the parent and the child. It seems that the heapify is not working, as I get back an array in the order in which the numbers were inserted. Here is what I have so far.
private int getLeftChildIndex(int index)
{
return (2*index + 1);
}
private int getLeftChildValue(int index)
{
return heap[2*index + 1];
}
private int getRightChildIndex(int index)
{
return (2*index + 2);
}
private int getRightChildValue(int index)
{
return heap[2*index + 2];
}
private int getParentIndex(int index)
{
return ((int) Math.ceil((index - 2)/2));
}
private void swap(int child, int parent)
{
int temp = heap[parent];
heap[parent] = heap[child];
heap[child] = temp;
}
private void insert(int num)
{
heap[heapSize] = num;
heapSize++;
int index = heapSize - 1;
while (getParentIndex(index) > 0 && heap[index] > heap[getParentIndex(index)])
{
swap(index, getParentIndex(index));
index = getParentIndex(index);
}
}
public static void main(String[] args)
{
HeapTest heap = new HeapTest();
heap.insert(15);
heap.insert(5);
heap.insert(10);
heap.insert(30);
}
This just gives an array of the form [15,5,10,30], for example. For a max heap, the array should be of the form [30,15,10,5].
I'm expecting this: [15,5,10,30] -> [15,30,10,5] -> [30,15,10,5]
Could anyone provide some insight as to why the heapify part of insert is not working?
Thanks!!

There are a couple of issues I can see.
Consider what Math.ceil((index - 2)/2) returns for index 3. It returns 0 rather than 1, because the /2 is an integer division that truncates before Math.ceil ever sees the value. Changing it to /2.0 will fix it. Simpler still is to use integer arithmetic with (index - 1) / 2, which is clearer and more efficient.
getParentIndex(index) > 0 means that you are ignoring the root node at index 0: currently your code will never swap that item. Changing > to >= will fix it.
There could well be other issues, but those are the two I can see by inspection. Either unit testing or interactive debugging would have uncovered them.
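Putting both fixes together, a minimal sketch might look like the following. The class name MaxHeapSketch, the fixed capacity of 16 and the toArray() helper are assumptions made for illustration, since the question does not show the fields of HeapTest. The loop here tests index > 0 rather than getParentIndex(index) >= 0; both stop at the root, this form just makes that explicit.

```java
import java.util.Arrays;

// Hypothetical stand-in for the asker's HeapTest; the class name, the
// capacity and toArray() are assumptions made for this sketch.
class MaxHeapSketch {
    private final int[] heap = new int[16];
    private int heapSize = 0;

    // Plain integer division gives the parent for every index >= 1.
    private static int getParentIndex(int index) {
        return (index - 1) / 2;
    }

    private void swap(int child, int parent) {
        int temp = heap[parent];
        heap[parent] = heap[child];
        heap[child] = temp;
    }

    void insert(int num) {
        if (heapSize == heap.length) {
            throw new IllegalStateException("heap is full");
        }
        heap[heapSize] = num;
        int index = heapSize++;
        // Sift up: stop at the root (index 0) or once the parent is not smaller.
        while (index > 0 && heap[index] > heap[getParentIndex(index)]) {
            int parent = getParentIndex(index);
            swap(index, parent);
            index = parent;
        }
    }

    int[] toArray() {
        return Arrays.copyOf(heap, heapSize);
    }

    public static void main(String[] args) {
        MaxHeapSketch h = new MaxHeapSketch();
        for (int n : new int[] {15, 5, 10, 30}) {
            h.insert(n);
        }
        System.out.println(Arrays.toString(h.toArray())); // [30, 15, 10, 5]
    }
}
```

With the inserts from the question, 30 first swaps with 5 at index 1 and then with 15 at the root, which is the sequence the asker expected.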

Related

Issues with Binary Search Algorithm

I have an assignment to write a binary search that returns the first occurrence of the value we are looking for. I've been doing some research online and my search looks a lot like what I'm finding, but I'm having an issue. If I pass this code an array that looks like {10,5,5,3,2}, it finds the 5 in the middle (the first thing it checks) and then just returns it. But that is not the first occurrence of the 5, it is the second. What am I doing wrong? Is this even possible?
Thanks in advance!
The code (I'm using Java):
public static int binarySearch(int[] arr, int v){
int lo = 0;
int hi = arr.length-1;
while(lo <= hi){
int middle = (lo+hi)/2;
if(v == arr[middle]){
return middle;
}
else
{
if(v < arr[middle]){
lo = middle+1;
}
else
{
hi = middle-1;
}
}
}
return -1;
}
Here is a modified algorithm that works.
public static int binarySearch(int[] arr, int v) {
int lo = -1;
int hi = arr.length - 1;
while (hi - lo > 1 ) {
int middle = (lo + hi) / 2;
if (arr[middle] > v) {
lo = middle;
} else {
hi = middle;
}
}
if (v == arr[hi]) {
return hi;
} else {
return -1;
}
}
The key points are:
The interval (lo, hi] is exclusive to the left, inclusive to the right.
At each step we throw away one half of the interval. We stop when we are down to one element. Attempts to terminate early offer only a minimal performance boost, while they often affect code legibility and/or introduce bugs.
When arr[middle] = v we assign hi = middle, thus throwing away the right half. This is safe to do because we don't care about any occurrences of v past middle. We do care about arr[middle], which may or may not be the first occurrence, and it is for this reason that we made (lo, hi] inclusive to the right. If there are occurrences of v before middle, we will find them in subsequent iterations.
As a side note, the more natural definition [0, n) inclusive to the left, exclusive to the right, can be used to find the last occurrence of v.
In my experience, this inclusive-exclusive interval definition produces the shortest, clearest and most versatile code. People keep trying to improve on it, but they often get tangled up in corner cases.
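The side note about using an interval that is inclusive to the left and exclusive to the right to find the last occurrence can be sketched roughly as follows. It assumes an array sorted in descending order, as in the question; the class name, the method name lastOccurrence and the empty-array guard are additions for this sketch and not part of the original answer.

```java
class LastOccurrenceSketch {
    // Returns the index of the last occurrence of v in an array sorted in
    // descending order, or -1 if v is absent.
    static int lastOccurrence(int[] arr, int v) {
        if (arr.length == 0) {
            return -1; // guard added for this sketch
        }
        int lo = 0;          // inclusive
        int hi = arr.length; // exclusive
        while (hi - lo > 1) {
            int middle = lo + (hi - lo) / 2;
            if (arr[middle] < v) {
                hi = middle; // middle and everything after it is too small
            } else {
                lo = middle; // arr[middle] >= v: any last match is at or after middle
            }
        }
        return arr[lo] == v ? lo : -1;
    }

    public static void main(String[] args) {
        int[] data = {10, 5, 5, 3, 2};
        System.out.println(lastOccurrence(data, 5)); // 2
        System.out.println(lastOccurrence(data, 4)); // -1
    }
}
```

As in the answer's version, the loop never exits early on a match; it only shrinks the interval until one candidate is left and then checks it once.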

Different ways of using recursion in Java

I'm thinking of several elegant ways of writing a simple Lisp-like recursive function in Java that does, let's say, a simple summation.
In Common Lisp it would be like this:
(defun summation(l)
(if l
(+ (car l) (summation (cdr l)))
0))
(summation '(1 2 3 4 5)) ==> 15
In Java, one of many possible solutions would be:
public int summation(int[] array, int n) {
return (n == 0)
? array[0]
: array[n] + summation(array, n - 1);
}
CALL:
summation(new int[]{1,2,3,4,5}, 4); //15
1) Is there any possible way NOT to use the index n?
2) Or leave your own (non-iterative) solution that you find interesting.
Thanks.
Using Java Collections - something like this should give you an idea of how to eliminate n and recurse in terms of the list size instead:
public int summation( List<Integer> list ) {
return list.isEmpty()
? 0
: list.get( list.size() - 1 ) + summation( list.subList( 0 , list.size() - 1 ) );
}
Cheers,
Usually, I solve this kind of recursion with a public API that does not require the index parameter and a private API with any signature I'd like it to have. For this I would separate it this way:
public int summation(int[] numbers) {
return summation(numbers, numbers.length - 1);
}
private int summation(int[] numbers, int till) {
return (till < 0) ? 0 : numbers[till] + summation(numbers, till - 1);
}
Note that you must check till < 0 as this handles an empty array correctly.
Another way would be to not use an array, but any Iterable<Integer>:
public int summation(Iterable<Integer> numbers) {
return summation(numbers.iterator());
}
private int summation(Iterator<Integer> numbers) {
return (numbers.hasNext()) ? numbers.next() + summation(numbers) : 0;
}
Hint: The order of calls in numbers.next() + summation(numbers) is important, as the next() call must be done first.
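A small runnable sketch of the Iterable variant is shown below; the class name IteratorSumSketch and the main method are assumptions added for illustration. Java evaluates the operands of + from left to right, so numbers.next() consumes an element before the recursive call runs. Writing the operands the other way round would recurse without ever advancing the iterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorSumSketch {
    // Public entry point: no index parameter needed.
    static int summation(Iterable<Integer> numbers) {
        return summation(numbers.iterator());
    }

    private static int summation(Iterator<Integer> numbers) {
        // next() is the left operand, so it is evaluated before the recursive call.
        return numbers.hasNext() ? numbers.next() + summation(numbers) : 0;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(summation(values)); // 15
    }
}
```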
If you use the List.subList method, it may perform iteration underneath. You can use a Queue instead to avoid that. For example:
// Note: this empties the queue as it sums.
public int sum(Queue<Integer> queue) {
return queue.isEmpty() ? 0 : (queue.poll() + sum(queue));
}
public class HelloWorld{
static int sum=0;
static int c;
public static void main(String []args){
int[] y={1,2,3,4,5};
c=y.length;
System.out.println( summation(y)); //15
}
public static int summation(int[] array) {
c--;
if(c<0){
return sum;
}
else{
sum+=array[c];
return summation(array);
}
}
}
Here's a simple method that seems pretty close to what's being asked for. Basically, we are taking a recursive approach to performing summation as contrasted with brute force from the bottom up.
public static int sumToN(int n) {
if( n == 0 ){
return 0;
}
return n + sumToN(n - 1);
}

lo,hi indices in recursive binarysearch in java

For a classic binarySearch on an array of Java Strings (say String[] a), which is the correct way of calling the search method? Is it
binarySearch(a,key,0,a.length)
or
binarySearch(a,key,0,a.length-1)
I tried both with the implementation below, and both seem to work. Is there a use case where either of these calls can fail?
class BS{
public static int binarySearch(String[] a,String key){
return binarySearch(a,key,0,a.length);
//return binarySearch(a,key,0,a.length-1);
}
public static int binarySearch(String[] a,String key,int lo,int hi) {
if(lo > hi){
return -1;
}
int mid = lo + (hi - lo)/2;
if(less(key,a[mid])){
return binarySearch(a,key,lo,mid-1);
}
else if(less(a[mid],key)){
return binarySearch(a,key,mid+1,hi);
}
else{
return mid;
}
}
private static boolean less(String x,String y){
return x.compareTo(y) < 0;
}
public static void main(String[] args) {
String[] a = {"D","E","F","M","K","I"};
Arrays.sort(a);
System.out.println(Arrays.toString(a));
int x = binarySearch(a,"M");
System.out.println("found at :"+x);
}
}
Consider the case where
a = [ "foo" ]
and you search for the key "zoo" with binarySearch(a,key,0,a.length);
The code will search the interval [0,1] and find that the key belongs to the right of mid.
The next recursion searches the interval [1,1], which indexes a[1] at the line
if(less(key,a[mid])){
resulting in an array index out of bounds error.
The second solution will work fine.
I think the second approach will be safe.
Consider this case: you have an array of 9 elements (valid indices 0 to 8) and the key is greater than every element. If you follow the first approach, the recursion can end up making a method call like this:
binarySearch(a, key, 9, 9);
Now, in that method execution, the following line will evaluate to 9:
int mid = 9 + (9 - 9)/2;
and you will be indexing your array with 9 in the next line:
if( less(key,a[mid]) ) { // You'll face ArrayIndexOutOfBoundsException
....
}
which is invalid and causes an ArrayIndexOutOfBoundsException.
The second approach however will be just fine.
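If calling with a.length is preferred, the method itself has to treat hi as exclusive throughout. The following is a minimal sketch under that assumption; the class name HalfOpenSearchSketch is hypothetical and the less() helper from the question is replaced by a direct compareTo call.

```java
import java.util.Arrays;

class HalfOpenSearchSketch {
    static int binarySearch(String[] a, String key) {
        return binarySearch(a, key, 0, a.length); // hi is exclusive here
    }

    // Searches the half-open range [lo, hi) of a sorted array.
    static int binarySearch(String[] a, String key, int lo, int hi) {
        if (lo >= hi) {
            return -1; // empty range
        }
        int mid = lo + (hi - lo) / 2;
        int cmp = key.compareTo(a[mid]);
        if (cmp < 0) {
            return binarySearch(a, key, lo, mid);     // mid itself is excluded
        } else if (cmp > 0) {
            return binarySearch(a, key, mid + 1, hi);
        }
        return mid;
    }

    public static void main(String[] args) {
        String[] a = {"D", "E", "F", "M", "K", "I"};
        Arrays.sort(a);
        System.out.println("found at :" + binarySearch(a, "M"));
        System.out.println("found at :" + binarySearch(a, "Z"));
    }
}
```

The point is consistency: either hi is inclusive everywhere (lo > hi terminates, recurse on mid-1) or exclusive everywhere (lo >= hi terminates, recurse on mid). Mixing the two conventions is what produces the out-of-bounds access described above.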

Quicksort with insertion Sort finish - where am I going wrong?

I am working on a project for a class. We are to write a quicksort that transitions to an insertion sort at the specified value. That's no problem; where I am having difficulty is figuring out why I am not getting the performance I expect.
One of the requirements is that it must sort an array of 5,000,000 ints in under 1,300 ms (this is on standard machines, so CPU speed is not an issue). First of all, I can't get it to work on 5,000,000 because of a stack overflow error (too many recursive calls...). If I increase the heap size, it is still a lot slower than that.
Below is the code. Any hints anyone?
Thanks in advance
public class MyQuickSort {
public static void sort(int [] toSort, int moveToInsertion)
{
sort(toSort, 0, toSort.length - 1, moveToInsertion);
}
private static void sort(int[] toSort, int first, int last, int moveToInsertion)
{
if (first < last)
{
if ((last - first) < moveToInsertion)
{
insertionSort(toSort, first, last);
}
else
{
int split = quickHelper(toSort, first, last);
sort(toSort, first, split - 1, moveToInsertion);
sort(toSort, split + 1, last, moveToInsertion);
}
}
}
private static int quickHelper(int[] toSort, int first, int last)
{
sortPivot(toSort, first, last);
swap(toSort, first, first + (last - first)/2);
int left = first;
int right = last;
int pivotVal = toSort[first];
do
{
while ( (left < last) && (toSort[left] <= pivotVal))
{
left++;
}
while (toSort[right] > pivotVal)
{
right--;
}
if (left < right)
{
swap(toSort, left, right);
}
} while (left < right);
swap(toSort, first, right);
return right;
}
private static void sortPivot(int[] toSort, int first, int last)
{
int middle = first + (last - first)/2;
if (toSort[middle] < toSort[first]) swap(toSort, first, middle);
if (toSort[last] < toSort[middle]) swap(toSort, middle, last);
if (toSort[middle] < toSort[first]) swap(toSort, first, middle);
}
private static void insertionSort(int [] toSort, int first, int last)
{
for (int nextVal = first + 1; nextVal <= last; nextVal++)
{
int toInsert = toSort[nextVal];
int j = nextVal - 1;
while (j >= 0 && toInsert < toSort[j])
{
toSort[j + 1] = toSort[j];
j--;
}
toSort[j + 1] = toInsert;
}
}
private static void swap(int[] toSort, int i, int j)
{
int temp = toSort[i];
toSort[i] = toSort[j];
toSort[j] = temp;
}
}
I haven't tested this with your algorithm, and I don't know what kind of data set you're running with, but consider choosing a better pivot than the leftmost element. From Wikipedia on Quicksort:
Choice of pivot: In very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use-case. The problem was easily solved by choosing either a random index for the pivot, choosing the middle index of the partition or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot
Figured it out.
Actually, it was not my sort's fault at all. I was generating numbers in the range 0-100 (for testing, to make sure it was sorted). This resulted in tons of duplicates, which meant way too many partitions. Changing the range to min_int and max_int made it go a lot quicker.
Thanks for your help though :D
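For inputs with many duplicate keys, one common remedy is a three-way ("Dutch national flag") partition, which gathers every element equal to the pivot in the middle so that those elements are never recursed into again. This is a different partition scheme from the one in the question and is shown only as a sketch: the class name is hypothetical, the pivot is simply the middle element, and the insertion-sort cutoff is left out for brevity.

```java
import java.util.Arrays;
import java.util.Random;

class ThreeWayQuickSortSketch {
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int first, int last) {
        if (first >= last) {
            return;
        }
        int pivot = a[first + (last - first) / 2];
        int lt = first; // a[first..lt-1] <  pivot
        int gt = last;  // a[gt+1..last]  >  pivot
        int i = first;  // a[lt..i-1]     == pivot
        while (i <= gt) {
            if (a[i] < pivot) {
                swap(a, lt++, i++);
            } else if (a[i] > pivot) {
                swap(a, i, gt--);
            } else {
                i++;
            }
        }
        // Everything in a[lt..gt] equals the pivot and is already in place.
        sort(a, first, lt - 1);
        sort(a, gt + 1, last);
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        int[] data = new int[1_000_000];
        for (int k = 0; k < data.length; k++) {
            data[k] = rnd.nextInt(101); // many duplicates, as in the 0-100 test
        }
        sort(data);
        System.out.println(data[0] + " .. " + data[data.length - 1]);
    }
}
```

With only 101 distinct values, each level of recursion retires at least one distinct value, so the recursion depth stays small regardless of the array length.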
When the input array is large, it's natural to expect recursive functions to run into stack overflow issues, which is what happens here when you run the above code. I would recommend writing an iterative quicksort using your own stack. It should be fast because no stack frames are allocated or deallocated at run time, and you won't run into stack overflow issues either. Performance also depends on the point at which you switch to insertion sort. I don't have a particular input size at which insertion sort performs badly compared to quicksort. I would suggest trying different sizes; I'm sure you will notice a difference.
You might also want to use binary search in the insertion sort to improve performance. I don't know how much it helps on smaller inputs, but it's a nice trick to play.
I don't want to share code because that wouldn't help you learn how to convert a recursive quicksort into an iterative one. If you have problems converting it, let me know.

Project Euler (P14): recursion problems

Hi, I'm doing the Collatz sequence problem in Project Euler (problem 14). My code works with numbers below 100000, but with bigger numbers I get a stack overflow error.
Is there a way I can refactor the code to use tail recursion, or otherwise prevent the stack overflow? The code is below:
import java.util.*;
public class v4
{
// use a HashMap to store computed number, and chain size
static HashMap<Integer, Integer> hm = new HashMap<Integer, Integer>();
public static void main(String[] args)
{
hm.put(1, 1);
final int CEILING_MAX=Integer.parseInt(args[0]);
int len=1;
int max_count=1;
int max_seed=1;
for(int i=2; i<CEILING_MAX; i++)
{
len = seqCount(i);
if(len > max_count)
{
max_count = len;
max_seed = i;
}
}
System.out.println(max_seed+"\t"+max_count);
}
// find the size of the hailstone sequence for N
public static int seqCount(int n)
{
if(hm.get(n) != null)
{
return hm.get(n);
}
if(n ==1)
{
return 1;
}
else
{
int length = 1 + seqCount(nextSeq(n));
hm.put(n, length);
return length;
}
}
// Find the next element in the sequence
public static int nextSeq(int n)
{
if(n%2 == 0)
{
return n/2;
}
else
{
return n*3+1;
}
}
}
Your problem is not with the size of the stack (you're already memoizing the values), but with
the size of some of the numbers in the sequences, and
the upper limits of a 32-bit integer.
Hint:
public static int seqCount(int n)
{
if(hm.get(n) != null) {
return hm.get(n);
}
if (n < 1) {
// this should never happen, right? ;)
} ...
...
That should hopefully be enough :)
P.S. you'll run into a need for BigNums in a lot of project euler problems...
If you change from integer to long it will give you enough room to solve the problem.
Here was the code that I used to answer this one:
int steps, best = 0, answer = 0; // declarations assumed; not shown in the original snippet
for(int i=1;i<=1000000;i+=2)
{
steps=1;
int n=i;
long current=i;
while(current!=1)
{
if(current%2==0)
{
current=current/2;
}else{
current=(current*3)+1;
}
steps++;
}
if(steps>best)
{
best=steps;
answer=n;
}
}
Brute-forcing it takes about 9 seconds to run.
Side note (as it seems that you don't actually need tail call optimization for this problem): tail call optimization is not available in Java, and as far as I have heard, it is not even supported by the JVM bytecode. This means that very deep recursion is not possible, and you have to refactor it into a loop construct instead.
If you are counting the size of the Collatz sequence for numbers up to 1,000,000,
you should reconsider using the Integer type. I suggest using BigInteger or possibly a long.
This should alleviate the problems encountered, but be warned you may still run out of heap-space depending on your JVM.
I think you need these 2 hints :
Don't use Integer, because for some starting numbers the sequence will climb to values greater than Integer.MAX_VALUE, which is 2147483647. Use Long instead.
Try not to use recursion to solve this problem, even with memoization. As I mentioned earlier, some numbers will climb high and produce a great many stack frames, which will lead to a stack overflow. Try using "regular" iteration like do-while or for. Of course, you can still use memoization in a "regular" loop.
Oh, I forgot something. Perhaps the stack overflow occurs because of arithmetic overflow. Since you use Integer, Java can silently wrap those "flying numbers" into negative numbers when arithmetic overflow occurs. And as seen in the method seqCount(int), you don't check the invariant n > 0.
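The missing n > 0 check can be made visible with a short sketch. It walks each chain in int arithmetic but uses Math.multiplyExact and Math.addExact, which throw ArithmeticException instead of wrapping silently, and compares that with a plain long version. The class and method names are hypothetical and not taken from the question.

```java
class CollatzOverflowSketch {
    // True if some 3n+1 step in the chain starting at seed does not fit in an int.
    static boolean overflowsInt(int seed) {
        int n = seed;
        while (n != 1) {
            if (n % 2 == 0) {
                n /= 2;
            } else {
                try {
                    n = Math.addExact(Math.multiplyExact(n, 3), 1);
                } catch (ArithmeticException e) {
                    return true; // plain int arithmetic would have wrapped here
                }
            }
        }
        return false;
    }

    // Smallest seed below limit whose chain overflows int, or -1 if there is none.
    static int firstOverflowingSeed(int limit) {
        for (int seed = 1; seed < limit; seed++) {
            if (overflowsInt(seed)) {
                return seed;
            }
        }
        return -1;
    }

    // Chain length counted in long arithmetic, including the seed and the final 1.
    static int chainLength(long n) {
        int count = 1;
        while (n != 1) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("first seed that overflows int: " + firstOverflowingSeed(1_000_000));
        System.out.println("chain length of 13: " + chainLength(13)); // 10
    }
}
```

Once an int wraps to a negative value, the chain can no longer be relied on to reach 1, which is consistent with the recursion in seqCount running until the stack is exhausted.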
You can solve this problem not only with recursion but also with a single loop. There is an overflow if you use int, because the chain produces values that only fit in a long; the recursion then never ends because the value never becomes equal to 1, and you will probably get a StackOverflowError.
Here is my solution with loop and recursion:
public class Collatz {
public int getChainLength(long i) {
int count = 1;
while (i != 1) {
count++;
if (i % 2 == 0) {
i /= 2;
} else {
i = 3 * i + 1;
}
}
return count;
}
public static int getChainLength(long i, int count) {
if (i == 1) {
return count;
} else if (i % 2 == 0) {
return getChainLength(i / 2, count + 1);
} else {
return getChainLength(3 * i + 1, count + 1);
}
}
public int getLongestChain(int number) {
int longestChain[] = { 0, 0 };
for (int i = 1; i < number; i++) {
int chain = getChainLength(i);
if (longestChain[1] < chain) {
longestChain[0] = i;
longestChain[1] = chain;
}
}
return longestChain[0];
}
/**
* @param args
*/
public static void main(String[] args) {
System.out.println(new Collatz().getLongestChain(1000000));
}
}
Here you can have a look at my recursive implementation of problem 14:
http://chmu.bplaced.net/?p=265
import java.util.*;
public class file
{
public static void main(String [] args)
{
long largest=0;
long number=0;
for( long i=106239;i<1000000;i=i+2)
{
long k=1;
long z=i;
while(z!=1)
{
if(z%2==0)
{
k++;
z=z/2;
} else{
k++;
z=3*z+1;
}
}
if(k>largest)
{
number=i;
largest=k;
System.out.println(number+" "+largest);
}
}//for loop
}//main
}
