How to implement a Median-heap - java

Like a Max-heap and Min-heap, I want to implement a Median-heap to keep track of the median of a given set of integers. The API should have the following three functions:
insert(int) // should take O(logN)
int median() // will be the topmost element of the heap. O(1)
int delmedian() // should take O(logN)
I want to use an array (a) implementation to implement the heap where the children of array index k are stored in array indices 2*k and 2*k + 1. For convenience, the array starts populating elements from index 1.
This is what I have so far:
The Median-heap will have two integers to keep track of number of integers inserted so far that are > current median (gcm) and < current median (lcm).
if abs(gcm-lcm) >= 2 and gcm > lcm we need to swap a[1] with one of its children.
The child chosen should be greater than a[1]. If both are greater,
choose the smaller of two.
Similarly for the other case. I can't come up with an algorithm for how to sink and swim elements. I think it should take into consideration how close the number is to the median, so something like:
private void swim(int k) {
while (k > 1 && absless(k, k/2)) {
exch(k, k/2);
k = k/2;
}
}
I can't come up with the entire solution though.

You need two heaps: one min-heap and one max-heap. Each heap contains about one half of the data. Every element in the min-heap is greater than or equal to the median, and every element in the max-heap is less than or equal to the median.
When the min-heap contains one more element than the max-heap, the median is at the top of the min-heap. And when the max-heap contains one more element than the min-heap, the median is at the top of the max-heap.
When both heaps contain the same number of elements, the total number of elements is even.
In this case you have to choose according to your definition of median: a) the mean of the two middle elements; b) the greater of the two; c) the lesser; d) choose at random any of the two...
Every time you insert, compare the new element with those at the top of the heaps in order to decide where to insert it. If the new element is greater than the current median, it goes to the min-heap. If it is less than the current median, it goes to the max heap. Then you might need to rebalance. If the sizes of the heaps differ by more than one element, extract the min/max from the heap with more elements and insert it into the other heap.
In order to construct the median heap for a list of elements, we should first use a linear time algorithm and find the median. Once the median is known, we can simply add elements to the Min-heap and Max-heap based on the median value. Balancing the heaps isn't required because the median will split the input list of elements into equal halves.
If you extract an element you might need to compensate the size change by moving one element from one heap to another. This way you ensure that, at all times, both heaps have the same size or differ by just one element.
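The extraction step described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the class name TwoHeapMedian and the helper rebalance are hypothetical, and it assumes option (c) for an even count (the lesser of the two middle elements), so that the median is always at the top of the max-heap.

```java
import java.util.Collections;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

public class TwoHeapMedian {
    // Lower half of the data; the largest element is on top (max-heap).
    private final PriorityQueue<Integer> low = new PriorityQueue<>(Collections.reverseOrder());
    // Upper half of the data; the smallest element is on top (min-heap).
    private final PriorityQueue<Integer> high = new PriorityQueue<>();

    public void insert(int x) {
        if (low.isEmpty() || x <= low.peek()) {
            low.add(x);
        } else {
            high.add(x);
        }
        rebalance();
    }

    // Assumed convention: for an even count, return the lesser middle element.
    public int median() {
        if (low.isEmpty()) {
            throw new NoSuchElementException("heap is empty");
        }
        return low.peek();
    }

    public int delMedian() {
        if (low.isEmpty()) {
            throw new NoSuchElementException("heap is empty");
        }
        int m = low.poll();
        rebalance(); // compensate for the size change
        return m;
    }

    // Keeps low.size() equal to high.size() or exactly one larger.
    private void rebalance() {
        if (low.size() > high.size() + 1) {
            high.add(low.poll());
        } else if (high.size() > low.size()) {
            low.add(high.poll());
        }
    }
}
```

Each operation does at most a constant number of heap insertions and removals, so insert and delMedian stay O(log N) and median stays O(1).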

Here is a Java implementation of a MedianHeap, developed with the help of comocomocomocomo's explanation above.
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Scanner;
/**
*
* @author BatmanLost
*/
public class MedianHeap {
//stores all the numbers less than the current median in a maxheap, i.e median is the maximum, at the root
private PriorityQueue<Integer> maxheap;
//stores all the numbers greater than the current median in a minheap, i.e median is the minimum, at the root
private PriorityQueue<Integer> minheap;
//comparators for PriorityQueue
private static final maxHeapComparator myMaxHeapComparator = new maxHeapComparator();
private static final minHeapComparator myMinHeapComparator = new minHeapComparator();
/**
* Comparator for the minHeap, smallest number has the highest priority, natural ordering
*/
private static class minHeapComparator implements Comparator<Integer>{
@Override
public int compare(Integer i, Integer j) {
// Integer.compare avoids comparing boxed Integers by reference with ==
return Integer.compare(i, j);
}
}
/**
* Comparator for the maxHeap, largest number has the highest priority
*/
private static class maxHeapComparator implements Comparator<Integer>{
// opposite to minHeapComparator, invert the return values
@Override
public int compare(Integer i, Integer j) {
// reversed arguments give descending order without reference comparison
return Integer.compare(j, i);
}
}
/**
* Constructor for a MedianHeap, to dynamically generate median.
*/
public MedianHeap(){
// initialize maxheap and minheap with appropriate comparators
maxheap = new PriorityQueue<Integer>(11,myMaxHeapComparator);
minheap = new PriorityQueue<Integer>(11,myMinHeapComparator);
}
/**
* Returns empty if no median i.e, no input
* @return
*/
private boolean isEmpty(){
return maxheap.size() == 0 && minheap.size() == 0 ;
}
/**
* Inserts into MedianHeap to update the median accordingly
* @param n
*/
public void insert(int n){
// initialize if empty
if(isEmpty()){ minheap.add(n);}
else{
//add to the appropriate heap
// if n is less than or equal to current median, add to maxheap
if(Double.compare(n, median()) <= 0){maxheap.add(n);}
// if n is greater than current median, add to min heap
else{minheap.add(n);}
}
// fix the chaos, if any imbalance occurs in the heap sizes
//i.e, absolute difference of sizes is greater than one.
fixChaos();
}
/**
* Re-balances the heap sizes
*/
private void fixChaos(){
//if sizes of heaps differ by 2, then it's a chaos, since median must be the middle element
if( Math.abs( maxheap.size() - minheap.size()) > 1){
//check which one is the culprit and take action by kicking out the root from culprit into victim
if(maxheap.size() > minheap.size()){
minheap.add(maxheap.poll());
}
else{ maxheap.add(minheap.poll());}
}
}
/**
* returns the median of the numbers encountered so far
* @return
*/
public double median(){
//if total size(no. of elements entered) is even, then median is the average of the 2 middle elements
//i.e, average of the root's of the heaps.
if( maxheap.size() == minheap.size()) {
return ((double)maxheap.peek() + (double)minheap.peek())/2 ;
}
//else median is middle element, i.e, root of the heap with one element more
else if (maxheap.size() > minheap.size()){ return (double)maxheap.peek();}
else{ return (double)minheap.peek();}
}
/**
* String representation of the numbers and median
* @return
*/
public String toString(){
StringBuilder sb = new StringBuilder();
sb.append("\n Median for the numbers : " );
for(int i: maxheap){sb.append(" "+i); }
for(int i: minheap){sb.append(" "+i); }
sb.append(" is " + median()+"\n");
return sb.toString();
}
/**
* Adds all the array elements and returns the median.
* @param array
* @return
*/
public double addArray(int[] array){
for(int i=0; i<array.length ;i++){
insert(array[i]);
}
return median();
}
/**
* Just a test
* @param N
*/
public void test(int N){
int[] array = InputGenerator.randomArray(N);
System.out.println("Input array: \n"+Arrays.toString(array));
addArray(array);
System.out.println("Computed Median is :" + median());
Arrays.sort(array);
System.out.println("Sorted array: \n"+Arrays.toString(array));
if(N%2==0){ System.out.println("Calculated Median is :" + (array[N/2] + array[(N/2)-1])/2.0);}
else{System.out.println("Calculated Median is :" + array[N/2] +"\n");}
}
/**
* Another testing utility
*/
public void printInternal(){
System.out.println("Less than median, max heap:" + maxheap);
System.out.println("Greater than median, min heap:" + minheap);
}
//Inner class to generate input for basic testing
private static class InputGenerator {
public static int[] orderedArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = i;
}
return array;
}
public static int[] randomArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = (int)(Math.random()*N*N);
}
return array;
}
public static int readInt(String s){
System.out.println(s);
Scanner sc = new Scanner(System.in);
return sc.nextInt();
}
}
public static void main(String[] args){
System.out.println("You got to stop the program MANUALLY!!");
while(true){
MedianHeap testObj = new MedianHeap();
testObj.test(InputGenerator.readInt("Enter size of the array:"));
System.out.println(testObj);
}
}
}

Here is my code, based on the answer provided by comocomocomocomo:
import java.util.PriorityQueue;
public class Median {
private PriorityQueue<Integer> minHeap =
new PriorityQueue<Integer>();
private PriorityQueue<Integer> maxHeap =
new PriorityQueue<Integer>((o1,o2)-> Integer.compare(o2, o1)); // o2-o1 could overflow for large values
public float median() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
if (minSize == 0 && maxSize == 0) {
return 0;
}
if (minSize > maxSize) {
return minHeap.peek();
}if (minSize < maxSize) {
return maxHeap.peek();
}
return (minHeap.peek()+maxHeap.peek())/2F;
}
public void insert(int element) {
float median = median();
if (element > median) {
minHeap.offer(element);
} else {
maxHeap.offer(element);
}
balanceHeap();
}
private void balanceHeap() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
int tmp = 0;
if (minSize > maxSize + 1) {
tmp = minHeap.poll();
maxHeap.offer(tmp);
}
if (maxSize > minSize + 1) {
tmp = maxHeap.poll();
minHeap.offer(tmp);
}
}
}

Isn't a perfectly balanced binary search tree (BST) a median heap? It is true that even red-black BSTs aren't always perfectly balanced, but it might be close enough for your purposes. And log(n) performance is guaranteed!
AVL trees are more tightly balanced than red-black BSTs, so they come even closer to being a true median heap.

Here is a Scala implementation, following comocomocomocomo's idea above.
class MedianHeap(val capacity:Int) {
private val minHeap = new PriorityQueue[Int](capacity / 2)
private val maxHeap = new PriorityQueue[Int](capacity / 2, new Comparator[Int] {
override def compare(o1: Int, o2: Int): Int = Integer.compare(o2, o1)
})
def add(x: Int): Unit = {
if (x > median) {
minHeap.add(x)
} else {
maxHeap.add(x)
}
// Re-balance the heaps.
if (minHeap.size - maxHeap.size > 1) {
maxHeap.add(minHeap.poll())
}
if (maxHeap.size - minHeap.size > 1) {
minHeap.add(maxHeap.poll)
}
}
def median: Double = {
if (minHeap.isEmpty && maxHeap.isEmpty)
return Int.MinValue
if (minHeap.size == maxHeap.size) {
return (minHeap.peek+ maxHeap.peek) / 2.0
}
if (minHeap.size > maxHeap.size) {
return minHeap.peek()
}
maxHeap.peek
}
}

Another way to do it without using a max-heap and a min-heap would be to use a median-heap right away.
In a max-heap, the parent is greater than the children.
We can have a new type of heap where the parent is in the 'middle' of the children - the left child is smaller than the parent and the right child is greater than the parent. All even entries are left children and all odd entries are right children.
The same swim and sink operations which can be performed in a max-heap, can also be performed in this median-heap - with slight modifications. In a typical swim operation in a max-heap, the inserted entry swims up till it is smaller than a parent entry, here in a median-heap, it will swim up till it is lesser than a parent (if it is an odd entry) or greater than a parent (if it is an even entry).
Here's my implementation for this median-heap. I have used an array of Integers for simplicity.
package priorityQueues;
import edu.princeton.cs.algs4.StdOut;
public class MedianInsertDelete {
private Integer[] a;
private int N;
public MedianInsertDelete(int capacity){
// accounts for '0' not being used
this.a = new Integer[capacity+1];
this.N = 0;
}
public void insert(int k){
a[++N] = k;
swim(N);
}
public int delMedian(){
int median = findMedian();
exch(1, N--);
sink(1);
a[N+1] = null;
return median;
}
public int findMedian(){
return a[1];
}
// entry swims up so that its left child is smaller and right is greater
private void swim(int k){
while(even(k) && k>1 && less(k/2,k)){
exch(k, k/2);
if ((N > k) && less (k+1, k/2)) exch(k+1, k/2);
k = k/2;
}
while(!even(k) && (k>1 && !less(k/2,k))){
exch(k, k/2);
if (!less (k-1, k/2)) exch(k-1, k/2);
k = k/2;
}
}
// if the left child is larger or if the right child is smaller, the entry sinks down
private void sink (int k){
while(2*k <= N){
int j = 2*k;
if (j < N && less (j, k)) j++;
if (less(k,j)) break;
exch(k, j);
k = j;
}
}
private boolean even(int i){
return (i % 2) == 0;
}
private void exch(int i, int j){
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}
private boolean less(int i, int j){
return a[i] <= a[j];
}
public static void main(String[] args) {
MedianInsertDelete medianInsertDelete = new MedianInsertDelete(10);
for(int i = 1; i <=10; i++){
medianInsertDelete.insert(i);
}
StdOut.println("The median is: " + medianInsertDelete.findMedian());
medianInsertDelete.delMedian();
StdOut.println("Original median deleted. The new median is " + medianInsertDelete.findMedian());
}
}

Related

How to heapify Max-heap?

I'm trying to implement a Max-heap with two methods, insert and extract_max.
But extract_max is currently not working correctly: it's not extracting the largest integer in the heap, which I assume is because of heapify. I've been trying to debug for hours but can't figure out where it goes wrong. Any input would be highly appreciated.
class Heap {
int heap_array[];
int n_elems = 0;
int capacity;
// Constructor
Heap(int _capacity) {
capacity = _capacity;
heap_array = new int[capacity];
}
/**
* Private method for maintaining the heap.
* @param i index of the element to heapify from
**/
private void heapify(int i) {
int left = 2*i + 1;
int right = 2*i+ 2;
int largest = i;
//if left ≤ heap_length[A] and A[left] > A[largest] then:
if (left <= n_elems && heap_array[left] > heap_array[largest]) {
largest = left;
//System.out.println("largest = left");
}
//if right ≤ heap_length[A] and A[right] > A[largest] then:
if (right <= n_elems && heap_array[right] > heap_array[largest]) {
//System.out.println("largest = right");
largest = right;
}
//if largest ≠ i then:
if (largest != i) {
int swap = heap_array[i];
heap_array[i] = heap_array[largest];
heap_array[largest] = swap;
// Recursively heapify the affected sub-tree
heapify(largest);
}
}
/**
* Add an element to the heap and ensure the heap property
* Throws an exception if trying to add elements to a full heap.
* @param x Element to add
*/
public void insert(int x) throws Exception {
if(is_full()) {
throw new Exception("The heap is full");
} else {
// Insert the element at end of Heap
heap_array[n_elems++] = x;
//n_elems++;
// Heapify from root
heapify(0);
}
}
public int extract_max() throws Exception {
//Get the largest
// Get the last element
int root = heap_array[0];
int lastElement = heap_array[n_elems];
// Replace root with first element
heap_array[0] = lastElement;
// Decrease size of heap by 1
n_elems--;
// heapify the root node
heapify(0);
// return new size of Heap
return root;
}
public int capacity() {
return capacity;
}
public int size() {
return n_elems;
}
public boolean is_empty() {
return n_elems == 0;
}
public boolean is_full() {
return n_elems == capacity;
}
public void print() {
for(int i = 0; i < n_elems; i++) {
System.out.println(heap_array[i]);
}
}
/**
* Remove and return largest element, and maintain the heap property.
* Throws an exception if trying to extract an element from an empty heap.
*/
/**
* For convenience, a small program to test the code.
* There are better ways of doing this kind of testing!
* @throws Exception
*
*/
static public void main(String args[]) throws Exception { // A simple test program
// Declare two heaps. Both should work nicely!
Heap h1 = new Heap(100);
Heap h2 = new Heap(10);
int data[] = {1, 4, 10, 14, 7, 9, 3, 8, 16};
//
// Insert 1 element to heap 1, and several to heap 2.
//
h2.insert(9);
h2.insert(10);
h2.insert(8);
h2.insert(11);
h2.insert(12);
h2.insert(15);
System.out.println("Size " + h2.size());
h2.print();
System.out.println("Max " + h2.extract_max());
}
}
The first problem is that your insert isn't correct. Just adding to the end and calling heapify(0) doesn't do you any good. heapify is going to examine the root element and its two children, decide that the root is the largest item, and exit, doing nothing. As a result, you're just adding things to the list sequentially.
To insert into a max-heap, you do the following:
Add the new item to the end of the heap.
Move the item up the heap to its proper position.
So insert should look like this:
public void insert(int x) throws Exception {
if(is_full()) {
throw new Exception("The heap is full");
}
// Insert the element at end of Heap
heap_array[n_elems++] = x;
// now sift it up
int current = n_elems-1;
int parent = (current-1)/2;
while (current > 0 && heap_array[current] > heap_array[parent]) {
int swap = heap_array[parent];
heap_array[parent] = heap_array[current];
heap_array[current] = swap;
current = parent;
parent = (current-1)/2;
}
}
I think you also have a problem in extract_max. You have:
int lastElement = heap_array[n_elems];
But the last element is actually at index n_elems-1. I think you want:
int lastElement = heap_array[n_elems-1];
That makes sense because if n_elems == 1, then the only item in the heap will be the root, at heap_array[0];
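Putting both fixes together, a self-contained version might look like the sketch below. The class name MaxHeapSketch and the method siftDown are hypothetical, and the sketch assumes one further change that the answer above does not mention: the child bounds checks use < size rather than <= n_elems, so that slots past the end of the heap are never compared.

```java
import java.util.NoSuchElementException;

public class MaxHeapSketch {
    private final int[] heap;
    private int size = 0;

    public MaxHeapSketch(int capacity) {
        heap = new int[capacity];
    }

    public void insert(int x) {
        if (size == heap.length) {
            throw new IllegalStateException("The heap is full");
        }
        heap[size++] = x;
        // Sift the new item up until its parent is at least as large.
        int current = size - 1;
        while (current > 0 && heap[current] > heap[(current - 1) / 2]) {
            swap(current, (current - 1) / 2);
            current = (current - 1) / 2;
        }
    }

    public int extractMax() {
        if (size == 0) {
            throw new NoSuchElementException("The heap is empty");
        }
        int root = heap[0];
        heap[0] = heap[--size]; // the last element lives at index size-1
        siftDown(0);
        return root;
    }

    // Moves the item at index i down until both children are no larger.
    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < size && heap[left] > heap[largest]) largest = left;
            if (right < size && heap[right] > heap[largest]) largest = right;
            if (largest == i) return;
            swap(i, largest);
            i = largest;
        }
    }

    private void swap(int i, int j) {
        int tmp = heap[i];
        heap[i] = heap[j];
        heap[j] = tmp;
    }
}
```

With the values from the question's test program (9, 10, 8, 11, 12, 15), repeated calls to extractMax return the items in descending order.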

Calculate Big-O for 3 random permutation algorithms

I am trying to calculate the Big-O time complexity for these 3 algorithms, but it seems I lack knowledge on this topic.
1st:
private void firstAlgorithm(int size) {
int[] array = new int[size];
int i=0; int flag=0;
while(i<size) {
int num=(int)(Math.random()*(size));
if (num==0 && flag==0) {
flag=1;
array[i]=0;
i++;
} else if(num==0 && flag==1) {
continue;
} else if(!checkVal(num, array)) {
array[i]=num;
i++;
}
}
}
private static boolean checkVal(int val, int[] arr) {
int i = 0;
for (int num:arr) {
if (num==val) {
return true;
}
}
return false;
}
2nd:
private void secondAlgorithm(int size) {
int i = 0;
int[] array = new int[size];
boolean[] booleanArray = new boolean[size];
while (i < array.length) {
int num = (int) (Math.random() * array.length);
if (!booleanArray[num]) {
booleanArray[num] = true;
array[i] = num;
i++;
}
}
}
3rd:
private void thirdAlgorithm(int size) {
int[] array = new int[size];
for (int i = 0; i < array.length; i++) {
int num = (int) (Math.random() * (i - 1));
if (i > 0) {
array = swap(array, i, num);
}
}
}
private static int[] swap(int[] arr, int a, int b) {
int i = arr[a];
arr[a] = arr[b];
arr[b] = i;
return arr;
}
Would be nice, if you could explain your results.
In my opinion, the 1st is O(n^2) because of the two loops, the 2nd I don't know, and the 3rd is O(n).
Thank you
All three algorithms draw random numbers with (int)(Math.random() * size). Since Math.random() returns a double in the range [0, 1), this yields a uniformly distributed integer between 0 and size-1, so no modulo operation is needed. Because the first two algorithms retry until they happen to draw a value that has not been used yet, they have a small chance of not finishing in a reasonable amount of time: their worst case is unbounded, so it only makes sense to talk about their expected running time.
The first algorithm generates and fills an array of size integers. The rule is that the array must contain the value 0 only once and only distinct values otherwise. checkVal scans the whole array, so each check costs O(size), and it runs for almost every generated number, including the rejected ones. Collecting size distinct values by drawing at random takes about size * ln(size) draws on average (this is the coupon collector's problem), so the expected running time is O(size^2 * log(size)). O(size^2) would hold only if no draw were ever rejected.
The second algorithm also generates and fills an array with distinct random numbers, but it replaces the linear scan with a lookup in booleanArray, so each check is O(1). Draws of already used numbers are still rejected, so the number of draws is again about size * ln(size) on average, and the expected running time is O(size * log(size)).
The third algorithm makes exactly size iterations, and in each one it swaps two elements based on the given index, which is a constant-time operation. Reassigning the reference of the array to itself is also a constant-time operation (O(1)). The running time is therefore O(size). Note that the array is never filled with values, so as written the algorithm only swaps zeros.
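For comparison, the usual linear-time way to build a random permutation is the Fisher-Yates shuffle. The following is a minimal sketch of that technique, not the asker's third algorithm: the class and method names are hypothetical, and it uses java.util.Random.nextInt instead of Math.random().

```java
import java.util.Random;

public class ShuffleSketch {
    // Returns a random permutation of 0..size-1 in O(size) time.
    public static int[] randomPermutation(int size, Random rng) {
        int[] array = new int[size];
        for (int i = 0; i < size; i++) {
            array[i] = i; // start from the identity permutation
        }
        // Walk from the end; swap each slot with a random slot at or before it.
        for (int i = size - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // uniform in 0..i inclusive
            int tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
        return array;
    }
}
```

There are no rejected draws here, which is why the loop count, and therefore the running time, is fixed at size.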

Passing Generic Quick sort

I am very confused with passing. I have created a Quick sort algorithm in eclipse. The class is an abstract class. Here is the Interface class.
public interface ArraySort<T extends Comparable<T>>
{
/**
* Sort the array
*/
public void sort(T[] array);
}
This is the class in which the Quick sort has been created.
public class QuickSort <T extends Comparable<T>> extends ArraySortTool<T>{
public <T> void quickSort(T[] array, Comparator<T>com, int a, int b) {
if(a >= b) return;
int left = a;
int right = b-1;
T pivot = array[b];
T temp;
while (left <= right){
//Look for element larger or equal to the pivot
while(left <= right&&com.compare(array[left], pivot)<0)left++;
//Look for element smaller or equal to pivot
while(left <= right&&com.compare(array[right], pivot)>0)right--;
if(left <= right){
temp = array[left]; array[right]=array[right]=temp;
left++; right--;
}
}
//place pivot into its final location marked by left index
temp = array[left]; array[left] = array[b]; array[b] = temp;
quickSort(array, com, a, left - 1);
quickSort(array, com, left + 1,b);
}
@Override
public void sort(T[] array) {
quickSort(array, int, 0, 0);
}
}
In order to pass the references I have also tried this method but had no luck.
@Override
public void sort(T[] array, Comparator<T>com, int a, int b) {
int left = a;
int right = b-1;
T pivot = array[b];
T temp;
I was getting an error here
public class QuickSort <T extends Comparable<T>> extends ArraySortTool<T>{
I am trying to do this without interfering with the interface class.
Here is the code for the ArraySortTool
public abstract class ArraySortTool<T extends Comparable<T>> implements ArraySort<T>
{
/**
* @param array an array to be sorted
* @return the time, in milliseconds, taken to sort the array
*/
private double timeTakenMillis(T[] array) {
double startTime = System.nanoTime();
sort(array);
return ((System.nanoTime()-startTime)/1000000.0);
}
/**
* Run a sequence of tests on sets of arrays of increasing size, reporting the average time taken for each
* size of array. For each size of array, <tt>noPerSize</tt> tests will be run, and the average time taken.
* Timings will be generated for array sizes 1,2,...,9,10,20,...,90,100,200,...,900,1000,2000,...until the
* maximum time is exceeded. Times are reported in milliseconds.
* @param generator an array generator for generating the random arrays
* @param noPerSize the number of timings per array size set
* @param maxTimeSeconds the cut-off time in seconds - once a timing takes longer than this the timing sequence will be terminated
*/
public void timeInMillis(RandomArray<T> generator,int noPerSize,int maxTimeSeconds)
{
int size = 1; // initial size of array to test
int step = 1; // initial size increase
int stepFactor = 10; // when size reaches 10*current size increase step size by 10
double averageTimeTaken;
do {
double totalTimeTaken = 0;
for (int count = 0; count < noPerSize; count++) {
T[] array = generator.randomArray(size);
totalTimeTaken += timeTakenMillis(array);
}
averageTimeTaken = totalTimeTaken/noPerSize;
System.out.format("Average time to sort %d elements was %.3f milliseconds.\n",size,averageTimeTaken);
size += step;
if (size >= stepFactor*step) step *= stepFactor;
} while (averageTimeTaken < maxTimeSeconds*1000);
System.out.println("Tests ended.");
}
/**
* Check whether a given array is sorted.
* @param array the array to be checked
* @return true iff the array is sorted - either ascending or descending
* The first non-equal neighbouring elements will determine the expected
* order of sorting.
*/
public boolean isSorted(T[] array) {
int detectedDirection = 0; // have not yet detected increasing or decreasing
T previous = array[0];
for (int index = 1; index < array.length; index++) {
int currentDirection = previous.compareTo(array[index]); // compare previous and current entry
if (currentDirection != 0) { // if current pair increasing or decreasing
if (detectedDirection == 0) { // if previously no direction detected
detectedDirection = currentDirection; // remember current direction
} else if (detectedDirection * currentDirection < 0) { // otherwise compare current and previous direction
return false; // if they differ array is not sorted
}
}
previous = array[index];
}
// reached end of array without detecting pairs out of order
return true;
}
}
I am trying to pass the quicksort method into the sort method as it is in the interface class. Please let me know how to do this as I am new to passing by reference. An example using my code will be great. Kind regards.
Try this:
public class QuickSort <T extends Comparable<T>> implements ArraySort<T>{...}
(edited to match revised code in the question)
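To make the one-argument sort from the interface call the recursive method, the usual approach is to pass the natural-order comparator and the bounds of the whole array. The sketch below assumes that approach: the class name QuickSortSketch and the swap helper are hypothetical, the class is kept standalone (it does not extend ArraySortTool) so that it compiles on its own, and the swap inside the loop is written out in full.

```java
import java.util.Comparator;

public class QuickSortSketch<T extends Comparable<T>> {

    // Entry point with the same signature as ArraySort.sort.
    public void sort(T[] array) {
        quickSort(array, Comparator.<T>naturalOrder(), 0, array.length - 1);
    }

    // Sorts array[a..b] (both bounds inclusive), using the last element as pivot.
    private void quickSort(T[] array, Comparator<? super T> com, int a, int b) {
        if (a >= b) return;
        int left = a;
        int right = b - 1;
        T pivot = array[b];
        while (left <= right) {
            // Look for an element larger than or equal to the pivot.
            while (left <= right && com.compare(array[left], pivot) < 0) left++;
            // Look for an element smaller than or equal to the pivot.
            while (left <= right && com.compare(array[right], pivot) > 0) right--;
            if (left <= right) {
                swap(array, left, right);
                left++;
                right--;
            }
        }
        // Place the pivot into its final location, marked by the left index.
        swap(array, left, b);
        quickSort(array, com, a, left - 1);
        quickSort(array, com, left + 1, b);
    }

    private static <E> void swap(E[] array, int i, int j) {
        E tmp = array[i];
        array[i] = array[j];
        array[j] = tmp;
    }
}
```

Note that the recursive method does not declare its own type parameter; a second <T> on the method would shadow the class-level T and lose the Comparable bound.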

How do I implement heapSort on my heap?

This is one of my last assignments, and the only thing keeping me from turning it in is applying heapsort to my heap. The user enters integer values, which are stored in an ArrayList-backed heap and displayed. Here is the code for that:
The heap program works fine, but Heapsort doesn't work, or I don't know how to call it from the HeapApp class.
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.NoSuchElementException;
import java.util.Scanner;
/**
*/
public class Heap<T extends Comparable<T>> {
private ArrayList<T> items;
public Heap() {
items = new ArrayList<T>();
}
private void siftUp() {
int k = items.size() - 1;
while (k > 0) {
int p = (k-1)/2;
T item = items.get(k);
T parent = items.get(p);
if (item.compareTo(parent) > 0) {
// swap
items.set(k, parent);
items.set(p, item);
// move up one level
k = p;
} else {
break;
}
}
}
public void insert(T item) {
items.add(item);
siftUp();
}
private void siftDown() {
int k = 0;
int l = 2*k+1;
while (l < items.size()) {
int max=l, r=l+1;
if (r < items.size()) { // there is a right child
if (items.get(r).compareTo(items.get(l)) > 0) {
max++;
}
}
if (items.get(k).compareTo(items.get(max)) < 0) {
// switch
T temp = items.get(k);
items.set(k, items.get(max));
items.set(max, temp);
k = max;
l = 2*k+1;
} else {
break;
}
}
}
public T delete()
throws NoSuchElementException {
if (items.size() == 0) {
throw new NoSuchElementException();
}
if (items.size() == 1) {
return items.remove(0);
}
T hold = items.get(0);
items.set(0, items.remove(items.size()-1));
siftDown();
return hold;
}
public int size() {
return items.size();
}
public boolean isEmpty() {
return items.isEmpty();
}
public String toString() {
return items.toString();
}
//----------------------------------------------------------------------------------------------------------------------------------------
public class Heapsort<T extends Comparable<T>> {
/**
* Sort the array a[0..n-1] by the heapsort algorithm.
*
* @param a the array to be sorted
* @param n the number of elements of a that have valid values
*/
public void sort(T[] a, int n) {
heapsort(a, n - 1);
}
/**
* Sort the ArrayList list by the heapsort algorithm.
* Works by converting the ArrayList to an array, sorting the
* array, and converting the result back to the ArrayList.
*
* @param items the ArrayList to be sorted
*/
public void sort(ArrayList<T> items) {
// Convert list to an array.
@SuppressWarnings("unchecked")
T[] a = (T[]) items.toArray((T[]) Array.newInstance(items.get(0).getClass(), items.size()));
sort(a, items.size()); // sort the array
// Copy the sorted array elements back into the list.
for (int i = 0; i < a.length; i++)
items.set(i, a[i]);
}
/**
* Sort the array a[0..lastLeaf] by the heapsort algorithm.
*
* @param items the array holding the heap
* @param lastLeaf the position of the last leaf in the array
*/
private void heapsort(T[] items, int lastLeaf) {
// First, turn the array a[0..lastLeaf] into a max-heap.
buildMaxHeap(items, lastLeaf);
// Once the array is a max-heap, repeatedly swap the root
// with the last leaf, putting the largest remaining element
// in the last leaf's position, declare this last leaf to no
// longer be in the heap, and then fix up the heap.
while (lastLeaf > 0) {
swap(items, 0, lastLeaf); // swap the root with the last leaf
lastLeaf--; // the last leaf is no longer in the heap
maxHeapify(items, 0, lastLeaf); // fix up what's left
}
}
/**
* Restore the max-heap property. When this method is called, the max-heap
* property holds everywhere, except possibly at node i and its children. When
* this method returns, the max-heap property holds everywhere.
*
* @param items the array holding the heap
* @param i index of the node that might violate the max-heap property
* @param lastLeaf the position of the last leaf in the array
*/
private void maxHeapify(T[] items, int i, int lastLeaf) {
int left = leftChild(i); // index of node i's left child
int right = rightChild(i); // index of node i's right child
int largest; // will hold the index of the node with the largest element
// among node i, left, and right
// Is there a left child and, if so, does the left child have an
// element larger than node i?
if (left <= lastLeaf && items[left].compareTo(items[i]) > 0)
largest = left; // yes, so the left child is the largest so far
else
largest = i; // no, so node i is the largest so far
// Is there a right child and, if so, does the right child have an
// element larger than the larger of node i and the left child?
if (right <= lastLeaf && items[right].compareTo(items[largest]) > 0)
largest = right; // yes, so the right child is the largest
// If node i holds an element larger than both the left and right
// children, then the max-heap property already held, and we need do
// nothing more. Otherwise, we need to swap node i with the larger
// of the two children, and then recurse down the heap from the larger
// child.
if (largest != i) {
swap(items, i, largest);
maxHeapify(items, largest, lastLeaf);
}
}
/**
* Form array a[0..lastLeaf] into a max-heap.
*
* @param items array to be heapified
* @param lastLeaf position of last valid data in a
*/
private void buildMaxHeap(T[] items, int lastLeaf) {
int lastNonLeaf = (lastLeaf - 1) / 2; // nodes lastNonLeaf+1 to lastLeaf are leaves
for (int j = lastNonLeaf; j >= 0; j--)
maxHeapify(items, j, lastLeaf);
}
/**
* Swap two locations i and j in array a.
*
* @param items the array
* @param i first position
* @param j second position
*/
private void swap(T[] items, int i, int j) {
T t = items[i];
items[i] = items[j];
items[j] = t;
}
/**
* Return the index of the left child of node i.
*
* @param i index of the parent node
* @return index of the left child of node i
*/
private int leftChild(int i) {
return 2 * i + 1;
}
/**
* Return the index of the right child of node i.
*
* @param i index of the parent node
* @return the index of the right child of node i
*/
private int rightChild(int i) {
return 2 * i + 2;
}
/**
* For debugging and testing, print out an array.
*
* @param items the array to print
* @param n number of elements of items to print
*/
public void printArray(T[] items, int n) {
for (int i = 0; i < n; i++)
System.out.println(items[i]);
}
}
}
import java.util.Scanner;
public class HeapApp{
/**
* @param args
*/
public static void main(String[] args) {
Heap<Integer> hp = new Heap<Integer>();
Scanner sc = new Scanner(System.in);
System.out.print("Enter next int, 'done' to stop: ");
String line = sc.next();
while (!line.equals("done")) {
hp.insert(Integer.parseInt(line));
System.out.println(hp);
System.out.print("Enter next int, 'done' to stop: ");
line = sc.next();
}
while (hp.isEmpty()) {
//int max = hp.delete();
System.out.println( " " + hp);
}
System.out.println(hp);
System.out.println("After sorting " + hp);
}
}
I'm not asking anyone to do it for me; I just need help figuring out how to get Heapsort to work with the heap. The most I have tried is setting the parameters within the heapsort method.
My question and code are not a duplicate: for one, this is based on a heap and heapsort driven by user input:
public static void main(String[] args) {
Heap<Integer> hp = new Heap<Integer>();
Scanner sc = new Scanner(System.in);
System.out.print("Enter next int, 'done' to stop: ");
String line = sc.next();
while (!line.equals("done")) {
hp.insert(Integer.parseInt(line));
System.out.println(hp);
System.out.print("Enter next int, 'done' to stop: ");
line = sc.next();
}
Also the entire Heap is implemented using an ArrayList:
public class Heap<T extends Comparable<T>> {
private ArrayList<T> items;
public Heap() {
items = new ArrayList<T>();
}
Add a sort method to your Heap class like this:
public void sort()
{
new Heapsort<T>().sort(items);
}
Then in your HeapApp class call the sort method before printing it out:
hp.sort();
System.out.println("After sorting " + hp);
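For reference, here is a minimal sketch of what an in-place heapsort over a list could look like. It is not the asker's Heapsort class: the class name HeapSortSketch and its method names are hypothetical. It uses the same 0-based child indices (2 * i + 1 and 2 * i + 2) as the leftChild and rightChild helpers shown in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeapSortSketch {
    /** Sorts the list in ascending order, in place, using a max-heap. */
    public static <T extends Comparable<T>> void sort(List<T> items) {
        int n = items.size();
        // Build a max-heap: sift down every internal node, last one first.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(items, i, n);
        }
        // Repeatedly move the current maximum to the end of the unsorted part.
        for (int end = n - 1; end > 0; end--) {
            swap(items, 0, end);
            siftDown(items, 0, end);
        }
    }

    /** Moves the element at index i down until neither child (within size) is larger. */
    private static <T extends Comparable<T>> void siftDown(List<T> items, int i, int size) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < size && items.get(left).compareTo(items.get(largest)) > 0) {
                largest = left;
            }
            if (right < size && items.get(right).compareTo(items.get(largest)) > 0) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            swap(items, i, largest);
            i = largest;
        }
    }

    private static <T> void swap(List<T> items, int i, int j) {
        T t = items.get(i);
        items.set(i, items.get(j));
        items.set(j, t);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(Arrays.asList(5, 3, 9, 1, 7));
        sort(data);
        System.out.println(data); // prints [1, 3, 5, 7, 9]
    }
}
```

Keep in mind that after sorting in ascending order the list is generally no longer arranged as a max-heap, so it may be safer to sort a copy if the heap is going to be used again afterwards.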

Fixed-size collection that keeps top (N) values in Java

I need to keep the top N (< 1000) integers while adding values from a big list of integers (a lazy list of around a million elements). I want to add values to a collection that keeps only the top N (highest) integers. Is there a preferred data structure for this purpose?
I'd suggest using a sorted data structure, such as TreeSet. Before insertion, check the number of items in the set; if it has reached 1000, remove the smallest number if it's smaller than the new number, and then add the new number. Note that a TreeSet stores each distinct value only once, so duplicates are dropped.
TreeSet<Integer> set = ...;
public void add (int n) {
if (set.size () < 1000) {
set.add (n);
} else {
Integer first = set.first();
if (first.intValue() < n) {
set.pollFirst();
set.add (n);
}
}
}
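Google Guava's MinMaxPriorityQueue class can do this.
When the queue exceeds its maximum size it evicts its greatest element according to its comparator, so with natural ordering it retains the smallest values. To keep the highest values instead, supply a reversed comparator (use the orderedBy(Comparator<B> comparator) method).
Note: This collection is NOT a sorted collection.
See the Javadoc.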
Example:
@Test
public void test() {
final int maxSize = 5;
// Natural order
final MinMaxPriorityQueue<Integer> queue = MinMaxPriorityQueue
.maximumSize(maxSize).create();
queue.addAll(Arrays.asList(10, 30, 60, 70, 20, 80, 90, 50, 100, 40));
assertEquals(maxSize, queue.size());
assertEquals(new Integer(50), Collections.max(queue));
System.out.println(queue);
}
Output:
[10, 50, 40, 30, 20]
One efficient solution is a slightly tweaked array-based priority queue using a binary min-heap.
The first N integers are simply added to the heap one by one, or you can build the heap from an array of the first N integers (slightly faster).
After that, compare each incoming integer with the root element (which is the MIN value kept so far). If the new integer is larger than the root, replace the root with the new integer and perform a down-heap operation (i.e. trickle the new integer down until both its children are greater than or equal to it, or it becomes a leaf). The data structure guarantees you always hold the N largest integers seen so far, with each addition taking O(log N) time in the worst case.
Here is my C# implementation; the method described above is named "EnqueueDown". "EnqueueUp" is a standard enqueue operation that adds a new leaf to the array and trickles it up.
I have tested it on 1M numbers with max heap size of 1000 and it runs under 200 ms:
namespace ImagingShop.Research.FastPriorityQueue
{
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
public sealed class FastPriorityQueue<T> : IEnumerable<Tuple<T, float>>
{
private readonly int capacity;
private readonly Tuple<T, float>[] nodes;
private int count = 0;
public FastPriorityQueue(int capacity)
{
this.capacity = capacity;
this.nodes = new Tuple<T, float>[capacity];
}
public int Capacity => this.capacity;
public int Count => this.count;
public T FirstNode => this.nodes[0].Item1;
public float FirstPriority => this.nodes[0].Item2;
public void Clear()
{
this.count = 0;
}
public bool Contains(T node) => this.nodes.Any(tuple => Equals(tuple.Item1, node));
public T Dequeue()
{
T nodeHead = this.nodes[0].Item1;
int index = (this.count - 1);
this.nodes[0] = this.nodes[index];
this.count--;
DownHeap(0); // sift the moved element down from the root
return nodeHead;
}
public void EnqueueDown(T node, float priority)
{
if (this.count == this.capacity)
{
if (priority < this.nodes[0].Item2)
{
return;
}
this.nodes[0] = Tuple.Create(node, priority);
DownHeap(0);
return;
}
int index = this.count;
this.count++;
this.nodes[index] = Tuple.Create(node, priority);
UpHeap(index);
}
public void EnqueueUp(T node, float priority)
{
int index = this.count;
this.count++;
this.nodes[index] = Tuple.Create(node, priority);
UpHeap(index);
}
public IEnumerator<Tuple<T, float>> GetEnumerator()
{
for (int i = 0; i < this.count; i++) yield return this.nodes[i];
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void DownHeap(int index)
{
while (true)
{
int indexLeft = (index << 1);
int indexRight = (indexLeft | 1);
int indexMin = ((indexLeft < this.count) && (this.nodes[indexLeft].Item2 < this.nodes[index].Item2))
? indexLeft
: index;
if ((indexRight < this.count) && (this.nodes[indexRight].Item2 < this.nodes[indexMin].Item2))
{
indexMin = indexRight;
}
if (indexMin == index)
{
break;
}
Flip(index, indexMin);
index = indexMin;
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void Flip(int indexA, int indexB)
{
var temp = this.nodes[indexA];
this.nodes[indexA] = this.nodes[indexB];
this.nodes[indexB] = temp;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void UpHeap(int index)
{
while (true)
{
if (index == 0)
{
break;
}
int indexParent = (index >> 1);
if (this.nodes[indexParent].Item2 <= this.nodes[index].Item2)
{
break;
}
Flip(index, indexParent);
index = indexParent;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
The basic implementation is taken from "Cormen, Thomas H. Introduction to algorithms. MIT press, 2009."
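Since the question is about Java, the same idea can be sketched there as well. This is not a port of the C# class above: the class TopNHeap and its methods are hypothetical names, it stores plain int values rather than (node, priority) pairs, and it uses the conventional 0-based child indices 2 * i + 1 and 2 * i + 2.

```java
import java.util.Arrays;

public class TopNHeap {
    private final int[] heap; // binary min-heap, root (smallest kept value) at index 0
    private int size = 0;

    public TopNHeap(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        heap = new int[capacity];
    }

    /** Offers a value; it is kept only if it is among the N largest seen so far. */
    public void offer(int value) {
        if (size < heap.length) {
            // Heap not full yet: append as a leaf and trickle up.
            heap[size] = value;
            siftUp(size);
            size++;
        } else if (value > heap[0]) {
            // Heap full: replace the current minimum and trickle down.
            heap[0] = value;
            siftDown(0);
        }
    }

    /** Returns the retained values in ascending order. */
    public int[] toSortedArray() {
        int[] copy = Arrays.copyOf(heap, size);
        Arrays.sort(copy);
        return copy;
    }

    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap[parent] <= heap[i]) {
                return;
            }
            swap(i, parent);
            i = parent;
        }
    }

    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (left < size && heap[left] < heap[smallest]) {
                smallest = left;
            }
            if (right < size && heap[right] < heap[smallest]) {
                smallest = right;
            }
            if (smallest == i) {
                return;
            }
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int a, int b) {
        int t = heap[a];
        heap[a] = heap[b];
        heap[b] = t;
    }
}
```

A caller would create new TopNHeap(1000), call offer(value) for each incoming integer, and read the result with toSortedArray() at the end.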
In Java 1.7 one may use java.util.PriorityQueue. To keep the top N items, order the queue ascending (for integers, the natural order, which is what the comparator below implements). That way the smallest number is always at the head and can be removed when there are too many items in the queue.
package eu.pawelsz.example.topn;
import java.util.Comparator;
import java.util.PriorityQueue;
public class TopN {
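// Assumes the queue was created with an explicit comparator that puts the smallest element at the head.
public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
if (priorityQueue.size() < keep) {
priorityQueue.add(element);
} else if (priorityQueue.comparator().compare(element, priorityQueue.peek()) > 0) {
// Replace the current smallest only when the new element is larger.
priorityQueue.poll();
priorityQueue.add(element);
}
}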
public static void main(String[] args) {
int N = 4;
PriorityQueue<Integer> topN = new PriorityQueue<>(N, new Comparator<Integer>() {
@Override
public int compare(Integer o1, Integer o2) {
return Integer.compare(o1, o2); // avoids the overflow that o1 - o2 can cause
}
});
add(N, topN, 1);
add(N, topN, 2);
add(N, topN, 3);
add(N, topN, 4);
System.out.println("smallest: " + topN.peek());
add(N, topN, 8);
System.out.println("smallest: " + topN.peek());
add(N, topN, 5);
System.out.println("smallest: " + topN.peek());
add(N, topN, 2);
System.out.println("smallest: " + topN.peek());
}
}
// Keeps the top K elements in the queue (assumes the smallest element is at the head)
public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
if (priorityQueue.size() < keep) {
priorityQueue.add(element);
}
else if (keep == priorityQueue.size()) {
priorityQueue.add(element); // size is now keep + 1
priorityQueue.poll(); // remove the head (the smallest) to shrink back to keep
}
}
The fastest way is likely a simple array items = new Item[N]; and a revolving cursor int cursor = 0;. The cursor points to the insertion point of the next element.
To add a new element use the method
put(Item newItem) { items[cursor++] = newItem; if(cursor == N) cursor = 0; }
When accessing this structure you can make the last item added appear at index 0 via a small recalculation of the index, i.e.
get(int index) { return items[ cursor > index ? cursor-index-1 : cursor-index-1+N ]; }
(the -1 is because cursor always points at the next insertion point, i.e. cursor-1 is the last element added).
Summary: put(item) will add a new item. get(0) will get the last item added, get(1) will get the second last item, etc.
In case you need to take care of the case where n < N elements have been added you just need to check for null.
(TreeSets will likely be slower)
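The put and get snippets above can be assembled into a small class, sketched here under a few assumptions: the class name LastNBuffer is hypothetical, the element type is made generic, and get expects an index between 0 and N-1. Note that this structure retains the N most recently added items, not the N largest.

```java
public class LastNBuffer<T> {
    private final Object[] items;
    private int cursor = 0; // always points at the next insertion point

    public LastNBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        items = new Object[capacity];
    }

    /** Adds an item, overwriting the oldest one once the buffer is full. */
    public void put(T item) {
        items[cursor++] = item;
        if (cursor == items.length) {
            cursor = 0;
        }
    }

    /**
     * Returns the item added (index + 1) puts ago: get(0) is the last item added,
     * get(1) the second last, and so on. Returns null if that slot was never filled.
     */
    @SuppressWarnings("unchecked")
    public T get(int index) {
        if (index < 0 || index >= items.length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int n = items.length;
        int slot = cursor > index ? cursor - index - 1 : cursor - index - 1 + n;
        return (T) items[slot];
    }
}
```

With a capacity of 3, after put(1), put(2), put(3), put(4), the calls get(0), get(1) and get(2) return 4, 3 and 2 respectively; the value 1 has been overwritten.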
Your question is answered here:
Size-limited queue that holds last N elements in Java
To summarize:
No, there is no such data structure in the default Java SDK, but Apache Commons Collections 4 has a CircularFifoQueue.
