Java D-heap implementation - infinite loop in deleteMin() - java

This is my first time asking a question on here, and I'll do my best not to break any formal procedures.
I'm trying to implement a small (and generic) D-ary heap ( http://en.wikipedia.org/wiki/D-ary_heap ) in Java, with the help of Mark Allen Weiss's binary heap code (http://users.cis.fiu.edu/~weiss/dsaajava2/code/BinaryHeap.java), and the code is almost done. However, there seems to be a problem when testing the heap: the test case goes into an infinite loop and I don't know why. I'd really appreciate help with resolving the issue.
Here's the relevant part of the test case ("heap" is a 3-heap):
@Test
public void testFindMin() {
insert(3, 4, 6, 7, 1, 8, 2, 5);
assertTrue(heap.size() == 8);
assertTrue(heap.findMin() == 1);
heap.makeEmpty();
assertTrue(heap.isEmpty());
insert(182, 64, 233, 906, 42, 678);
assertTrue(heap.size() == 6);
assertTrue(heap.findMin() == 42);
heap.printHeap(); //The heap is 42, 64, 233, 906, 182, 678
assertTrue(heap.deleteMin() == 42); //Here's where it gets stuck
assertTrue(heap.size() == 5);
assertTrue(heap.findMin() == 64);
}
And here's my code:
public class MyMiniHeap<T extends Comparable<? super T>> implements MiniHeap<T> {
private int heapSize = 0;
private T[] heapArray;
private static final int DEFCAP = 10;
private int d;
public MyMiniHeap() {
this(2, DEFCAP);
}
public MyMiniHeap(int children) {
this(children, DEFCAP);
}
@SuppressWarnings("unchecked")
public MyMiniHeap(int children, int capacity) {
heapSize = 0;
d = children;
heapArray = (T[]) new Comparable[capacity + 1];
}
/**
* Inserts an element into the heap, placing it correctly according
* to heap properties.
*
* @param element the element to insert.
* @throws IllegalArgumentException if the element to insert is null.
*/
public void insert(T element) {
if (element == null)
throw new IllegalArgumentException("cannot insert null");
if (size() == heapArray.length - 1)
doubleArraySize();
int hole = ++heapSize;
for( ; hole > 1 && element.compareTo(heapArray[getParent(hole)]) < 0; hole = getParent(hole)) {
heapArray[hole] = heapArray[getParent(hole)];
}
heapArray[hole] = element;
}
/**
* Deletes the smallest element in the heap.
*
* @return the smallest element in the heap.
* @throws IllegalStateException if the heap is empty.
*/
public T deleteMin() {
if (isEmpty())
throw new IllegalStateException("Error: Empty heap");
T minItem = findMin();
heapArray[1] = heapArray[heapSize--];
percolateDown(1);
return minItem;
}
/**
* Returns the smallest element in the heap without removing it.
*
* @return the smallest element in the heap.
* @throws IllegalStateException if the heap is empty.
*/
public T findMin() {
if (isEmpty())
throw new IllegalStateException("Error: Empty heap");
return heapArray[1];
}
private void percolateDown(int hole) {
int child = getChild(hole);
int tempChild = getChild(hole);
T tempElement = heapArray[hole];
for( ; getChild(hole) <= size(); hole = child) {
for(int i = 0; i < d && tempChild != size(); i++, tempChild++){
if(heapArray[tempChild + 1].compareTo(heapArray[child]) < 0){
child = tempChild + 1;
}
}
if (heapArray[child].compareTo(tempElement) < 0)
heapArray[hole] = heapArray[child];
else
break;
}
heapArray[hole] = tempElement;
}
@SuppressWarnings("unchecked")
private void doubleArraySize() {
T [] old = heapArray;
heapArray = (T [])new Comparable[old.length * 2];
for (int i = 0; i < old.length; i++)
heapArray[i] = old[i];
}
public boolean isEmpty() {
return size() == 0;
}
public void makeEmpty() {
heapSize = 0;
}
public int size() {
return heapSize;
}
/**
* Finds the index of the first child for a given parent's index.
* This method is normally private, but is used to test the
* correctness of the heap.
*
* @param parent the index of the parent.
* @return an integer with the index of the parent's first child.
*/
public int getChild(int parent) {
return d * (parent - 1) + 2;
}
/**
* Finds the index of a parent for a given child's index.
* This method is normally private, but is used to test
* the correctness of the heap.
*
* @param child the index of the child.
* @return an integer with the child's parent's index.
*/
public int getParent(int child) {
return (child - 2)/d + 1;
}
public void printHeap() {
String output = "";
for (int i = 1; i <= size(); i++)
output += heapArray[i].toString() + " ";
System.out.println(output);
}
}

I think that the bug is in this code:
for( ; getChild(hole) <= size(); hole = child) {
for(int i = 0; i < d && tempChild != size(); i++, tempChild++){
if(heapArray[tempChild + 1].compareTo(heapArray[child]) < 0){
child = tempChild + 1;
}
}
if (heapArray[child].compareTo(tempElement) < 0)
heapArray[hole] = heapArray[child];
else
break;
}
Notice that in this loop, you only change the value of child in the nested for loop, but never elsewhere. This means that if, on some particular iteration, none of the child nodes of the current node is less than the element at index child, then child is never reassigned, and the loop update step hole = child has no effect. If you get unlucky with your heap structure, this could easily cause your infinite loop.
Similarly, you never reassign tempChild in this loop, so on each iteration tempChild picks up where it left off on the previous iteration. If on one of those iterations tempChild equals size, then the inner loop never executes and each outer iteration has no effect, again causing the infinite loop.
To fix this, I think you want to recompute child and tempChild on each iteration, as shown here:
for( ; getChild(hole) <= size(); hole = child) {
child = getChild(hole);
int tempChild = getChild(hole);
for(int i = 0; i < d && tempChild != size(); i++, tempChild++){
if(heapArray[tempChild + 1].compareTo(heapArray[child]) < 0){
child = tempChild + 1;
}
}
if (heapArray[child].compareTo(tempElement) < 0)
heapArray[hole] = heapArray[child];
else
break;
}
I'm not sure if this is correct because I can't test it without access to the base class, but this seems like it's probably the culprit. Try it out and let me know how it works.
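For reference, here is a minimal, self-contained sketch of a percolate-down that recomputes the child range on every pass. It is not the asker's class: DaryHeapSketch and its method names are made up for illustration, it stores plain ints instead of a generic T, and it assumes the same 1-based layout as the question (first child at d * (parent - 1) + 2). It scans exactly the d children of the hole, clamped to the heap size, whereas the inner loop above, as written, appears to look one index past the last child.

```java
import java.util.Arrays;

/** Hypothetical d-ary min-heap of ints, 1-based like the question's array layout. */
final class DaryHeapSketch {
    private int[] a = new int[11]; // index 0 is unused
    private int size = 0;
    private final int d;

    DaryHeapSketch(int d) {
        this.d = d;
    }

    int size() {
        return size;
    }

    private int firstChild(int parent) {
        return d * (parent - 1) + 2;
    }

    private int parent(int child) {
        return (child - 2) / d + 1;
    }

    void insert(int x) {
        if (size == a.length - 1) {
            a = Arrays.copyOf(a, a.length * 2);
        }
        // Percolate up: move parents down until x fits.
        int hole = ++size;
        for (; hole > 1 && x < a[parent(hole)]; hole = parent(hole)) {
            a[hole] = a[parent(hole)];
        }
        a[hole] = x;
    }

    int deleteMin() {
        if (size == 0) {
            throw new IllegalStateException("Error: Empty heap");
        }
        int min = a[1];
        a[1] = a[size--];
        percolateDown(1);
        return min;
    }

    private void percolateDown(int hole) {
        int tmp = a[hole];
        while (firstChild(hole) <= size) {
            // Recompute the child range for the current hole on every pass.
            int first = firstChild(hole);
            int last = Math.min(first + d - 1, size);
            int smallest = first;
            for (int c = first + 1; c <= last; c++) {
                if (a[c] < a[smallest]) {
                    smallest = c;
                }
            }
            if (a[smallest] >= tmp) {
                break; // tmp is no larger than any child, so it belongs here
            }
            a[hole] = a[smallest];
            hole = smallest;
        }
        a[hole] = tmp;
    }
}
```

With d = 3 and the values from the test case (182, 64, 233, 906, 42, 678), deleteMin() on this sketch returns 42 and then 64, and the loop terminates because hole strictly moves down the tree on every pass.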

Related

How to heapify Max-heap?

I'm trying to implement a max-heap with two methods, insert and extract_max.
But extract_max is currently not working correctly: it's not extracting the largest integer in the heap, which I assume is because of heapify. I've been trying to debug for hours but can't figure out where it goes wrong. Any input would be highly appreciated.
class Heap {
int heap_array[];
int n_elems = 0;
int capacity;
// Constructor
Heap(int _capacity) {
capacity = _capacity;
heap_array = new int[capacity];
}
/**
* Private method for maintaining the heap.
* @param i index of the element to heapify from
**/
private void heapify(int i) {
int left = 2*i + 1;
int right = 2*i+ 2;
int largest = i;
//if left ≤ heap_length[A] and A[left] > A[largest] then:
if (left <= n_elems && heap_array[left] > heap_array[largest]) {
largest = left;
//System.out.println("largest = left");
}
//if right ≤ heap_length[A] and A[right] > A[largest] then:
if (right <= n_elems && heap_array[right] > heap_array[largest]) {
//System.out.println("largest = right");
largest = right;
}
//if largest ≠ i then:
if (largest != i) {
int swap = heap_array[i];
heap_array[i] = heap_array[largest];
heap_array[largest] = swap;
// Recursively heapify the affected sub-tree
heapify(largest);
}
}
/**
* Add an element to the heap and ensure the heap property
* Throws an exception if trying to add elements to a full heap.
* @param x Element to add
*/
public void insert(int x) throws Exception {
if(is_full()) {
throw new Exception("The heap is full");
} else {
// Insert the element at end of Heap
heap_array[n_elems++] = x;
//n_elems++;
// Heapify from root
heapify(0);
}
}
public int extract_max() throws Exception {
//Get the largest
// Get the last element
int root = heap_array[0];
int lastElement = heap_array[n_elems];
// Replace root with first element
heap_array[0] = lastElement;
// Decrease size of heap by 1
n_elems--;
// heapify the root node
heapify(0);
// return new size of Heap
return root;
}
public int capacity() {
return capacity;
}
public int size() {
return n_elems;
}
public boolean is_empty() {
return n_elems == 0;
}
public boolean is_full() {
return n_elems == capacity;
}
public void print() {
for(int i = 0; i < n_elems; i++) {
System.out.println(heap_array[i]);
}
}
/**
* Remove and return largest element, and maintain the heap property.
* Throws an exception if trying to extract an element from an empty heap.
*/
/**
* For convenience, a small program to test the code.
* There are better ways of doing this kind of testing!
* @throws Exception
*
*/
static public void main(String args[]) throws Exception { // A simple test program
// Declare two heaps. Both should work nicely!
Heap h1 = new Heap(100);
Heap h2 = new Heap(10);
int data[] = {1, 4, 10, 14, 7, 9, 3, 8, 16};
//
// Insert 1 element to heap 1, and several to heap 2.
//
h2.insert(9);
h2.insert(10);
h2.insert(8);
h2.insert(11);
h2.insert(12);
h2.insert(15);
System.out.println("Size " + h2.size());
h2.print();
System.out.println("Max " + h2.extract_max());
}
}
The first problem is that your insert isn't correct. Just adding to the end and calling heapify(0) doesn't do you any good. heapify is going to examine the root element and its two children, decide that the root is the largest item, and exit, doing nothing. As a result, you're just adding things to the list sequentially.
To insert into a max-heap, you do the following:
1. Add the new item to the end of the heap.
2. Move the item up the heap to its proper position.
So insert should look like this:
public void insert(int x) throws Exception {
if(is_full()) {
throw new Exception("The heap is full");
}
// Insert the element at end of Heap
heap_array[n_elems++] = x;
// now sift it up
int current = n_elems-1;
int parent = (current-1)/2;
while (current > 0 && heap_array[current] > heap_array[parent]) {
int swap = heap_array[parent];
heap_array[parent] = heap_array[current];
heap_array[current] = swap;
current = parent;
parent = (current-1)/2;
}
}
I think you also have a problem in extract_max. You have:
int lastElement = heap_array[n_elems];
But the last element is actually at index n_elems-1. I think you want:
int lastElement = heap_array[n_elems-1];
That makes sense because if n_elems == 1, then the only item in the heap will be the root, at heap_array[0].
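Putting both fixes together, a minimal sketch might look like the following. MaxHeapSketch and its camel-case method names are hypothetical stand-ins for the asker's Heap class. One additional assumption on my part: the bounds checks in heapify use a strict < against the element count, because with 0-based indexing the question's left <= n_elems test can read one slot past the live elements.

```java
/** Hypothetical fixed-capacity max-heap of ints, 0-based like the question's Heap class. */
final class MaxHeapSketch {
    private final int[] heap;
    private int n = 0; // number of live elements, stored at indices 0..n-1

    MaxHeapSketch(int capacity) {
        heap = new int[capacity];
    }

    int size() {
        return n;
    }

    void insert(int x) {
        if (n == heap.length) {
            throw new IllegalStateException("The heap is full");
        }
        heap[n++] = x;
        // Sift the new item up until its parent is at least as large.
        int current = n - 1;
        while (current > 0 && heap[current] > heap[(current - 1) / 2]) {
            int parent = (current - 1) / 2;
            swap(current, parent);
            current = parent;
        }
    }

    int extractMax() {
        if (n == 0) {
            throw new IllegalStateException("The heap is empty");
        }
        int root = heap[0];
        heap[0] = heap[n - 1]; // the last live element sits at n - 1
        n--;
        heapify(0);
        return root;
    }

    /** Sift the element at index i down until neither child is larger. */
    private void heapify(int i) {
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        int largest = i;
        if (left < n && heap[left] > heap[largest]) {
            largest = left;
        }
        if (right < n && heap[right] > heap[largest]) {
            largest = right;
        }
        if (largest != i) {
            swap(i, largest);
            heapify(largest);
        }
    }

    private void swap(int i, int j) {
        int t = heap[i];
        heap[i] = heap[j];
        heap[j] = t;
    }
}
```

Inserting 9, 10, 8, 11, 12, 15 as in the question's main method and then calling extractMax() repeatedly yields the values in descending order, starting with 15.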

How to decrease the size of an array after removing some elements from it

I'm working on a project and this is one of the tasks:
Create a method called "remove()" which can remove an element from an array.
After removing it, if the number of elements is less than 1/4 of the array's length, the size of the array needs to decrease by half.
For example:
I have a size-100 array with 25 elements in it. After removing one element, I will have 24 elements, and the size of my array will be 50.
Here is my code:
//First create a method decrease
private decrease() {
if (numElement < (1 / 4) * (Array.length)) {
Array[] newarray = new Array[(Array.length) / 2];
for (int i = 0; i < numElement; i++)
newarray[i] = Array[i];
Array = newarray;
}
//Then create my Remove method
public void remove(ToRemove){
if (numElement > 0) { //First check if my array is empty
for (int i = 0; i < numElement; i++) {
if (Array[i].equals(ToRemove)) {
Array[i] = Array[numElement - 1];
Array[numElement - 1] = null;
numElement--;
decrease();
}
}
//if the Array is empty, also decrease the size
decrease();
}
After some test runs, my remove works fine, but the Array length never decreases no matter what size I put in.
Can someone help me?
Thanks
The condition is the problem: in Java, 1 / 4 is integer division and evaluates to 0, so (1 / 4) * (Array.length) is always 0 and numElement < 0 is never true. You should just use if (numElement < (Array.length / 4)) rather than (1/4) * (Array.length); there is no need for any casting. Java integer division truncates the result, if that's the behavior you expect.
Also, you should be able to just use Arrays.copyOfRange and System.arraycopy to achieve your copying needs.
https://docs.oracle.com/javase/7/docs/api/java/lang/System.html
https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html
Here's a simple code snippet that basically implements removing elements from arrays.
import java.lang.reflect.Array;
import java.util.Arrays;
public class MySpecialArray<T> {
T[] buf;
int size;
Class<T> type;
public MySpecialArray(Class<T> type, int initialBufSize) {
this.size = 0;
this.type = type;
buf = (T[]) Array.newInstance(type, initialBufSize);
}
/**
* Like arraylist add, it will basically add freely until it reaches the max length of the buffer.
* Then it has to expand the buffer. It uses buf.length * 2 + 1 to account for when an initialBufSize of 0 is
* supplied.
* @param elem the element to add
*/
public void add(T elem) {
if (this.size == this.buf.length) {
int newSize = this.buf.length * 2 + 1;
buf = Arrays.copyOf(buf, newSize);
}
this.buf[this.size] = elem;
this.size += 1;
}
public void add(T...elements) {
for(T elem : elements) {
this.add(elem);
}
}
/**
* Remove all occurrences of an element. Also halve the buffer if the utilized size drops below a fourth of the buffer length.
* @param removeMe element to remove all occurrences of
*/
public void remove(T removeMe) {
for (int i = 0; i < this.size; i++) {
if (buf[i].equals(removeMe)) {
// Shift the tail left by one; only size - i - 1 elements follow index i.
System.arraycopy(buf, i + 1, buf, i, this.size - i - 1);
this.size -= 1;
buf[this.size] = null; // clear the stale last slot
if (this.size < this.buf.length / 4) {
this.buf = Arrays.copyOf(buf, this.buf.length / 2);
}
i--; // re-check index i, which now holds the next element
}
}
}
/**
* Remove the last element
* @return the removed element
*/
public T remove() {
if (this.size == 0) {
throw new RuntimeException("Cannot remove from empty buffer");
}
T removed = this.buf[this.size -1];
this.size -= 1;
if (this.size <= this.buf.length / 4) {
int newSize = this.buf.length / 2;
this.buf = Arrays.copyOf(this.buf, newSize);
}
return removed;
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
for(int i = 0; i < this.size; i++) {
sb.append(this.buf[i].toString()).append(",");
}
return sb.toString();
}
public static void main(String...args) {
MySpecialArray<Integer> arr = new MySpecialArray<>(Integer.class, 50);
arr.add(10, 2, 4, 3, 5, 11, 9, 3, 8, 16);
System.out.println("===Pre removed===");
System.out.println(arr.buf.length);
System.out.println(arr.size);
System.out.println(arr);
arr.remove(3);
System.out.println("===After removing 3===");
System.out.println(arr.buf.length);
System.out.println(arr.size);
System.out.println(arr);
}
}
Running this sample prints:
===Pre removed===
50
10
10,2,4,3,5,11,9,3,8,16,
===After removing 3===
25
8
10,2,4,5,11,9,8,16,
The simple answer is: you cannot.
Java arrays are of fixed size, and you can't change the size of an array object once it is created.
If you need to change the size, you either need to copy its contents to a new array object or use a different data structure, like ArrayList, which does this internally.
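As a rough illustration of the "copy to a new array" route, here is a small sketch of the shrink-by-half rule from the question. ShrinkingBagSketch, capacity(), count() and shrinkIfSparse() are hypothetical names, and it stores plain Objects to stay short. The check is written as count * 4 < length, which sidesteps integer division entirely.

```java
import java.util.Arrays;

/** Hypothetical unordered container that halves its backing array when under a quarter full. */
final class ShrinkingBagSketch {
    private Object[] items;
    private int count = 0;

    ShrinkingBagSketch(int capacity) {
        items = new Object[capacity];
    }

    int capacity() {
        return items.length;
    }

    int count() {
        return count;
    }

    void add(Object item) {
        if (count == items.length) {
            // Grow by doubling; Math.max covers a zero-length backing array.
            items = Arrays.copyOf(items, Math.max(1, items.length * 2));
        }
        items[count++] = item;
    }

    /** Removes the first occurrence of toRemove; returns false if it is absent. */
    boolean remove(Object toRemove) {
        for (int i = 0; i < count; i++) {
            if (items[i].equals(toRemove)) {
                items[i] = items[count - 1]; // fill the gap with the last element
                items[--count] = null;
                shrinkIfSparse();
                return true;
            }
        }
        return false;
    }

    private void shrinkIfSparse() {
        // (1 / 4) * length would always be 0 in Java; multiplying avoids that trap.
        if (count * 4 < items.length) {
            items = Arrays.copyOf(items, items.length / 2);
        }
    }
}
```

With a capacity of 100 and 25 elements, removing one element leaves 24, and since 24 * 4 = 96 is less than 100 the backing array is replaced by one of length 50, matching the example in the question.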

How do i implement heapSort on my heap?

Okay, so this is one of my last assignments, and of course it is the one creating the most stress for me. The only thing keeping me from turning it in is applying heapsort to the heap. The user enters their own integer values, which are stored in an ArrayList and displayed.
The heap program works fine, but the Heapsort doesn't work, or rather I can't figure out how to call it from the HeapApp class. Here is the code:
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.NoSuchElementException;
import java.util.Scanner;
/**
*/
public class Heap<T extends Comparable<T>> {
private ArrayList<T> items;
public Heap() {
items = new ArrayList<T>();
}
private void siftUp() {
int k = items.size() - 1;
while (k > 0) {
int p = (k-1)/2;
T item = items.get(k);
T parent = items.get(p);
if (item.compareTo(parent) > 0) {
// swap
items.set(k, parent);
items.set(p, item);
// move up one level
k = p;
} else {
break;
}
}
}
public void insert(T item) {
items.add(item);
siftUp();
}
private void siftDown() {
int k = 0;
int l = 2*k+1;
while (l < items.size()) {
int max=l, r=l+1;
if (r < items.size()) { // there is a right child
if (items.get(r).compareTo(items.get(l)) > 0) {
max++;
}
}
if (items.get(k).compareTo(items.get(max)) < 0) {
// switch
T temp = items.get(k);
items.set(k, items.get(max));
items.set(max, temp);
k = max;
l = 2*k+1;
} else {
break;
}
}
}
public T delete()
throws NoSuchElementException {
if (items.size() == 0) {
throw new NoSuchElementException();
}
if (items.size() == 1) {
return items.remove(0);
}
T hold = items.get(0);
items.set(0, items.remove(items.size()-1));
siftDown();
return hold;
}
public int size() {
return items.size();
}
public boolean isEmpty() {
return items.isEmpty();
}
public String toString() {
return items.toString();
}
//----------------------------------------------------------------------------------------------------------------------------------------
public class Heapsort<T extends Comparable<T>> {
/**
* Sort the array a[0..n-1] by the heapsort algorithm.
*
* @param a the array to be sorted
* @param n the number of elements of a that have valid values
*/
public void sort(T[] a, int n) {
heapsort(a, n - 1);
}
/**
* Sort the ArrayList list by the heapsort algorithm.
* Works by converting the ArrayList to an array, sorting the
* array, and converting the result back to the ArrayList.
*
* @param list the ArrayList to be sorted
*/
public void sort(ArrayList<T> items) {
// Convert list to an array.
@SuppressWarnings("unchecked")
T[] a = (T[]) items.toArray((T[]) Array.newInstance(items.get(0).getClass(), items.size()));
sort(a, items.size()); // sort the array
// Copy the sorted array elements back into the list.
for (int i = 0; i < a.length; i++)
items.set(i, a[i]);
}
/**
* Sort the array a[0..lastLeaf] by the heapsort algorithm.
*
* @param items the array holding the heap
* @param lastLeaf the position of the last leaf in the array
*/
private void heapsort(T[] items, int lastLeaf) {
// First, turn the array a[0..lastLeaf] into a max-heap.
buildMaxHeap(items, lastLeaf);
// Once the array is a max-heap, repeatedly swap the root
// with the last leaf, putting the largest remaining element
// in the last leaf's position, declare this last leaf to no
// longer be in the heap, and then fix up the heap.
while (lastLeaf > 0) {
swap(items, 0, lastLeaf); // swap the root with the last leaf
lastLeaf--; // the last leaf is no longer in the heap
maxHeapify(items, 0, lastLeaf); // fix up what's left
}
}
/**
* Restore the max-heap property. When this method is called, the max-heap
* property holds everywhere, except possibly at node i and its children. When
* this method returns, the max-heap property holds everywhere.
*
* @param items the array holding the heap
* @param i index of the node that might violate the max-heap property
* @param lastLeaf the position of the last leaf in the array
*/
private void maxHeapify(T[] items, int i, int lastLeaf) {
int left = leftChild(i); // index of node i's left child
int right = rightChild(i); // index of node i's right child
int largest; // will hold the index of the node with the largest element
// among node i, left, and right
// Is there a left child and, if so, does the left child have an
// element larger than node i?
if (left <= lastLeaf && items[left].compareTo(items[i]) > 0)
largest = left; // yes, so the left child is the largest so far
else
largest = i; // no, so node i is the largest so far
// Is there a right child and, if so, does the right child have an
// element larger than the larger of node i and the left child?
if (right <= lastLeaf && items[right].compareTo(items[largest]) > 0)
largest = right; // yes, so the right child is the largest
// If node i holds an element larger than both the left and right
// children, then the max-heap property already held, and we need do
// nothing more. Otherwise, we need to swap node i with the larger
// of the two children, and then recurse down the heap from the larger
// child.
if (largest != i) {
swap(items, i, largest);
maxHeapify(items, largest, lastLeaf);
}
}
/**
* Form array a[0..lastLeaf] into a max-heap.
*
* @param items array to be heapified
* @param lastLeaf position of last valid data in a
*/
private void buildMaxHeap(T[] items, int lastLeaf) {
int lastNonLeaf = (lastLeaf - 1) / 2; // nodes lastNonLeaf+1 to lastLeaf are leaves
for (int j = lastNonLeaf; j >= 0; j--)
maxHeapify(items, j, lastLeaf);
}
/**
* Swap two locations i and j in array a.
*
* @param items the array
* @param i first position
* @param j second position
*/
private void swap(T[] items, int i, int j) {
T t = items[i];
items[i] = items[j];
items[j] = t;
}
/**
* Return the index of the left child of node i.
*
* @param i index of the parent node
* @return index of the left child of node i
*/
private int leftChild(int i) {
return 2 * i + 1;
}
/**
* Return the index of the right child of node i.
*
* @param i index of the parent node
* @return the index of the right child of node i
*/
private int rightChild(int i) {
return 2 * i + 2;
}
/**
* For debugging and testing, print out an array.
*
* @param a the array to print
* @param n number of elements of a to print
*/
public void printArray(T[] items, int n) {
for (int i = 0; i < n; i++)
System.out.println(items[i]);
}
}
}
import java.util.Scanner;
public class HeapApp{
/**
* @param args
*/
public static void main(String[] args) {
Heap<Integer> hp = new Heap<Integer>();
Scanner sc = new Scanner(System.in);
System.out.print("Enter next int, 'done' to stop: ");
String line = sc.next();
while (!line.equals("done")) {
hp.insert(Integer.parseInt(line));
System.out.println(hp);
System.out.print("Enter next int, 'done' to stop: ");
line = sc.next();
}
while (hp.isEmpty()) {
//int max = hp.delete();
System.out.println( " " + hp);
}
System.out.println(hp);
System.out.println("After sorting " + hp);
}
}
Now I'm not asking anyone to do it for me; I just need help figuring out how to get the Heapsort to work with the heap. Please help! The most I have tried is setting the parameters within the Heapsort method.
My question and code are not a duplicate. For one, this is based on a heap and heapsort driven by user input:
public static void main(String[] args) {
Heap<Integer> hp = new Heap<Integer>();
Scanner sc = new Scanner(System.in);
System.out.print("Enter next int, 'done' to stop: ");
String line = sc.next();
while (!line.equals("done")) {
hp.insert(Integer.parseInt(line));
System.out.println(hp);
System.out.print("Enter next int, 'done' to stop: ");
line = sc.next();
}
Also the entire Heap is implemented using an ArrayList:
public class Heap<T extends Comparable<T>> {
private ArrayList<T> items;
public Heap() {
items = new ArrayList<T>();
}
Add a sort method to your Heap class like this:
public void sort()
{
new Heapsort<T>().sort(items);
}
Then in your HeapApp class call the sort method before printing it out:
hp.sort();
System.out.println("After sorting " + hp);
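If the array conversion through reflection feels heavy, heapsort can also run directly on the list. The following is only a sketch of that alternative, not the assignment's Heapsort class: ListHeapsortSketch and its methods are hypothetical names, and it uses Collections.swap for the exchanges. Keep in mind that after an ascending sort the list no longer satisfies the max-heap property, so further insert or delete calls on the same heap would not behave as expected.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical in-place heapsort that works directly on a List. */
final class ListHeapsortSketch {

    /** Sorts items into ascending order. */
    static <T extends Comparable<? super T>> void sort(List<T> items) {
        int n = items.size();
        // Build a max-heap bottom-up, starting at the last non-leaf node.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(items, i, n);
        }
        // Repeatedly move the current maximum behind the shrinking heap.
        for (int end = n - 1; end > 0; end--) {
            Collections.swap(items, 0, end);
            siftDown(items, 0, end);
        }
    }

    /** Restores the max-heap property below index i within items[0..size-1]. */
    private static <T extends Comparable<? super T>> void siftDown(List<T> items, int i, int size) {
        while (true) {
            int left = 2 * i + 1;
            if (left >= size) {
                return; // node i is a leaf
            }
            int largest = left;
            int right = left + 1;
            if (right < size && items.get(right).compareTo(items.get(left)) > 0) {
                largest = right;
            }
            if (items.get(i).compareTo(items.get(largest)) >= 0) {
                return; // parent already dominates both children
            }
            Collections.swap(items, i, largest);
            i = largest;
        }
    }
}
```

Inside the Heap class, the sort() method shown above could then delegate to ListHeapsortSketch.sort(items) instead of new Heapsort<T>().sort(items); which of the two is preferable depends on what the assignment requires.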

Removing an element from an array list without using libraries. Java

I have implemented a function which removes an element from an array list. I should not use the ArrayList libraries! See my code below:
/**
* Removes a LendItem at a specified (index) position.
* This function returns the item removed from the list or null if no such item exists. This
* function leaves no gaps, that means all items after the removed item are shifted one position.
* @param list is the list to remove the item from
* @param n is the index of the item to be removed
* @return the removed item
*/
public static LendItem remove(LendItemArrayList list, int n) {
if (list.next == 0) {
return null;
}
if (n < 0 || n > list.INITIAL_SIZE) {
return null;
}
LendItem itemToBeRemoved = list.lendItems[n]; // itemToBeRemoved is the item which has the index n, which we want to remove from the list.
int i;
for (i = n; i < list.next - 1; i++) { // iterate through the list, starting where the index of the itemToBeRemoved is.
list.lendItems[i] = list.lendItems[i + 1];
}
list.lendItems[i] = null;
list.next--;
return itemToBeRemoved;
}
and here is the class :
public class LendItemArrayList {
int INITIAL_SIZE = 20;
boolean resizeable = false;
LendItem[] lendItems = new LendItem[INITIAL_SIZE];
int next = 0;
}
I have tested my functions with a few test methods which have been provided, and I am only failing one of them. Specifically the test is called:
removeNonExistingElement
and it fails like this:
java.lang.AssertionError: 10 elements have been added, next should be 10 (no changes) but found 9.
EDIT:
Added the add() function.
public static boolean add(LendItemArrayList list, LendItem p) {
if (list.next == list.lendItems.length) {
if (list.resizeable == false) {
return false;
}
}
if (list.next == list.lendItems.length) {
if (list.resizeable == true) {
LendItem[] resizedList = new LendItem[list.lendItems.length*2];
for (int i = 0; i < list.next; i++) {
resizedList[i] = list.lendItems[i];
}
list.lendItems = resizedList;
}
}
list.lendItems[list.next++] = p;
return true;
}
Keep this range check, but use >= instead of >, since it checks whether the index is out of range.
if (n < 0 || n >= list.INITIAL_SIZE) {
return null;
}
Next, add this line of code:
if (list.lendItems[n] == null) {
return null;
}
Afterwards, you may or may not add the if statement which checks if the given list is empty. It makes no difference unless it is required to be used.
if (list.next == 0){
return null;
}
Change this line:
if (n < 0 || n > list.INITIAL_SIZE) {
To this:
if (n < 0 || n >= list.INITIAL_SIZE) {
>= means greater or equal. If n == list.INITIAL_SIZE, then that item can't be removed either, because since indices start with 0, the last value in a list has an index of size - 1. It's one of those things that hurt your brain when you start programming.
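Another way to look at the failing test: the valid indices are exactly 0 to next - 1, so comparing n against list.next (rather than INITIAL_SIZE) rejects every non-existing element in one check, including after the array has been resized. The sketch below assumes that reading of the test. RemoveSketch is a hypothetical wrapper, and the nested LendItem is a stand-in, since the real LendItem class is not shown in the question.

```java
/** Hypothetical wrapper; the nested classes mirror the fields shown in the question. */
final class RemoveSketch {

    /** Stand-in for the question's LendItem, whose real fields are not shown. */
    static final class LendItem {
        final String name;

        LendItem(String name) {
            this.name = name;
        }
    }

    static final class LendItemArrayList {
        int INITIAL_SIZE = 20;
        LendItem[] lendItems = new LendItem[INITIAL_SIZE];
        int next = 0;
    }

    /** Removes the item at index n, or returns null and changes nothing if n is not in use. */
    static LendItem remove(LendItemArrayList list, int n) {
        // Occupied slots are 0..next-1; this also covers the empty list (next == 0).
        if (n < 0 || n >= list.next) {
            return null;
        }
        LendItem removed = list.lendItems[n];
        // Shift the tail left by one so no gap remains.
        for (int i = n; i < list.next - 1; i++) {
            list.lendItems[i] = list.lendItems[i + 1];
        }
        list.lendItems[--list.next] = null;
        return removed;
    }
}
```

With 10 items added, remove(list, 15) returns null and leaves next at 10, which is the behavior the removeNonExistingElement test message describes.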

Find an array inside another larger array

I was recently asked to write 3 test programs for a job. They would be written using just core Java API's and any test framework of my choice. Unit tests should be implemented where appropriate.
Although I haven't received any feedback at all, I suppose they didn't like my solutions (otherwise I would have heard from them), so I decided to show my programs here and ask if this implementation can be considered good, and, if not, then why?
To avoid confusion, I'll ask only first one for now.
Implement a function that finds an array in another larger array. It should accept two arrays as parameters and it will return the index of the first array where the second array first occurs in full. Eg, findArray([2,3,7,1,20], [7,1]) should return 2.
I didn't try to find any existing solution, but instead wanted to do it myself.
Possible reasons:
1. Should be static.
2. Should use line comments instead of block ones.
3. Didn't check for null values first (I know, just spotted too late).
4. ?
UPDATE:
Quite a few reasons have been presented, and it's very difficult for me to choose one answer as many answers have a good solution. As @adietrich mentioned, I tend to believe they wanted me to demonstrate knowledge of the core API (they even asked me to write a function, not an algorithm).
I believe the best way to secure the job was to provide as many solutions as possible, including:
1. Implementation using Collections.indexOfSubList() method to show that I know core collections API.
2. Implement using brute-force approach, but provide a more elegant solution.
3. Implement using a search algorithm, for example Boyer-Moore.
4. Implement using a combination of System.arraycopy() and Arrays.equals(). Although not the best solution in terms of performance, it would show my knowledge of standard array routines.
Thank you all for your answers!
END OF UPDATE.
Here is what I wrote:
Actual program:
package com.example.common.utils;
/**
* This class contains functions for array manipulations.
*
* @author Roman
*
*/
public class ArrayUtils {
/**
* Finds a sub array in a large array
*
* @param largeArray
* @param subArray
* @return index of sub array
*/
public int findArray(int[] largeArray, int[] subArray) {
/* If any of the arrays is empty then not found */
if (largeArray.length == 0 || subArray.length == 0) {
return -1;
}
/* If subarray is larger than large array then not found */
if (subArray.length > largeArray.length) {
return -1;
}
for (int i = 0; i < largeArray.length; i++) {
/* Check if the next element of large array is the same as the first element of subarray */
if (largeArray[i] == subArray[0]) {
boolean subArrayFound = true;
for (int j = 0; j < subArray.length; j++) {
/* If outside of large array or elements not equal then leave the loop */
if (largeArray.length <= i+j || subArray[j] != largeArray[i+j]) {
subArrayFound = false;
break;
}
}
/* Sub array found - return its index */
if (subArrayFound) {
return i;
}
}
}
/* Return default value */
return -1;
}
}
Test code:
package com.example.common.utils;
import com.example.common.utils.ArrayUtils;
import junit.framework.TestCase;
public class ArrayUtilsTest extends TestCase {
private ArrayUtils arrayUtils = new ArrayUtils();
public void testFindArrayDoesntExist() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {8,9,10};
int expected = -1;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArrayExistSimple() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {3,4,5};
int expected = 2;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArrayExistFirstPosition() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {1,2,3};
int expected = 0;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArrayExistLastPosition() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {5,6,7};
int expected = 4;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArrayDoesntExistPartiallyEqual() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {6,7,8};
int expected = -1;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArrayExistPartiallyEqual() {
int[] largeArray = {1,2,3,1,2,3,4,5,6,7};
int[] subArray = {1,2,3,4};
int expected = 3;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArraySubArrayEmpty() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {};
int expected = -1;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArraySubArrayLargerThanArray() {
int[] largeArray = {1,2,3,4,5,6,7};
int[] subArray = {4,5,6,7,8,9,10,11};
int expected = -1;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
public void testFindArrayExistsVeryComplex() {
int[] largeArray = {1234, 56, -345, 789, 23456, 6745};
int[] subArray = {56, -345, 789};
int expected = 1;
int actual = arrayUtils.findArray(largeArray, subArray);
assertEquals(expected, actual);
}
}
The requirement of "using just core Java API's" could also mean that they wanted to see whether you would reinvent the wheel. So in addition to your own implementation, you could give the one-line solution, just to be safe:
public static int findArray(Integer[] array, Integer[] subArray)
{
return Collections.indexOfSubList(Arrays.asList(array), Arrays.asList(subArray));
}
It may or may not be a good idea to point out that the example given contains invalid array literals.
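One caveat with the one-liner: it takes Integer[], while the question's method takes int[]. Calling Arrays.asList on an int[] produces a one-element List<int[]>, so the primitives have to be boxed first. A possible sketch, assuming Java 8 or later for the stream API (IndexOfSubListSketch and box are hypothetical names):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical adapter that lets Collections.indexOfSubList work on int[] inputs. */
final class IndexOfSubListSketch {

    /** Boxes an int[] into a List<Integer>. */
    static List<Integer> box(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    /** Returns the first index at which subArray occurs in array, or -1 if it does not occur. */
    static int findArray(int[] array, int[] subArray) {
        return Collections.indexOfSubList(box(array), box(subArray));
    }
}
```

Note one behavioral difference from the question's implementation: for an empty subArray, Collections.indexOfSubList returns 0 rather than -1, so testFindArraySubArrayEmpty would need an explicit guard if -1 is the required result.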
Clean and improved code
public static int findArrayIndex(int[] subArray, int[] parentArray) {
if (subArray.length == 0) {
return -1;
}
// Last start index at which subArray still fits inside parentArray.
int limit = parentArray.length - subArray.length;
for (int i = 0; i <= limit; i++) {
int matched = 0;
// Count matching elements from position i; stop at the first mismatch.
while (matched < subArray.length && parentArray[i + matched] == subArray[matched]) {
matched++;
}
if (matched == subArray.length) {
return i;
}
}
return -1;
}
For finding an array of integers in a larger array of integers, you can use the same kind of algorithms as finding a substring in a larger string. For this there are many algorithms known (see Wikipedia). Especially the Boyer-Moore string search is efficient for large arrays. The algorithm that you are trying to implement is not very efficient (Wikipedia calls this the 'naive' implementation).
For your questions:
Yes, such a method should be static
Don't care, that's a question of taste
The null check can be included, or you should state in the JavaDoc that null values are not allowed, or JavaDoc should state that when either parameter is null a NullPointerException will be thrown.
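The null-handling option from the last point could be sketched like this: reject nulls explicitly and say so in the JavaDoc. The class name NullCheckedSearch, the messages, and the -1 result for an empty sub array are assumptions for the example.

```java
import java.util.Objects;

// Hypothetical class name, for illustration only.
final class NullCheckedSearch {
    private NullCheckedSearch() {}

    /**
     * Returns the index of the first occurrence of subArray in largeArray,
     * or -1 if there is none or subArray is empty.
     *
     * @throws NullPointerException if largeArray or subArray is null
     */
    static int findArray(int[] largeArray, int[] subArray) {
        // Fail fast with a descriptive message instead of an incidental NPE later on.
        Objects.requireNonNull(largeArray, "largeArray must not be null");
        Objects.requireNonNull(subArray, "subArray must not be null");
        if (subArray.length == 0) {
            return -1;
        }
        int limit = largeArray.length - subArray.length;
        outer:
        for (int i = 0; i <= limit; i++) {
            for (int j = 0; j < subArray.length; j++) {
                if (largeArray[i + j] != subArray[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }
}
```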
Well, off the top of my head:
Yes, should be static.
A company complaining about that would not be worth working for.
Yeah, but what would you do? Return? Or throw an exception? It'll throw an exception the way it is already.
I think the main problem is that your code is not very elegant. Too many checks in the inner loop. Too many redundant checks.
Just raw, off the top of my head:
public int findArray(int[] largeArray, int[] subArray) {
int subArrayLength = subArray.length;
if (subArrayLength == 0) {
return -1;
}
int limit = largeArray.length - subArrayLength;
for (int i = 0; i <= limit; i++) {
boolean subArrayFound = true;
for (int j = 0; j < subArrayLength; j++) {
if (subArray[j] != largeArray[i+j]) {
subArrayFound = false;
break;
}
}
/* Sub array found - return its index */
if (subArrayFound) {
return i;
}
}
/* Return default value */
return -1;
}
You could keep that check for the first element so you don't have the overhead of setting up the boolean and the for loop for every single element in the array. Then you'd be looking at
public int findArray(int[] largeArray, int[] subArray) {
int subArrayLength = subArray.length;
if (subArrayLength == 0) {
return -1;
}
int limit = largeArray.length - subArrayLength;
for (int i = 0; i <= limit; i++) {
if (subArray[0] == largeArray[i]) {
boolean subArrayFound = true;
for (int j = 1; j < subArrayLength; j++) {
if (subArray[j] != largeArray[i+j]) {
subArrayFound = false;
break;
}
}
/* Sub array found - return its index */
if (subArrayFound) {
return i;
}
}
}
/* Return default value */
return -1;
}
The following approach uses the KMP pattern matching algorithm. It runs in O(n+m) time, where n is the length of the large array and m is the length of the sub array. For more information, see:
https://en.wikipedia.org/wiki/KMP_algorithm
Brute force takes O(n*m). I just checked that Collections.indexOfSubList method is also O(n*m).
public static int subStringIndex(int[] largeArray, int[] subArray) {
if (largeArray.length == 0 || subArray.length == 0){
throw new IllegalArgumentException();
}
if (subArray.length > largeArray.length){
throw new IllegalArgumentException();
}
int[] prefixArr = getPrefixArr(subArray);
int indexToReturn = -1;
for (int m = 0, s = 0; m < largeArray.length; m++) {
if (subArray[s] == largeArray[m]) {
s++;
} else {
if (s != 0) {
s = prefixArr[s - 1];
m--;
}
}
if (s == subArray.length) {
indexToReturn = m - subArray.length + 1;
break;
}
}
return indexToReturn;
}
private static int[] getPrefixArr(int[] subArray) {
int[] prefixArr = new int[subArray.length];
prefixArr[0] = 0;
for (int i = 1, j = 0; i < prefixArr.length; i++) {
while (subArray[i] != subArray[j]) {
if (j == 0) {
break;
}
j = prefixArr[j - 1];
}
if (subArray[i] == subArray[j]) {
prefixArr[i] = j + 1;
j++;
} else {
prefixArr[i] = j;
}
}
return prefixArr;
}
A slightly optimized version of the code posted earlier:
public int findArray(byte[] largeArray, byte[] subArray) {
if (subArray.length == 0) {
return -1;
}
int limit = largeArray.length - subArray.length;
next:
for (int i = 0; i <= limit; i++) {
for (int j = 0; j < subArray.length; j++) {
if (subArray[j] != largeArray[i+j]) {
continue next;
}
}
/* Sub array found - return its index */
return i;
}
/* Return default value */
return -1;
}
int findSubArr(int[] arr,int[] subarr)
{
int lim=arr.length-subarr.length;
for(int i=0;i<=lim;i++)
{
int[] tmpArr=Arrays.copyOfRange(arr,i,i+subarr.length);
if(Arrays.equals(tmpArr,subarr))
return i; //returns starting index of sub array
}
return -1;//return -1 on finding no sub-array
}
UPDATE:
By reusing the same int array instance:
int findSubArr(int[] arr,int[] subarr)
{
int lim=arr.length-subarr.length;
int[] tmpArr=new int[subarr.length];
for(int i=0;i<=lim;i++)
{
System.arraycopy(arr,i,tmpArr,0,subarr.length);
if(Arrays.equals(tmpArr,subarr))
return i; //returns starting index of sub array
}
return -1;//return -1 on finding no sub-array
}
I would suggest the following improvements:
make the function static so that you can avoid creating an instance
the outer loop condition could be i <= largeArray.length-subArray.length, to avoid a test inside the loop
remove the test (largeArray[i] == subArray[0]) that is redundant
Here's #indexOf from String:
/**
* Code shared by String and StringBuffer to do searches. The
* source is the character array being searched, and the target
* is the string being searched for.
*
* @param source the characters being searched.
* @param sourceOffset offset of the source string.
* @param sourceCount count of the source string.
* @param target the characters being searched for.
* @param targetOffset offset of the target string.
* @param targetCount count of the target string.
* @param fromIndex the index to begin searching from.
*/
static int indexOf(char[] source, int sourceOffset, int sourceCount,
char[] target, int targetOffset, int targetCount,
int fromIndex) {
if (fromIndex >= sourceCount) {
return (targetCount == 0 ? sourceCount : -1);
}
if (fromIndex < 0) {
fromIndex = 0;
}
if (targetCount == 0) {
return fromIndex;
}
char first = target[targetOffset];
int max = sourceOffset + (sourceCount - targetCount);
for (int i = sourceOffset + fromIndex; i <= max; i++) {
/* Look for first character. */
if (source[i] != first) {
while (++i <= max && source[i] != first);
}
/* Found first character, now look at the rest of v2 */
if (i <= max) {
int j = i + 1;
int end = j + targetCount - 1;
for (int k = targetOffset + 1; j < end && source[j]
== target[k]; j++, k++);
if (j == end) {
/* Found whole string. */
return i - sourceOffset;
}
}
}
return -1;
}
First to your possible reasons:
Yes. And the class final with a private constructor.
Shouldn't use this kind of comments at all. The code should be self-explanatory.
You're basically implicitly checking for null by accessing the length field which will throw a NullPointerException. Only in the case of a largeArray.length == 0 and a subArray == null will this slip through.
More potential reasons:
The class doesn't contain any function for array manipulations, opposed to what the documentation says.
The documentation for the method is very sparse. It should state when and which exceptions are thrown (e.g. NullPointerException) and which return value to expect if the second array isn't found or if it is empty.
The code is more complex than needed.
Why is the equality of the first elements so important that it gets its own check?
In the first loop, it is assumed that the second array will be found, which is unintentional.
Unneeded variable and jump (boolean and break), further reducing legibility.
largeArray.length <= i+j is not easy to grasp. Should be checked before the loop, improving the performance along the way.
I'd swap the operands of subArray[j] != largeArray[i+j]. Seems more natural to me.
All in all too long.
The test code is lacking more edge cases (null arrays, first array empty, both arrays empty, first array contained in second array, second array contained multiple times etc.).
Why is the last test case named testFindArrayExistsVeryComplex?
What the exercise is missing is a specification of the component type of the array parameters, respectively the signature of the method. It makes a huge difference whether the component type is a primitive type or a reference type. The solution of adietrich assumes a reference type (thus could be generified as further improvement), mine assumes a primitive type (int).
So here's my shot, concentrating on the code / disregarding documentation and tests:
public final class ArrayUtils {
// main method
public static int indexOf(int[] haystack, int[] needle) {
return indexOf(haystack, needle, 0);
}
// helper methods
private static int indexOf(int[] haystack, int[] needle, int fromIndex) {
for (int i = fromIndex; i <= haystack.length - needle.length; i++) {
if (containsAt(haystack, needle, i)) {
return i;
}
}
return -1;
}
private static boolean containsAt(int[] haystack, int[] needle, int offset) {
for (int i = 0; i < needle.length; i++) {
if (haystack[i + offset] != needle[i]) {
return false;
}
}
return true;
}
// prevent initialization
private ArrayUtils() {}
}
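For reference types, the generification mentioned above might look roughly like the following sketch. The class name GenericArraySearch is an assumption, and Objects.equals is used so that null elements and boxed values compare by value rather than by identity.

```java
import java.util.Objects;

// Hypothetical generic counterpart, for illustration only.
final class GenericArraySearch {
    private GenericArraySearch() {}

    /** Returns the first index at which needle occurs in haystack, or -1. */
    static <T> int indexOf(T[] haystack, T[] needle) {
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            if (containsAt(haystack, needle, i)) {
                return i;
            }
        }
        return -1;
    }

    private static <T> boolean containsAt(T[] haystack, T[] needle, int offset) {
        for (int i = 0; i < needle.length; i++) {
            // equals-based comparison; == would compare references here
            if (!Objects.equals(haystack[i + offset], needle[i])) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape an empty needle is reported at index 0, which matches what String.indexOf does for an empty target.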
byte[] arr1 = {1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 1, 3, 4, 56, 6, 7};
byte[] arr2 = {9, 1, 3};
boolean i = IsContainsSubArray(arr1, arr2);
public static boolean IsContainsSubArray(byte[] Large_Array, byte[] Sub_Array){
int Large_Array_size = Large_Array.length;
int Sub_Array_size = Sub_Array.length;
if (Sub_Array_size == 0 || Sub_Array_size > Large_Array_size) {
return false;
}
// Try every start position. Restarting at i + 1 after a mismatch avoids
// missing overlapping candidates such as {1, 1, 2} searched for {1, 2}.
for (int i = 0; i <= Large_Array_size - Sub_Array_size; i++) {
int k = 0;
while (k < Sub_Array_size && Large_Array[i + k] == Sub_Array[k]) {
k++;
}
if (k == Sub_Array_size) {
return true;
}
}
return false;
}
Code from Guava:
import javax.annotation.Nullable;
/**
* Ensures that an object reference passed as a parameter to the calling method is not null.
*
* @param reference an object reference
* @param errorMessage the exception message to use if the check fails; will be converted to a
* string using {@link String#valueOf(Object)}
* @return the non-null reference that was validated
* @throws NullPointerException if {@code reference} is null
*/
public static <T> T checkNotNull(T reference, @Nullable Object errorMessage) {
if (reference == null) {
throw new NullPointerException(String.valueOf(errorMessage));
}
return reference;
}
/**
* Returns the start position of the first occurrence of the specified {@code
* target} within {@code array}, or {@code -1} if there is no such occurrence.
*
* <p>More formally, returns the lowest index {@code i} such that {@code
* java.util.Arrays.copyOfRange(array, i, i + target.length)} contains exactly
* the same elements as {@code target}.
*
* @param array the array to search for the sequence {@code target}
* @param target the array to search for as a sub-sequence of {@code array}
*/
public static int indexOf(int[] array, int[] target) {
checkNotNull(array, "array");
checkNotNull(target, "target");
if (target.length == 0) {
return 0;
}
outer:
for (int i = 0; i < array.length - target.length + 1; i++) {
for (int j = 0; j < target.length; j++) {
if (array[i + j] != target[j]) {
continue outer;
}
}
return i;
}
return -1;
}
I would do it in one of three ways:
Using no imports, i.e. plain Java statements only.
Using core Java APIs, to a greater or lesser extent.
Using string pattern search algorithms like KMP etc. (Probably the most optimized one.)
Approaches 1, 2 and 3 are all shown in the answers above. Here is my take on approach 2:
public static void findArray(int[] array, int[] subArray) {
if (array == null || subArray == null) {
return;
}
if (subArray.length > array.length) {
return;
}
if (array.length == 0 || subArray.length == 0) {
return;
}
//Solution 1
List<Integer> master = Arrays.stream(array).boxed().collect(Collectors.toList());
List<Integer> pattern = IntStream.of(subArray).boxed().collect(Collectors.toList());
System.out.println(Collections.indexOfSubList(master, pattern));
//Solution2
for (int i = 0; i <= array.length - subArray.length; i++) {
String s = Arrays.toString(Arrays.copyOfRange(array, i, i + subArray.length));
if (s.equals(Arrays.toString(subArray))) {
System.out.println("Found at:" + i);
return;
}
}
System.out.println("Not found.");
}
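Using Java 8 and lambda expressions (note that this only checks that every element of smallArray occurs somewhere in bigArray, not that the elements form one contiguous run):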
String[] smallArray = {"1","2","3"};
final String[] bigArray = {"0","1","2","3","4"};
boolean result = Arrays.stream(smallArray).allMatch(s -> Arrays.stream(bigArray).anyMatch(b -> b.equals(s)));
PS: bigArray must be final (or effectively final) so that the lambda expression can capture it.
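If a contiguous match is wanted from the stream-based style, one possible sketch is to stream over the candidate start positions instead of over the elements. The class name StreamSubArraySearch and the -1 result for an empty smallArray are assumptions for this example.

```java
import java.util.stream.IntStream;

// Hypothetical class name, for illustration only.
final class StreamSubArraySearch {
    private StreamSubArraySearch() {}

    /** Returns the first index at which smallArray occurs contiguously in bigArray, or -1. */
    static int indexOf(String[] bigArray, String[] smallArray) {
        if (smallArray.length == 0 || smallArray.length > bigArray.length) {
            return -1;
        }
        // Outer stream: candidate start positions; inner stream: element-wise comparison.
        return IntStream.rangeClosed(0, bigArray.length - smallArray.length)
                .filter(i -> IntStream.range(0, smallArray.length)
                        .allMatch(j -> bigArray[i + j].equals(smallArray[j])))
                .findFirst()
                .orElse(-1);
    }
}
```

Both arrays are method parameters that are never reassigned, so they are effectively final and can be captured by the lambdas without an explicit final modifier.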
FYI: if the goal is simply to check whether an array y occurs as a contiguous slice of an array x, we can use this (Scala):
val x = Array(1,2,3,4,5)
val y = Array(3,4,5)
val z = Array(3,4,8)
x.containsSlice(y) // true
x.containsSlice(z) // false
