Hash Method Issues, get method specifically - java

So, I am in a Data Structures class and we are writing code for various hashing methods. I am having trouble with the "get" method. The tests run fine until the last one, get("key9"), which is asserted to return null. For some reason the for loop exits and keyStartIndex is initialized again. The method is not recursive, so I have no idea why this is happening. The code is below. Any help is greatly appreciated.
The method I am trying to complete, which is causing the problem:
...
public String get(String key) {
//TODO : complete the method
int keyStartIndex = (int) hashFunction(key) % items.length;
for(int i = keyStartIndex; i < items.length; i++){
if(items[i].key == hashFunction(key)){
return items[i].item;
} else if(i == items.length-1){
i=0;
continue;
}
}
return null;
}
...
All prior code in this class that applies to this method
...
import java.util.Arrays;
import jdk.internal.org.objectweb.asm.tree.analysis.Value;
class DataItem {
long key;
String item;
public DataItem(long key, String item) {
this.key = key;
this.item = item;
}
@Override
public String toString() {
return String.format("{%s:%s}", key, item);
}
}
public class HashMap {
private int size = 0;
private static final int INITIAL_SIZE = 10;
private static final int DELETED_KEY = 0;
private DataItem[] items;
public HashMap() {
items = new DataItem[INITIAL_SIZE];
}
public int size() {
return size;
}
public long hashFunction(String key) {
long hashed = 0;
for(int i = 0; i < key.length(); i++){
hashed += key.charAt(i)*(Math.pow(27, i));
}
return hashed;
}
public void put(String key, String value) throws TableIsFullException {
if (size >= items.length-1){
throw new TableIsFullException();
} else {
DataItem input = new DataItem(hashFunction(key), value);
for(int i = ((int) input.key % items.length); i < items.length; i++){
if(items[i] != null){
continue;
}else if(i == items.length - 1 && items[i] != null){
i = 0;
continue;
} else {
items[i] = input;
size++;
break;
}
}
}
}
...
And here are the tests being run. Only the last assertion fails, the one with "key9". I have run the debugger and it reports a NullPointerException. Again, with breakpoints, for some reason it leaves the for loop and processes another key, key3 to be specific. I have no idea why this is happening.
@Test
public void testGet() throws TableIsFullException {
map.put("key1", "value1");
map.put("key2", "value2");
map.put("key3", "value3");
map.put("key4", "value4");
map.put("key5", "value5");
map.put("key6", "value6");
assertEquals("value3", map.get("key3"));
assertEquals(null, map.get("key9"));
}

Your put and get methods don't implement wraparound correctly. That means that when you get a couple of hash collisions towards the end of the table, things go haywire.
Secondly, you are not handling hash collisions correctly in either method. The contract for hash is that obj1.equals(obj2) implies hash1 == hash2, but not the other way around. That means that DataItem must record the original object as well as the hash. I will assume that you've added the appropriate field, and that DataItem now has three fields named key, hash and value.
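To see why matching hashes do not prove matching keys, here is a small illustration. It uses Java's built-in String.hashCode() rather than the hashFunction from the question, and the class and method names are made up for this sketch.

```java
class CollisionDemo {
    // True when two different keys share one hash code
    static boolean sameHashDifferentKey(String a, String b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" both hash to 2112 under String.hashCode()
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println(sameHashDifferentKey("Aa", "BB"));   // true
    }
}
```

A table that compares only hashes would treat these two keys as the same entry, which is why the stored key has to be compared as well.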
Let's start with put:
The end condition is i >= items.length, so you either have an infinite loop if you wrap around, or you never wrap around.
Since you check items[i] != null first, i == items.length - 1 && items[i] != null can never be triggered, and so you never wrap around when the end of the table is full.
You never check if the existing item matches the new key.
One way to correct put is to treat items as a circular buffer. That means that you add an offset to the loop index, modulo items.length:
long hash = hashFunction(key);
// Take the modulus before narrowing to int, so the cast cannot overflow
int offset = (int) (hash % items.length);
for (int i = 0; i < items.length; i++) {
    int k = (i + offset) % items.length;
    if (items[k] == null) {
        // Free slot: the key is not in the table yet
        items[k] = new DataItem(key, hash, value);
        size++;
        break;
    }
    if (items[k].hash == hash && items[k].key.equals(key)) {
        // Same key already stored: overwrite its value
        items[k].value = value;
        break;
    }
}
You also need to fix your size check before throwing an exception. Checking size >= items.length - 1 will throw an exception when there is one free slot available. The correct condition is
if(size >= items.length) {
Your get method suffers from the same issue with wraparound as put. It also has the problem that you're checking for hash equality and not object equality when you retrieve an object.
long hash = hashFunction(key);
int offset = (int) (hash % items.length);
for (int i = 0; i < items.length; i++) {
    int k = (i + offset) % items.length;
    if (items[k] == null) {
        // A never-used slot ends the probe sequence: the key is absent
        return null;
    }
    if (items[k].hash == hash && items[k].key.equals(key)) {
        return items[k].value;
    }
}
return null;
The check items[k].key.equals(key) is critical for resolving hash collisions correctly. Notice that it is only performed when the hashes match, because of short-circuiting.
Try to avoid recomputing values like hash inside a loop.
This whole scheme breaks down if you try to support a remove operation. If you notice, put will stop searching for matches once it finds an empty slot. This will break down if you can create empty slots before the matching object.
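One common way around the remove problem is to leave a marker (often called a tombstone) in a removed slot instead of setting it back to null, so that probing continues past it. The following is only a minimal sketch of that idea under some assumptions: the class name ProbingMap, the Slot type and the DELETED sentinel are invented here, the capacity is fixed, and String.hashCode() stands in for the question's hashFunction.

```java
/** Minimal fixed-capacity linear-probing map with tombstones (illustrative only). */
class ProbingMap {
    private static final class Slot {
        final String key;
        String value;
        Slot(String key, String value) { this.key = key; this.value = value; }
    }

    // Sentinel marking a removed entry; probes must walk past it
    private static final Slot DELETED = new Slot(null, null);

    private final Slot[] slots;
    private int size = 0;

    ProbingMap(int capacity) { slots = new Slot[capacity]; }

    private int start(String key) {
        // floorMod keeps the index non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), slots.length);
    }

    /** Returns false if the table is full and the key is not already present. */
    boolean put(String key, String value) {
        int offset = start(key);
        int firstFree = -1; // first reusable slot seen along the probe path
        for (int i = 0; i < slots.length; i++) {
            int k = (offset + i) % slots.length;
            Slot s = slots[k];
            if (s == null) {                 // never used: the key cannot be further along
                if (firstFree < 0) firstFree = k;
                break;
            }
            if (s == DELETED) {              // reusable, but keep looking for the key
                if (firstFree < 0) firstFree = k;
            } else if (s.key.equals(key)) {  // existing key: overwrite the value
                s.value = value;
                return true;
            }
        }
        if (firstFree < 0) return false;
        slots[firstFree] = new Slot(key, value);
        size++;
        return true;
    }

    String get(String key) {
        int k = find(key);
        return k < 0 ? null : slots[k].value;
    }

    boolean remove(String key) {
        int k = find(key);
        if (k < 0) return false;
        slots[k] = DELETED; // leave a tombstone so later keys stay reachable
        size--;
        return true;
    }

    int size() { return size; }

    // Index of the slot holding key, or -1 if it is absent
    private int find(String key) {
        int offset = start(key);
        for (int i = 0; i < slots.length; i++) {
            int k = (offset + i) % slots.length;
            Slot s = slots[k];
            if (s == null) return -1;                        // never-used slot ends the probe
            if (s != DELETED && s.key.equals(key)) return k; // compare keys, not hashes
        }
        return -1;
    }
}
```

The trade-off is that tombstones accumulate: a table with many removals probes further on every lookup until it is rebuilt.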

The NullPointerException that you see occurs because in the get() method you write
if(items[i].key == hashFunction(key))
Now if the specific key has not been added to the HashMap (and the items array is not yet full) the entry items[i] is still null and trying to access items[i].key gives the NullPointerException that you see.
The shortest test case that reproduces your problem is:
@Test
public void keyNotFound() throws TableIsFullException {
assertEquals(null, map.get("key9"));
}
Besides that, carefully read the answer of "Mad Physicist", because it addresses other design flaws in your implementation (wraparound and hash-collision handling).

Related

Sort array after deleting value

How can I sort an array after deleting one value?
e.g. {1,2,3,4,5} --> {1,2,null,4,5} --> {1,2,4,5,null}.
public boolean removeNumber(int accountNumber) {
for(int i = 0; i < Account.length; i++) {
if(Account[i].getAccountNumber() == (accountNumber)) {
Account[i] = null;
}
Since your original array is already sorted, and assuming that only one Account can have that particular accountNumber, there is no need to sort the array again. Just locate the element to remove, then shift everything after that element down one position, and null out the newly-freed element.
I've renamed the Account field to accounts to follow Java naming conventions, to make a clearer distinction between the array (accounts) and the class (Account), both for the future developers looking at the code, and for the Java compiler.
public boolean removeNumber(int accountNumber) {
for (int i = 0; i < accounts.length; i++) {
if (accounts[i] != null && accounts[i].getAccountNumber() == accountNumber) { // skip slots nulled by earlier removals
System.arraycopy(accounts, i + 1, accounts, i, accounts.length - i - 1);
accounts[accounts.length - 1] = null;
return true; // Found and removed
}
}
return false; // Not found, so nothing removed
}
Now, if you insist on sorting the array, you can do it like this in Java 8+:
Arrays.sort(accounts, Comparator.nullsLast(Comparator.comparing(Account::getAccountNumber)));
You can achieve this as follows:
Find the accounts that have the accountNumber you are looking for.
Set them to null in the array.
Sort the array with a custom Comparator:
if an element is null, it has to go to the end;
if both are non-null, compare their accountNumber.
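public boolean removeNumber(int accountNumber) {
    boolean removed = false;
    for (int i = 0; i < Account.length; i++) {
        // Skip slots that were already nulled by an earlier removal
        if (Account[i] != null && Account[i].getAccountNumber() == accountNumber) {
            Account[i] = null;
            removed = true;
        }
    }
    // Nulls sort last; non-null accounts sort by account number
    Arrays.sort(Account, (o, p) -> {
        if (o == null)
            return p == null ? 0 : 1;
        if (p == null)
            return -1;
        return Integer.compare(o.getAccountNumber(), p.getAccountNumber());
    });
    return removed;
}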
Tips:
Follow naming conventions: use camelCase for attributes, so Account[] becomes account[].
Use .equals() when comparing objects; for an int, == is correct.

Implementing a generic map using a hash table in Java

Post Details
In a data structures course, I was given Java source code for a "quadratic probing hash table" class and asked to implement a generic map (with get and put methods) and store the key/definition pairs in a hash table. I understand the material when reading the book but find it difficult to implement in a programming language (Java). I think part of the problem is understanding exactly what the question requires and part is deficiency in Java programming experience. I'm hoping to receive some suggestions for how I can approach problems like this and fill in whatever Java knowledge I'm missing.
Some questions I've had
What is the function of the hash table class in relation to the generic map I'm supposed to create? The hash table has several methods including get, insert, remove, rehash, etc... Is the purpose of the hash table to generate a hash value to use as a key in the map class? Are keys and definitions stored in the hash table or will they be stored in the map? What's the point of making a map if the hash table already does all of this?
Can someone help me understand how to approach problems like this? What are some references that might help me, either specifically with this question or with understanding how to effectively and methodically complete this type of exercise?
I appreciate whatever help I can get. I'm including code from the book to help illustrate the problem.
Quadratic Probing Hash Table Code From Textbook
public class QuadraticProbingHashTable<AnyType> {
public QuadraticProbingHashTable() {
this(DEFAULT_TABLE_SIZE);
}
public QuadraticProbingHashTable(int size) {
allocateArray(size);
doClear();
}
public boolean insert(AnyType x) {
int currentPos = findPos(x);
if(isActive(currentPos)) return false;
array[currentPos] = new HashEntry<>(x, true);
theSize++;
if(++occupied > array.length / 2) rehash();
return true;
}
private void rehash() {
HashEntry<AnyType>[] oldArray = array;
allocateArray(2 * oldArray.length);
occupied = 0;
theSize = 0;
for(HashEntry<AnyType> entry : oldArray)
if(entry != null && entry.isActive) insert(entry.element);
}
private int findPos(AnyType x) {
int offset = 1;
int currentPos = myhash(x);
while(array[currentPos] != null && !array[currentPos].element.equals(x)) {
currentPos += offset;
offset += 2;
if(currentPos >= array.length) currentPos -= array.length;
}
return currentPos;
}
public boolean remove(AnyType x) {
int currentPos = findPos(x);
if(isActive(currentPos)) {
array[currentPos].isActive = false;
theSize--;
return true;
} else return false;
}
public int size() {
return theSize;
}
public int capacity() {
return array.length;
}
public boolean contains(AnyType x) {
int currentPos = findPos(x);
return isActive(currentPos);
}
public AnyType get(AnyType x) {
int currentPos = findPos(x);
if(isActive(currentPos)) return array[currentPos].element;
else return null;
}
private boolean isActive(int currentPos) {
return array[currentPos] != null && array[currentPos].isActive;
}
public void makeEmpty() {
doClear( );
}
private void doClear() {
occupied = 0;
for(int i = 0; i < array.length; i++) array[i] = null;
}
private int myhash(AnyType x) {
int hashVal = x.hashCode();
hashVal %= array.length;
if(hashVal < 0) hashVal += array.length;
return hashVal;
}
private static class HashEntry<AnyType> {
public AnyType element;
public boolean isActive;
public HashEntry(AnyType e) {
this(e, true);
}
public HashEntry(AnyType e, boolean i) {
element = e;
isActive = i;
}
}
private static final int DEFAULT_TABLE_SIZE = 101;
private HashEntry<AnyType>[] array;
private int occupied;
private int theSize;
private void allocateArray(int arraySize) {
array = new HashEntry[nextPrime(arraySize)];
}
private static int nextPrime(int n) {
if(n % 2 == 0) n++;
for(; !isPrime(n); n += 2) ;
return n;
}
private static boolean isPrime( int n ) {
if(n == 2 || n == 3) return true;
if(n == 1 || n % 2 == 0) return false;
for(int i = 3; i * i <= n; i += 2)
if(n % i == 0) return false;
return true;
}
}
Map Skeleton From Textbook
class Map<KeyType,ValueType> {
public Map()
public void put(KeyType key, ValueType val)
public ValueType get(KeyType key)
public boolean isEmpty()
public void makeEmpty()
private QuadraticProbingHashTable<Entry<KeyType,ValueType>> items;
private static class Entry<KeyType,ValueType> {
KeyType key;
ValueType value;
}
}
Generally, what you're facing is a problem of implementing a given interface. The Map is the interface - the HashTable is a means of implementing it, the underlying data structure.
However, I understand your confusion as the definition of the HashTable that you were provided seems ill-suited for the job as it does not seem to have an option to use a custom key (instead always relying on the object's hash code for calculating the hash) nor does it have an option to have a custom HashEntry. As the question is specified, I would say the answer is "you can't". Generally, implementing a Map on a HashTable comes down to handling collisions - one approach, which is not very effective but usually works, is that whenever you find a collision (a case where you have differing keys but the same hashes), you rehash the entire table until the collision is no longer there. The more commonly adopted answer is having a multi-level hashtable, which basically recursively stores a hashtable (calculating a different hash function) on each level. Another method is having a hashtable of arrays - where the arrays themselves store lists of elements with the same hash - and rehashing if the number of collisions is too large. Unfortunately, neither of those solutions is directly implementable with the sample class that you were provided. Without further context, I cannot really say more, but it just seems like a badly designed exercise (this is coming from someone who does occasionally torture students with similar things).
One way of hacking this within your framework is to create a Pair type whose hashCode function just calculates key.hashCode(). This way, as a value you could store an array (and then use the array approach I mentioned above) or you could store a single element (and then use the rehash approach). In either solution, handling collisions is the most difficult part: you have to handle cases where the HashTable contains() your Pair, but the value part of the pair doesn't equals() the element that you want to insert.
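As a rough illustration of the key-only Pair idea, here is a sketch in which the entry's equals() and hashCode() look at the key alone, so the table can find a stored entry from a probe that carries just the key. Several things here are assumptions: SimpleTable is a hypothetical stand-in backed by java.util.HashMap (it only mimics the insert, remove, get and makeEmpty calls of the textbook class), SimpleMap and the Entry constructor are invented names, and null keys are not supported.

```java
import java.util.HashMap;

/** Stand-in for the textbook table: only the calls used below (an assumption, not the book's code). */
class SimpleTable<T> {
    private final HashMap<T, T> store = new HashMap<>();

    boolean insert(T x) {
        if (store.containsKey(x)) return false;
        store.put(x, x);
        return true;
    }
    boolean remove(T x) { return store.remove(x) != null; }
    T get(T x) { return store.get(x); } // returns the stored element equal to x
    int size() { return store.size(); }
    void makeEmpty() { store.clear(); }
}

class SimpleMap<K, V> {
    /** Key/value pair whose identity depends on the key only. */
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Entry && key.equals(((Entry<?, ?>) o).key);
        }

        @Override
        public int hashCode() { return key.hashCode(); }
    }

    private final SimpleTable<Entry<K, V>> items = new SimpleTable<>();

    void put(K key, V val) {
        Entry<K, V> e = new Entry<>(key, val);
        items.remove(e); // drop any existing entry with an equal key
        items.insert(e);
    }

    V get(K key) {
        // The probe carries no value; equality is decided by the key alone
        Entry<K, V> probe = new Entry<>(key, null);
        Entry<K, V> found = items.get(probe);
        return found == null ? null : found.value;
    }

    boolean isEmpty() { return items.size() == 0; }
    void makeEmpty() { items.makeEmpty(); }
}
```

The same remove-then-insert sequence should map onto the insert, remove and get methods of the textbook class, although that has not been tried here.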

How to add an integer to a set while iterating?

I have a set of sets of integers: Set<Set<Integer>>.
I need to add integers to the set of sets as if it were a double array. So add(2,3) would have to add integer 3 to the 2nd set.
I know a set is not very suitable for this operation but it's not my call.
The commented line below clearly does not work but it shows the intention.
My question is how to add an integer to a set while iterating?
If it's necessary to identify each set, how would one do this?
@Override
public void add(int a, int b) {
if (!isValidPair(a, b)) {
throw new IllegalStateException("!isValidPair does not hold for (a,b)");
}
Iterator<Set<Integer>> it = relation.iterator();
int i = 0;
while (it.hasNext() && i <= a) {
//it.next().add(b);
i++;
}
}
One fundamental thing you should be aware of, which makes all the existing answers to this question incorrect:
Once an object is added in a Set (similarly, as key in Map), it is not supposed to change (at least not in aspects that will change its equals() and hashCode()). The "Uniqueness" checking is done only when you add the object into the Set.
For example
Set<Set<Integer>> bigSet = new HashSet<>();
Set<Integer> v1 = new HashSet<>(Arrays.asList(1,2));
bigSet.add(v1);
System.out.println("contains " + bigSet.contains(new HashSet<>(Arrays.asList(1,2)))); // True
v1.add(3);
System.out.println("contains " + bigSet.contains(new HashSet<>(Arrays.asList(1,2)))); // False!!
System.out.println("contains " + bigSet.contains(new HashSet<>(Arrays.asList(1,2,3)))); // False!!
You can see how the set is corrupted. It contains a [1,2,3] but contains() does not work, neither for [1,2] nor [1,2,3].
Another fundamental thing is that your so-called '2nd set' may not make sense. Set implementations like HashSet maintain their values in arbitrary order.
So, with these in mind, what you may do is:
First find the n-th value, and remove it
add the value into the removed value set
re-add the value set.
Something like this (a sketch; inputIndex and inputNumber stand for the method's arguments):
int i = 0;
Set<Integer> setToAdd = null;
for (Iterator<Set<Integer>> itr = bigSet.iterator(); itr.hasNext(); ++i) {
    Set<Integer> s = itr.next();
    if (i == inputIndex) {
        // remove the set first, while its hash code is still valid
        itr.remove();
        setToAdd = s;
        break;
    }
}
// update the set and re-add it
if (setToAdd != null) {
    setToAdd.add(inputNumber);
    bigSet.add(setToAdd);
}
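For anyone who wants to try the remove, update, re-add steps end to end, here is a self-contained version with a small check that contains() works afterwards. The class and method names are invented for this sketch, and it uses a LinkedHashSet for the outer set only so that "the n-th set" is predictable; that choice is an assumption, not something the question specifies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

class NestedSetUpdate {
    /** Adds value to the index-th inner set (in iteration order); false if index is out of range. */
    static boolean addToNth(Set<Set<Integer>> outer, int index, int value) {
        Set<Integer> target = null;
        int i = 0;
        for (Iterator<Set<Integer>> it = outer.iterator(); it.hasNext(); i++) {
            Set<Integer> s = it.next();
            if (i == index) {
                it.remove(); // take it out while its hash code is still valid
                target = s;
                break;
            }
        }
        if (target == null) return false;
        target.add(value); // safe to mutate: it is no longer stored in outer
        outer.add(target); // stored again under its new hash code
        return true;
    }

    public static void main(String[] args) {
        Set<Set<Integer>> outer = new LinkedHashSet<>();
        outer.add(new HashSet<>(Arrays.asList(1, 2)));
        outer.add(new HashSet<>(Arrays.asList(7)));
        addToNth(outer, 0, 3);
        System.out.println(outer.contains(new HashSet<>(Arrays.asList(1, 2, 3)))); // true
    }
}
```

Two caveats: with a LinkedHashSet the re-added set moves to the end of the iteration order, so indexes shift after each call, and if the updated set becomes equal to another inner set, the outer set keeps only one of them.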
Use a for-each loop and make your life easier.
public boolean add(int index, int value) {
    // because a and b suck as variable names
    if (index < 0 || index >= relation.size()) {
        return false;
    }
    int iter = 0;
    for (Set<Integer> values : relation) {
        if (iter++ == index) {
            return values.add(value);
        }
    }
    return false;
}
Now all you have to figure out is what to do if relation is unordered, as a Set or a relation are, because in that case a different Set<Integer> could match the same index each time the loop executes.
You can use Iterators from the Guava library like this:
@Override
public void add(int a, int b) {
if (!isValidPair(a, b)) {
throw new IllegalStateException("!isValidPair does not hold for (a,b)");
}
Iterators.get(relation.iterator(), a).add(b);
}
Edit : without Guava:
Iterator<Set<Integer>> iterator = relation.iterator();
for(int i = 0; i < a && iterator.hasNext(); ++i) {
iterator.next();
}
if(iterator.hasNext()) {
iterator.next().add(b);
}

Implementing Linear probing Java

I am trying to implement linear probing. So far, I have this:
public class LinearProbing<Key, Value> {
private int size = 300001;
private Value[] value = (Value[]) new Object[size];
private Key[] key = (Key[]) new Object[size];
public Value put(Key thiskey, Value thisval) {
int hash = thiskey.hashCode();
for (int i = hash; key[i] != null; i = (i + 1) % size) {
if(key[i] == hash)
break;
key[i] = thiskey;
value[i] = thisval;
}
}
}
I am a bit confused about how to proceed from here. Here is my doubt:
When I check the equality key[i] == hash, I get an error saying I can't compare Key and int. So I decided to return an Object from the hashCode method and compare it using this.key[i].equals(hashObject). However, doing this goes against the Javadoc, which says that hashCode returns an int, and I want to keep it that way. How do I solve this?
Please let me know if I wasn't clear.
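One way to look at the Key-versus-int problem: hashCode() is only used to choose the slot where probing starts, and the stored key is then compared with equals(), so the two types never need to be compared with each other. The sketch below illustrates that under some assumptions: the class name, the constructor taking a capacity, and the "table is full" exception are invented here, and there is no resizing or removal.

```java
class LinearProbingSketch<K, V> {
    private final int capacity;
    private final K[] keys;
    private final V[] values;
    private int count = 0;

    @SuppressWarnings("unchecked")
    LinearProbingSketch(int capacity) {
        this.capacity = capacity;
        keys = (K[]) new Object[capacity];
        values = (V[]) new Object[capacity];
    }

    // hashCode() only chooses where probing starts; floorMod avoids negative indexes
    private int indexFor(K key) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    /** Returns the previous value for key, or null if there was none. */
    V put(K key, V value) {
        int i = indexFor(key);
        for (int probes = 0; probes < capacity; probes++, i = (i + 1) % capacity) {
            if (keys[i] == null) {       // free slot: store the new pair
                keys[i] = key;
                values[i] = value;
                count++;
                return null;
            }
            if (keys[i].equals(key)) {   // compare Key with Key, never Key with int
                V old = values[i];
                values[i] = value;
                return old;
            }
        }
        throw new IllegalStateException("table is full");
    }

    V get(K key) {
        int i = indexFor(key);
        for (int probes = 0; probes < capacity; probes++, i = (i + 1) % capacity) {
            if (keys[i] == null) return null;        // empty slot ends the probe
            if (keys[i].equals(key)) return values[i];
        }
        return null;
    }

    int size() { return count; }
}
```

Note that the raw hashCode() can be negative or larger than the array, which is why it is reduced with floorMod before being used as an index.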

Filtering and transforming a collection using Google Guava

Imagine the following object
class Trip {
String name;
int numOfTravellers;
DateMidnight from;
DateMidnight too;
}
I have written a manual recursive filter-and-transform method in Java. However, I think this could be written more elegantly using Google Guava.
Can someone help me out and tell me how I can rewrite this to make it more readable?
Basically, this method locates equal entries and combines them by altering the date fields.
List<Trip> combineEqual(List<Trip> list) {
int n = list.size() - 1;
for (int i = n; i >= 0; i--) {
for (int j = n; j >= 0; j--) {
if (i == j) {
continue;
}
if (shouldCombineEqual(list.get(i), list.get(j))) {
Trip combined = combine(list.get(i), list.get(j));
list.remove(i);
list.remove(j);
list.add(Math.min(i, j), combined);
return combineEqual(list);
}
}
}
return list;
}
private boolean shouldCombineEqual(Trip a, Trip b) {
return shouldCombineWith(a, b) || shouldCombineWith(b, a);
}
private boolean shouldCombineWith(Trip a, Trip b) {
return a.too() != null
&& a.too().plusDays(1).equals(b.from)
&& areEqual(a, b);
}
private boolean areEqual(Trip a, Trip b) {
return equal(a.name,b.name) && equal(a.numOfTravellers, b.numOfTravellers);
}
private boolean equal(Object a, Object b) {
return a == null && b == null || a != null && a.equals(b);
}
private Trip combineEqual(Trip a, Trip b) {
Trip copy = copy(a); //Just a copy method
if (a.from.isAfter(b.from)) {
Trip tmp = a;
a = b;
b = tmp;
} // a is now the one with the earliest too date
copy.from = a.from;
copy.too = b.too;
return copy;
}
I don't think Guava can help much here. There's a lot you can improve without it:
Create a TripKey {String name; int numOfTravellers;}, define equals, and use it instead of your misnamed areEqual. Split your trips into lists by their keys - here ListMultimap<TripKey, Trip> can help.
For each key, sort the corresponding list according to from. Try to combine each trip with all following trips. If it gets combined, restart the inner loop only. This should be already much clearer (and faster) than your solution... so I stop here.
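The grouping approach could look roughly like the following. This is a sketch under several assumptions, not a drop-in replacement: it uses java.time.LocalDate instead of the DateMidnight type from the question, a plain LinkedHashMap with a List as the composite key instead of a TripKey class and Guava's ListMultimap, an invented Trip constructor, and it assumes from is never null.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TripMerger {
    static final class Trip {
        final String name;
        final int numOfTravellers;
        final LocalDate from;
        final LocalDate too; // field name kept from the question

        Trip(String name, int numOfTravellers, LocalDate from, LocalDate too) {
            this.name = name;
            this.numOfTravellers = numOfTravellers;
            this.from = from;
            this.too = too;
        }
    }

    static List<Trip> combineEqual(List<Trip> trips) {
        // Group by (name, numOfTravellers); a List serves as a simple composite key
        Map<List<Object>, List<Trip>> groups = new LinkedHashMap<>();
        for (Trip t : trips) {
            groups.computeIfAbsent(Arrays.<Object>asList(t.name, t.numOfTravellers),
                                   k -> new ArrayList<>()).add(t);
        }
        List<Trip> result = new ArrayList<>();
        for (List<Trip> group : groups.values()) {
            group.sort(Comparator.comparing((Trip t) -> t.from));
            for (int i = 0; i < group.size(); i++) {
                Trip current = group.get(i);
                int j = i + 1;
                while (j < group.size()) {
                    Trip next = group.get(j);
                    if (current.too != null && current.too.plusDays(1).equals(next.from)) {
                        // next starts the day after current ends: extend current
                        current = new Trip(current.name, current.numOfTravellers,
                                           current.from, next.too);
                        group.remove(j);
                        j = i + 1; // restart the inner loop only
                    } else {
                        j++;
                    }
                }
                result.add(current);
            }
        }
        return result;
    }
}
```

Because each group is sorted by from, a trip only needs to be checked against the trips after it, which is what removes the need for the recursive restart over the whole list.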
I'd just use a HashSet.
First define equals and hashCode in your Trip object. Add the first list to the set. Then iterate through the second list, checking whether a matching trip is already in the set. Something like:
public static Set<Trip> combineEquals(List<Trip> l1, List<Trip> l2) {
    Set<Trip> trips = new HashSet<>(l1);
    for (Trip t : l2) {
        if (trips.contains(t)) {
            // combine what's in the set with t
        } else {
            trips.add(t);
        }
    }
    return trips;
}
