Hash by Chaining vs. Double Probing - Java
I'm trying to compare chaining and double probing.
I need to insert 40 integers into a table of size 100.
When I measure the time with System.nanoTime() (in Java), I find that double probing is faster.
I think that's because the insert method of chaining creates a new LinkedListEntry every time, and that allocation adds time.
How can it be that chaining is faster than double probing? (That's what I read on Wikipedia.)
Thanks!
This is the chaining code:
public class LastChain {
    int tableSize;
    Node[] st;

    LastChain(int size) {
        tableSize = size;
        st = new Node[tableSize]; // array slots default to null
    }

    private class Node {
        int key;
        Node next;

        Node(int key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    public void put(Integer key) {
        int i = hash(key);
        Node first = st[i];
        for (Node x = st[i]; x != null; x = x.next)
            if (key.equals(x.key))
                return; // key already present
        st[i] = new Node(key, first); // prepend the new node to the bucket
    }

    private int hash(int key) {
        return key % tableSize; // assumes non-negative keys
    }
}
And this is the relevant code for double probing:
public class HashDouble1 {
    private Integer[] hashArray;
    private int arraySize;
    private Integer bufItem; // sentinel for deleted items

    HashDouble1(int size) {
        arraySize = size;
        hashArray = new Integer[arraySize];
        bufItem = -1;
    }

    public int hashFunc1(int key) {
        return key % arraySize;
    }

    public int hashFunc2(int key) {
        return 7 - key % 7; // step size in [1, 7], never zero
    }

    public void insert(Integer key) {
        int hashVal = hashFunc1(key);  // hash the key
        int stepSize = hashFunc2(key); // get step size
        // probe until an empty cell or a deleted (-1) cell
        // (note: this loops forever if the table is full)
        while (hashArray[hashVal] != null && hashArray[hashVal] != -1) {
            hashVal += stepSize;  // add the step
            hashVal %= arraySize; // wrap around
        }
        hashArray[hashVal] = key; // insert item
    }
}
Measured this way, insert for double probing is faster than for chaining.
How can I fix it?
Chaining works best at high load factors. Try using 90 strings (not a well-placed selection of integers) in a table of size 100.
Also, chaining is much easier to implement removal/deletion for; see the sketch below.
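For illustration, here is a sketch of what removal could look like in the LastChain class from the question (my addition, not the poster's code; it simply unlinks the first node with a matching key):

    // Sketch: remove a key from its bucket by unlinking the matching node.
    public void remove(Integer key) {
        int i = hash(key);
        Node prev = null;
        for (Node x = st[i]; x != null; prev = x, x = x.next) {
            if (key.equals(x.key)) {
                if (prev == null)
                    st[i] = x.next;   // matching node is the bucket head
                else
                    prev.next = x.next; // bypass the matching node
                return;
            }
        }
    }

With open addressing, by contrast, removal needs tombstones (the -1 sentinel in the double probing code) so that later probes don't stop early.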
Note: in HashMap, an Entry object is created whether it is chained or not, so there is no saving there.
Java has the special "feature" that objects take up a lot of memory.
Thus, for large data sets (where any of this has relevance) double probing will do well.
But as a very first thing, please change your Integer[] into an int[]: the memory usage will be roughly a quarter of what it was, and performance will jump nicely.
But as always with performance questions: measure, measure, measure, as your case will always be special.
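To illustrate the int[] suggestion, here is a rough sketch of what the insert from the question could look like with primitives (my sketch, untested; it assumes keys are positive, so 0 can mark an empty slot and -1 a deleted one):

    // Sketch of an int[]-based variant of the double probing insert.
    // Assumes positive keys: 0 = empty slot, -1 = deleted slot.
    public class HashDoublePrimitive {
        private final int[] hashArray;
        private final int arraySize;

        HashDoublePrimitive(int size) {
            arraySize = size;
            hashArray = new int[arraySize]; // all slots start at 0 (empty)
        }

        public void insert(int key) {
            int hashVal = key % arraySize;
            int stepSize = 7 - key % 7; // same step function as above
            while (hashArray[hashVal] != 0 && hashArray[hashVal] != -1) {
                hashVal = (hashVal + stepSize) % arraySize;
            }
            hashArray[hashVal] = key;
        }
    }

This avoids all Integer boxing on the probe path, which is where most of the time goes in a microbenchmark like this.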
Related
Implementing Union-Find Algorithm for Kruskal's Algorithm to find Minimum Spanning Tree in Java
I am trying to solve the following Leetcode problem (https://leetcode.com/problems/connecting-cities-with-minimum-cost), and my approach is to figure out the total weight of the minimum spanning tree (MST) from the input graph using Kruskal's Algorithm using the Union-Find data structure. However, my code online passes 51/63 of the test cases, returning the incorrect result on the following test case, which is too hard to debug, since the input graph is too large. 50 [[2,1,22135],[3,1,13746],[4,3,37060],[5,2,48513],[6,3,49607],[7,1,97197],[8,2,95909],[9,2,82668],[10,2,48372],[11,4,17775],[12,2,6017],[13,1,51409],[14,2,12884],[15,7,98902],[16,14,52361],[17,8,11588],[18,12,86814],[19,17,49581],[20,4,41808],[21,11,77039],[22,10,80279],[23,16,61659],[24,12,89390],[25,24,10042],[26,12,78278],[27,15,30756],[28,6,2883],[29,8,3478],[30,7,29321],[31,12,47542],[32,20,35806],[33,3,26531],[34,12,16321],[35,27,82484],[36,7,55920],[37,24,21253],[38,23,90537],[39,7,83795],[40,36,70353],[41,34,76983],[42,14,63416],[43,15,39590],[44,9,86794],[45,3,31968],[46,19,32695],[47,17,40287],[48,1,27993],[49,12,86349],[50,11,52080],[17,27,65829],[42,45,87517],[14,23,96130],[5,50,3601],[10,17,2017],[26,44,4118],[26,29,93146],[1,9,56934],[22,43,5984],[3,22,13404],[13,28,66475],[11,14,93296],[16,44,71637],[7,37,88398],[7,29,56056],[2,34,79170],[40,44,55496],[35,46,14494],[32,34,25143],[28,36,59961],[10,49,58317],[8,38,33783],[8,28,19762],[34,41,69590],[27,37,26831],[15,23,53060],[5,11,7570],[20,42,98814],[18,34,96014],[13,43,94702],[1,46,18873],[44,45,43666],[22,40,69729],[4,25,28548],[8,46,19305],[15,22,39749],[33,48,43826],[14,15,38867],[13,22,56073],[3,46,51377],[13,15,73530],[6,36,67511],[27,38,76774],[6,21,21673],[28,49,72219],[40,50,9568],[31,37,66173],[14,29,93641],[4,40,87301],[18,46,41318],[2,8,25717],[1,7,3006],[9,22,85003],[14,45,33961],[18,28,56248],[1,31,10007],[3,24,23971],[6,28,24448],[35,39,87474],[10,50,3371],[7,18,26351],[19,41,86238],[3,8,73207],[11,34,75438],[3,47,35394],[27,32,69991],[6,40,87955],[2,18,85693],[5,37,50456],[8,20,59182],[16,38,58363],[9,39,58494],[39,43,73017],[10,15,88526],[16,23,48361],[4,28,59995],[2,3,66426],[6,17,29387],[15,38,80738],[12,43,63014],[9,11,90635],[12,20,36051],[13,25,1515],[32,40,72665],[10,40,85644],[13,40,70642],[12,24,88771],[14,46,79583],[30,49,45432],[21,34,95097],[25,48,96934],[2,35,79611],[9,26,71147],[11,37,57109],[35,36,67266],[42,43,15913],[3,30,44704],[4,32,46266],[5,10,94508],[31,39,45742],[12,25,56618],[10,45,79396],[15,28,78005],[19,32,94010],[36,46,4417],[6,35,7762],[10,13,12161],[49,50,60013],[20,23,6891],[9,50,63893],[35,43,74832],[10,24,3562],[6,8,47831],[29,32,82689],[7,47,71961],[14,41,82402],[20,33,38732],[16,26,24131],[17,34,96267],[21,46,81067],[19,47,41426],[13,24,68768],[1,25,78243],[2,27,77645],[11,25,96335],[31,45,30726],[43,44,34801],[3,42,22953],[12,23,34898],[37,43,32324],[18,44,18539],[8,13,59737],[28,37,67994],[13,14,25013],[22,41,25671],[1,6,57657],[8,11,83932],[42,48,24122],[4,15,851],[9,29,70508],[7,32,53629],[3,4,34945],[2,32,64478],[7,30,75022],[14,19,55721],[20,22,84838],[22,25,6103],[8,49,11497],[11,32,22278],[35,44,56616],[12,49,18681],[18,43,56358],[24,43,13360],[24,47,59846],[28,43,36311],[17,25,63309],[1,14,30207],[39,48,22241],[13,26,94146],[4,33,62994],[40,48,32450],[8,19,8063],[20,29,56772],[10,27,21224],[24,30,40328],[44,46,48426],[22,45,39752],[6,43,96892],[2,30,73566],[26,36,43360],[34,36,51956],[18,20,5710],[7,22,72496],[3,39,9207],[15,30,39474],[11,35,82661],[12,50,84860],[14,26,25992],[16,39,33166],[25,41,11721],[1
9,40,68623],[27,28,98119],[19,43,3644],[8,16,84611],[33,42,52972],[29,36,60307],[9,36,44224],[9,48,89857],[25,26,21705],[29,33,12562],[5,34,32209],[9,16,26285],[22,37,80956],[18,35,51968],[37,49,36399],[18,42,37774],[1,30,24687],[23,43,55470],[6,47,69677],[21,39,6826],[15,24,38561]]
I'm having trouble understanding why my code will fail a test case, since I believe I am implementing the steps of Kruskal's Algorithm properly:
Sorting the connections in increasing order of weight.
Building the MST by going through each connection in the sorted list and selecting that connection if it does not result in a cycle in the MST.
Below is my Java code:

class UnionFind {
    // parents[i] = parent node of node i.
    // If a node is the root node of a component, we define its parent
    // to be itself.
    int[] parents;

    public UnionFind(int n) {
        this.parents = new int[n];
        for (int i = 0; i < n; i++) {
            this.parents[i] = i;
        }
    }

    // Merges two nodes into the same component.
    public void union(int node1, int node2) {
        int node1Component = find(node1);
        int node2Component = find(node2);
        this.parents[node1Component] = node2Component;
    }

    // Returns the component that a node is in.
    public int find(int node) {
        while (this.parents[node] != node) {
            node = this.parents[node];
        }
        return node;
    }
}

class Solution {
    public int minimumCost(int n, int[][] connections) {
        UnionFind uf = new UnionFind(n + 1);

        // Sort edges by increasing cost.
        Arrays.sort(connections, new Comparator<int[]>() {
            @Override
            public int compare(final int[] a1, final int[] a2) {
                return a1[2] - a2[2];
            }
        });

        int edgeCount = 0;
        int connectionIndex = 0;
        int weight = 0;

        // Greedy algorithm: Choose the edge with the smallest weight
        // which does not form a cycle. We know that an edge between
        // two nodes will result in a cycle if those nodes are already
        // in the same component.
        for (int i = 0; i < connections.length; i++) {
            int[] connection = connections[i];
            int nodeAComponent = uf.find(connection[0]);
            int nodeBComponent = uf.find(connection[1]);
            if (nodeAComponent != nodeBComponent) {
                weight += connection[2];
                edgeCount++;
            }
            if (edgeCount == n - 1) {
                break;
            }
        }

        // MST, by definition, must have (n - 1) edges.
        if (edgeCount == n - 1) {
            return weight;
        }
        return -1;
    }
}
As @geobreze stated, I forgot to unite the components (disjoint sets) of node A and node B. Below is the corrected code:

if (nodeAComponent != nodeBComponent) {
    uf.union(nodeAComponent, nodeBComponent);
    weight += connection[2];
    edgeCount++;
}
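As a side note (an optional optimization, not part of the original fix): the find above walks the full parent chain on every call, which can degrade to O(n) on long chains. A common refinement is path compression; a sketch:

    // find with path compression: after locating the root, point every
    // visited node directly at it, flattening the chain for future lookups.
    public int find(int node) {
        int root = node;
        while (this.parents[root] != root) {
            root = this.parents[root];
        }
        while (this.parents[node] != root) {
            int next = this.parents[node];
            this.parents[node] = root;
            node = next;
        }
        return root;
    }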
Improving the speed of the HashMap
I just coded my own implementation of a HashMap with open addressing; the key type is int and the value type is long. But it works more slowly than the existing Java implementation, even when I just add new values. What is the way to make it faster?

public class MyHashMap {
    private int maxPutedId = 0;
    private int size;
    private int[] keys;
    private long[] values;
    private boolean[] change;

    public MyHashMap(int size) {
        this.size = size;
        keys = new int[size];
        values = new long[size];
        change = new boolean[size];
    }

    public MyHashMap() {
        this.size = 100000;
        keys = new int[size];
        values = new long[size];
        change = new boolean[size];
    }

    public boolean put(int key, long value) {
        int k = 0;
        boolean search = true;
        for (int i = 0; i < maxPutedId + 2; i++) {
            if (search && !change[i] && keys[i] == 0 && values[i] == 0) {
                k = i;
                search = false;
            }
            if (change[i] && keys[i] == key) {
                return false;
            }
        }
        keys[k] = key;
        values[k] = value;
        change[k] = true;
        maxPutedId = k;
        return true;
    }

    public Long get(int key) {
        for (int i = 0; i < size; i++) {
            if (change[i] && keys[i] == key) {
                return values[i];
            }
        }
        return null;
    }

    public int size() {
        int s = 0;
        for (boolean x : change) {
            if (x) s++;
        }
        return s;
    }
}
You have not implemented a hash table; there is no hashing going on. For example, your get() method is doing a linear traversal through the key array. A hash table implementation is supposed to be able to compute the array entry where the key is most likely to be found (and will in fact be found if it exists and there were no hash collisions). A simple hash table would look like this: we first compute a hash from the key. Then we look at that slot in the table. Ideally, that's where the key will be found. However, if the key is not there, it could be due to collisions, so then we scan (assuming open addressing) looking for the key in subsequent slots - until we've looked through the whole table or found an unoccupied slot. I wrote 'get' since it seemed simpler :-) This is 'off the top of my head' code so you will need to check it carefully.

Long get(int key) {
    int h = hash(key);
    // look in principal location for this key
    if (change[h] && keys[h] == key)
        return values[h];
    // nope, scan table (wrapping around at the end)
    // and stop when we have found the key, scanned
    // the whole table, or met an empty slot
    int h0 = h; // save original position
    while ((h = (h + 1) % size) != h0 && change[h])
        if (keys[h] == key)
            return values[h];
    return null;
}

I probably should have written 'put' first to be more instructive. The hash function, for int keys, could be computed as key % size. Whether that's a good hash depends on the distribution of your keys; you want a hash that avoids collisions.
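Picking up that suggestion, here is what a matching put might look like - a sketch under the same assumptions as the get above (non-negative int keys hashed with key % size, linear probing, the same keys/values/change fields), not tested code:

    // Sketch only: insert or update a key using the same probing scheme as get().
    boolean put(int key, long value) {
        int h = key % size;            // principal location for this key
        int h0 = h;                    // remember where we started
        while (change[h]) {            // slot occupied: probe onwards
            if (keys[h] == key) {
                values[h] = value;     // key already present: overwrite
                return false;
            }
            h = (h + 1) % size;        // next slot, wrapping around
            if (h == h0) return false; // scanned the whole table: it is full
        }
        keys[h] = key;                 // free slot found: insert
        values[h] = value;
        change[h] = true;
        return true;
    }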
How to make a range tree implementation thread safe
I've implemented a range tree which supports updates in the form of incrementing or decrementing the count of a specific value. It can also query the number of values lower than or equal to the value provided. The range tree has been tested to work in a single-threaded environment; however, I would like to know how to modify the implementation so that it can be updated and queried concurrently. I know a simple solution would be to synchronise the methods that access this tree, but I would like to know if there are ways to make RangeTree thread safe by itself with minimal effect on performance.

public class RangeTree {
    public static final int ROOT_NODE = 0;

    private int[] count;
    private int[] min;
    private int[] max;
    private int levels;
    private int lastLevelSize;

    public RangeTree(int maxValue) {
        levels = 1;
        lastLevelSize = 1;
        while (lastLevelSize <= maxValue) {
            levels++;
            lastLevelSize = lastLevelSize << 1;
        }
        int alloc = lastLevelSize * 2;
        count = new int[alloc];
        min = new int[alloc];
        max = new int[alloc];
        int step = lastLevelSize;
        int pointer = ROOT_NODE;
        for (int i = 0; i < levels; i++) {
            int current = 0;
            while (current < lastLevelSize) {
                min[pointer] = current;
                max[pointer] = current + step - 1;
                current += step;
                pointer++;
            }
            step = step >> 1;
        }
    }

    public void register(int value) {
        int index = lastLevelSize - 1 + value;
        count[index]++;
        walkAndRefresh(index);
    }

    public void unregister(int value) {
        int index = lastLevelSize - 1 + value;
        count[index]--;
        walkAndRefresh(index);
    }

    private void walkAndRefresh(int node) {
        int currentNode = node;
        while (currentNode != ROOT_NODE) {
            currentNode = (currentNode - 1) >> 1;
            count[currentNode] = count[currentNode * 2 + 1] + count[currentNode * 2 + 2];
        }
    }

    public int countLesserOrEq(int value) {
        return countLesserOrEq0(value, ROOT_NODE);
    }

    private int countLesserOrEq0(int value, int node) {
        if (max[node] <= value) {
            return count[node];
        } else if (min[node] > value) {
            return 0;
        }
        return countLesserOrEq0(value, node * 2 + 1)
                + countLesserOrEq0(value, node * 2 + 2);
    }
}
Louis Wasserman is right, this is a difficult question, but it may have a simple solution. Depending on your update/read ratio and the contention on the data, it may be useful to use a ReadWriteLock instead of synchronized. Another solution, which may be efficient in some cases (it depends on your workload), is to copy the whole RangeTree object before each update and then switch the reference to the 'actual' RangeTree, the way it is done in CopyOnWriteArrayList. But this gives up atomic consistency and leads to eventual consistency.
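To make the ReadWriteLock suggestion concrete, here is a minimal sketch of a wrapper (the ConcurrentRangeTree name and structure are mine, purely for illustration):

    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Sketch: queries share the read lock; updates take the exclusive write lock.
    public class ConcurrentRangeTree {
        private final RangeTree tree;
        private final ReadWriteLock lock = new ReentrantReadWriteLock();

        public ConcurrentRangeTree(int maxValue) {
            this.tree = new RangeTree(maxValue);
        }

        public void register(int value) {
            lock.writeLock().lock();
            try {
                tree.register(value);
            } finally {
                lock.writeLock().unlock();
            }
        }

        public void unregister(int value) {
            lock.writeLock().lock();
            try {
                tree.unregister(value);
            } finally {
                lock.writeLock().unlock();
            }
        }

        public int countLesserOrEq(int value) {
            lock.readLock().lock();
            try {
                return tree.countLesserOrEq(value);
            } finally {
                lock.readLock().unlock();
            }
        }
    }

Whether this beats plain synchronized depends on how read-heavy the workload is; under heavy writes the write lock serializes everything anyway.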
Fixed-size collection that keeps top (N) values in Java
I need to keep the top N (< 1000) integers while adding values from a big list of integers (a lazy list around a million elements in size). I want to try adding values to a collection, but it needs to keep only the top N (highest) integers. Is there any preferred data structure to use for this purpose?
I'd suggest using a sorted data structure, such as TreeSet. Before insertion, check the number of items in the set; once it has reached 1000, remove the smallest number if it's smaller than the newly added number, and add the new number. (Note that a TreeSet is a set, so duplicate values are only stored once.)

TreeSet<Integer> set = ...;

public void add(int n) {
    if (set.size() < 1000) {
        set.add(n);
    } else {
        Integer first = set.first();
        if (first.intValue() < n) {
            set.pollFirst();
            set.add(n);
        }
    }
}
Google Guava has the MinMaxPriorityQueue class. You can also use custom ordering by supplying a comparator (use the orderedBy(Comparator<B> comparator) method). Note: this collection is NOT a sorted collection; see the javadoc. Example:

@Test
public void test() {
    final int maxSize = 5;
    // Natural order
    final MinMaxPriorityQueue<Integer> queue = MinMaxPriorityQueue
            .maximumSize(maxSize).create();
    queue.addAll(Arrays.asList(10, 30, 60, 70, 20, 80, 90, 50, 100, 40));
    assertEquals(maxSize, queue.size());
    assertEquals(new Integer(50), Collections.max(queue));
    System.out.println(queue);
}

Output: [10, 50, 40, 30, 20]
One efficient solution is a slightly tweaked array-based priority queue using a binary min-heap. The first N integers are simply added to the heap one by one, or you can build it from an array of the first N integers (slightly faster). After that, compare each incoming integer with the root element (which is the MIN value found so far). If the new integer is larger than that, simply replace the root with this new integer and perform a down-heap operation (i.e. trickle the new integer down until both its children are smaller or it becomes a leaf). The data structure guarantees you will always have the N largest integers so far, with an average addition time of O(log N).

Here is my C# implementation; the mentioned method is named "EnqueueDown". The "EnqueueUp" is a standard enqueue operation that expands the array, adds a new leaf and trickles it up. I have tested it on 1M numbers with a max heap size of 1000 and it runs under 200 ms:

namespace ImagingShop.Research.FastPriorityQueue
{
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.CompilerServices;

    public sealed class FastPriorityQueue<T> : IEnumerable<Tuple<T, float>>
    {
        private readonly int capacity;
        private readonly Tuple<T, float>[] nodes;
        private int count = 0;

        public FastPriorityQueue(int capacity)
        {
            this.capacity = capacity;
            this.nodes = new Tuple<T, float>[capacity];
        }

        public int Capacity => this.capacity;
        public int Count => this.count;
        public T FirstNode => this.nodes[0].Item1;
        public float FirstPriority => this.nodes[0].Item2;

        public void Clear()
        {
            this.count = 0;
        }

        public bool Contains(T node) => this.nodes.Any(tuple => Equals(tuple.Item1, node));

        public T Dequeue()
        {
            T nodeHead = this.nodes[0].Item1;
            int index = (this.count - 1);
            this.nodes[0] = this.nodes[index];
            this.count--;
            DownHeap(index);
            return nodeHead;
        }

        public void EnqueueDown(T node, float priority)
        {
            if (this.count == this.capacity)
            {
                if (priority < this.nodes[0].Item2)
                {
                    return;
                }
                this.nodes[0] = Tuple.Create(node, priority);
                DownHeap(0);
                return;
            }
            int index = this.count;
            this.count++;
            this.nodes[index] = Tuple.Create(node, priority);
            UpHeap(index);
        }

        public void EnqueueUp(T node, float priority)
        {
            int index = this.count;
            this.count++;
            this.nodes[index] = Tuple.Create(node, priority);
            UpHeap(index);
        }

        public IEnumerator<Tuple<T, float>> GetEnumerator()
        {
            for (int i = 0; i < this.count; i++)
                yield return this.nodes[i];
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void DownHeap(int index)
        {
            while (true)
            {
                int indexLeft = (index << 1);
                int indexRight = (indexLeft | 1);
                int indexMin = ((indexLeft < this.count) && (this.nodes[indexLeft].Item2 < this.nodes[index].Item2))
                    ? indexLeft
                    : index;
                if ((indexRight < this.count) && (this.nodes[indexRight].Item2 < this.nodes[indexMin].Item2))
                {
                    indexMin = indexRight;
                }
                if (indexMin == index)
                {
                    break;
                }
                Flip(index, indexMin);
                index = indexMin;
            }
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void Flip(int indexA, int indexB)
        {
            var temp = this.nodes[indexA];
            this.nodes[indexA] = this.nodes[indexB];
            this.nodes[indexB] = temp;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void UpHeap(int index)
        {
            while (true)
            {
                if (index == 0)
                {
                    break;
                }
                int indexParent = (index >> 1);
                if (this.nodes[indexParent].Item2 <= this.nodes[index].Item2)
                {
                    break;
                }
                Flip(index, indexParent);
                index = indexParent;
            }
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}

The basic implementation is taken from "Cormen, Thomas H. Introduction to Algorithms. MIT Press, 2009."
In Java 1.7 one may use java.util.PriorityQueue. To keep the top N items, keep the queue in ascending (natural) order: the smallest number is then always at the head and can be removed when there are too many items in the queue.

package eu.pawelsz.example.topn;

import java.util.Comparator;
import java.util.PriorityQueue;

public class TopN {

    public static <E extends Comparable<E>> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
        if (keep == priorityQueue.size()) {
            // queue is full: only make room if the new element
            // is actually larger than the current minimum at the head
            if (element.compareTo(priorityQueue.peek()) <= 0) {
                return;
            }
            priorityQueue.poll();
        }
        priorityQueue.add(element);
    }

    public static void main(String[] args) {
        int N = 4;
        PriorityQueue<Integer> topN = new PriorityQueue<>(N, new Comparator<Integer>() {
            @Override
            public int compare(Integer o1, Integer o2) {
                return o1 - o2; // ascending order: min-heap
            }
        });
        add(N, topN, 1);
        add(N, topN, 2);
        add(N, topN, 3);
        add(N, topN, 4);
        System.out.println("smallest: " + topN.peek());
        add(N, topN, 8);
        System.out.println("smallest: " + topN.peek());
        add(N, topN, 5);
        System.out.println("smallest: " + topN.peek());
        add(N, topN, 2);
        System.out.println("smallest: " + topN.peek());
    }
}
// This keeps the top-most K instances in the queue (min-heap at the head):
public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
    priorityQueue.add(element);      // size may now be keep + 1
    if (priorityQueue.size() > keep) {
        priorityQueue.poll();        // evict the smallest, resized back to keep
    }
}
The fastest way is likely a simple array

Item[] items = new Item[N];

and a revolving cursor

int cursor = 0;

The cursor points to the insertion point of the next element. To add a new element, use the method

void put(Item newItem) {
    items[cursor++] = newItem;
    if (cursor == N) cursor = 0;
}

When accessing this structure you can make the last item added appear at index 0 via a small recalculation of the index:

Item get(int index) {
    return items[cursor > index ? cursor - index - 1 : cursor - index - 1 + N];
}

(The -1 is because the cursor always points at the next insertion point, i.e. cursor - 1 is the last element added.)

Summary: put(item) will add a new item, get(0) will get the last item added, get(1) will get the second last item, etc. In case you need to take care of the case where n < N elements have been added, you just need to check for null. (TreeSets will likely be slower.)
Your question is answered here: Size-limited queue that holds last N elements in Java. To summarize it: no, there is no such data structure in the default Java SDK, but Apache Commons Collections 4 has a CircularFifoQueue.
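For illustration, a minimal usage sketch of CircularFifoQueue (note it keeps the last N elements added, not the top N by value):

    import org.apache.commons.collections4.queue.CircularFifoQueue;

    public class LastNDemo {
        public static void main(String[] args) {
            // Keeps only the 3 most recently added elements;
            // the oldest is evicted automatically when full.
            CircularFifoQueue<Integer> queue = new CircularFifoQueue<>(3);
            queue.add(1);
            queue.add(2);
            queue.add(3);
            queue.add(4);              // evicts 1
            System.out.println(queue); // [2, 3, 4]
        }
    }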
Memory Choke on Branch And Bound Knapsack Implementation
I wrote this implementation of the branch and bound knapsack algorithm based on the pseudo-Java code from here. Unfortunately, it's memory choking on large instances of the problem, like this. Why is this? How can I make this implementation more memory efficient?

The input in the file on the link is formatted this way:

numberOfItems maxWeight
profitOfItem1 weightOfItem1
...
profitOfItemN weightOfItemN

// http://books.google.com/books?id=DAorddWEgl0C&pg=PA233&source=gbs_toc_r&cad=4#v=onepage&q&f=true
import java.util.Comparator;
import java.util.LinkedList;
import java.util.PriorityQueue;

class ItemComparator implements Comparator {
    public int compare(Object item1, Object item2) {
        Item i1 = (Item) item1;
        Item i2 = (Item) item2;
        if ((i1.valueWeightQuotient) < (i2.valueWeightQuotient))
            return 1;
        if ((i2.valueWeightQuotient) < (i1.valueWeightQuotient))
            return -1;
        else { // valueWeightQuotients are equal
            if ((i1.weight) < (i2.weight)) {
                return 1;
            }
            if ((i2.weight) < (i1.weight)) {
                return -1;
            }
        }
        return 0;
    }
}

class Node {
    int level;
    int profit;
    int weight;
    double bound;
}

class NodeComparator implements Comparator {
    public int compare(Object o1, Object o2) {
        Node n1 = (Node) o1;
        Node n2 = (Node) o2;
        if ((n1.bound) < (n2.bound))
            return 1;
        if ((n2.bound) < (n1.bound))
            return -1;
        return 0;
    }
}

class Solution {
    long weight;
    long value;
}

public class BranchAndBound {

    static Solution branchAndBound2(LinkedList<Item> items, double W) {
        double timeStart = System.currentTimeMillis();
        int n = items.size();
        int[] p = new int[n];
        int[] w = new int[n];
        for (int i = 0; i < n; i++) {
            p[i] = (int) items.get(i).value;
            w[i] = (int) items.get(i).weight;
        }

        Node u;
        Node v = new Node(); // tree root
        int maxProfit = 0;
        int usedWeight = 0;
        NodeComparator nc = new NodeComparator();
        PriorityQueue<Node> PQ = new PriorityQueue<Node>(n, nc);
        v.level = -1;
        v.profit = 0;
        v.weight = 0; // v initialized to -1, dummy root
        v.bound = bound(v, W, n, w, p);
        PQ.add(v);

        while (!PQ.isEmpty()) {
            v = PQ.poll();
            u = new Node();
            if (v.bound > maxProfit) { // check if node is still promising
                u.level = v.level + 1; // set u to the child that includes the next item
                u.weight = v.weight + w[u.level];
                u.profit = v.profit + p[u.level];
                if (u.weight <= W && u.profit > maxProfit) {
                    maxProfit = u.profit;
                    usedWeight = u.weight;
                }
                u.bound = bound(u, W, n, w, p);
                if (u.bound > maxProfit) {
                    PQ.add(u);
                }
                u = new Node();
                u.level = v.level + 1;
                u.weight = v.weight; // set u to the child that does not include the next item
                u.profit = v.profit;
                u.bound = bound(u, W, n, w, p);
                if (u.bound > maxProfit)
                    PQ.add(u);
            }
        }

        Solution solution = new Solution();
        solution.value = maxProfit;
        solution.weight = usedWeight;
        double timeStop = System.currentTimeMillis();
        double elapsedTime = timeStop - timeStart;
        System.out.println("* Time spent in branch and bound (milliseconds): " + elapsedTime);
        return solution;
    }

    static double bound(Node u, double W, int n, int[] w, int[] p) {
        int j = 0;
        int k = 0;
        int totWeight = 0;
        double result = 0;
        if (u.weight >= W)
            return 0;
        else {
            result = u.profit;
            totWeight = u.weight;
            if (u.level < w.length) { // guard against running past the last item
                j = u.level + 1;
            }
            int weightSum;
            while ((j < n) && ((weightSum = totWeight + w[j]) <= W)) {
                totWeight = weightSum; // grab as many items as possible
                result = result + p[j];
                j++;
            }
            k = j; // use k for consistency with formula in text
            if (k < n) {
                result = result + ((W - totWeight) * p[k] / w[k]); // grab fraction of excluded kth item
            }
            return result;
        }
    }
}
I got a slightly speedier implementation by taking away all the generic Collection instances and using plain arrays instead.
Not sure whether you still need insight into the algorithm or whether your tweaks have solved your problem, but with a breadth-first branch and bound algorithm like the one you've implemented there's always going to be the potential for a memory usage problem. You're hoping, of course, that you'll be able to rule out a sufficient number of branches as you go along to keep the number of nodes in your priority queue relatively small, but in the worst-case scenario you could end up with as many nodes in memory as there are possible permutations of item selections in the knapsack. The worst-case scenario is, of course, highly unlikely, but for large problem instances even an average tree could end up populating your priority queue with millions of nodes.

If you're going to be throwing lots of unforeseen large problem instances at your code and need the peace of mind of knowing that no matter how many branches the algorithm has to consider you'll never run out of memory, I'd consider a depth-first branch and bound algorithm, like the Horowitz-Sahni algorithm outlined in section 2.5.1 of this book: http://www.or.deis.unibo.it/knapsack.html. For some problem instances this approach will be less efficient in terms of the number of possible solutions that have to be considered before the optimal one is found, but then again for some problem instances it will be more efficient - it really depends on the structure of the tree.
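To make the depth-first alternative concrete, here is a rough sketch of the idea (my illustration, not the Horowitz-Sahni algorithm itself nor the book's code; it assumes items are pre-sorted by decreasing value/weight ratio, as in the question's code, and reuses the same fractional upper bound):

    static int best = 0; // incumbent: best profit found so far

    // Depth-first branch and bound: memory is O(n) for the recursion stack
    // instead of a priority queue that can grow to millions of nodes.
    static void dfs(int level, int profit, int weight, int W, int n, int[] w, int[] p) {
        if (weight > W) return;           // infeasible branch
        if (profit > best) best = profit; // update incumbent
        if (level == n) return;           // no items left
        if (upperBound(level, profit, weight, W, n, w, p) <= best) return; // prune
        dfs(level + 1, profit + p[level], weight + w[level], W, n, w, p);  // take item
        dfs(level + 1, profit, weight, W, n, w, p);                        // skip item
    }

    // Fractional (LP) upper bound over the items from `level` onward.
    static double upperBound(int level, int profit, int weight, int W, int n, int[] w, int[] p) {
        double result = profit;
        int tot = weight;
        int j = level;
        while (j < n && tot + w[j] <= W) {
            tot += w[j];
            result += p[j];
            j++;
        }
        if (j < n) {
            result += (double) (W - tot) * p[j] / w[j]; // fraction of the next item
        }
        return result;
    }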