I have a custom Task class which contains a priority value as well as some additional fields, shown below:
class Task {
    int ID;
    int Priority;
    int Time;

    public Task(int i, int p, int t) {
        this.ID = i;
        this.Priority = p;
        this.Time = t;
    }
    // Getters, etc.
}
These are stored in a max heap by priority, which works fine. However, if I want to find a Task object with a specific ID value, that has to be done in O(n) time due to the linear search (using a basic array of Tasks as a heap):
public int getTimeOfID(int taskID) {
    for (int i = 1; i <= heapSize; i++) {
        if (heap[i].getTaskID() == taskID) {
            return heap[i].getTimeLeft();
        }
    }
    return -1;
}
I've come across several references to a "modified heap" that could be used to improve the ID search to O(1) time, but I haven't found a concrete implementation example. Is this possible, and if so, how would I do it? A Java or pseudocode example would be greatly appreciated, but even just the name of a relevant data structure would help me begin my search. Thanks for any assistance.
EDIT: Additional code added as requested:
//initial variables
private Task[] heap;
private int heapSize, capacity;
int maxTasksHigh;

//Constructor
public PQ(int maxTasks) {
    this.capacity = maxTasks + 1;
    heap = new Task[this.capacity];
    heapSize = 0;
    maxTasksHigh = maxTasks;
}
//Addition
//Addition
public void add(int ID, int priority, int time) {
    Task newTask = new Task(ID, priority, time);
    heapSize++;
    int target = heapSize;
    heap[target] = newTask;
    reorder(target);
}
//etc.
What you can do is add a HashMap that maps an ID to its Task object in the max heap.
Then, when adding or removing an item, you also add or remove it from the HashMap<Integer, Task> (the key is Integer because the ID is an int). These operations take O(1) expected time, so they do not hurt the max heap's time complexity. With the HashMap alongside the max heap, you can check whether a given ID exists, and retrieve its Task, in O(1).
A word of caution: if you hand out a reference to an object inside the max heap through these extra methods, an outsider can change the item's value and thereby break the heap. Solve this by returning a deep clone of the object, or by making Task immutable.
Update after adding code:
Create a new HashMap<Integer, Task> member in the class and
initialize it in the constructor.
In the add method, call containsKey() with the given Task's ID. If the ID is absent, add the Task to both the max heap and the HashMap.
Update the logic of the other methods (removal, etc.) as needed.
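A minimal sketch of the combined structure (the class name, the sift-up details, and the duplicate-ID policy here are my assumptions, not part of the question's code):

```java
import java.util.HashMap;
import java.util.Map;

class Task {
    final int id, priority, time;
    Task(int id, int priority, int time) {
        this.id = id; this.priority = priority; this.time = time;
    }
}

public class IndexedPQ {
    private final Task[] heap;                   // 1-based max heap on priority
    private final Map<Integer, Task> byId = new HashMap<>();
    private int heapSize = 0;

    public IndexedPQ(int maxTasks) {
        heap = new Task[maxTasks + 1];           // index 0 unused
    }

    public void add(int id, int priority, int time) {
        if (byId.containsKey(id)) return;        // ignore duplicate IDs
        Task t = new Task(id, priority, time);
        heap[++heapSize] = t;
        byId.put(id, t);
        // sift up to restore the heap property
        int i = heapSize;
        while (i > 1 && heap[i / 2].priority < heap[i].priority) {
            Task tmp = heap[i / 2]; heap[i / 2] = heap[i]; heap[i] = tmp;
            i /= 2;
        }
    }

    // O(1) expected time instead of the O(n) scan
    public int getTimeOfID(int id) {
        Task t = byId.get(id);
        return (t == null) ? -1 : t.time;
    }
}
```

Removal needs the same treatment: whenever a Task leaves the heap, remove its entry from the map as well, or the map will serve stale references.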
Related
Hello,
My program spends a lot of time creating objects on the heap, so at a certain point I get this error:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
I can't put my full application in this discussion, so I created a prototype to explain what my program is doing.
The part of my program that deals with creating objects looks as follows:
**The calling program:**
public class Example {
public static void main(String[] args) {
ArrayList<Edge> list = new ArrayList<>();
for (int i = 0; i < 5000; i++) {
for (int j = 0; j < 5000; j++) {
if (i==j)
continue;
int weight = new Random().nextInt();
Edge edge = new Edge(new Vertex(i+""), new Vertex(j+""),weight);
list.add(edge);
}
}
}
}
The class Vertex:
public class Vertex {
private String sequence ;
public Vertex() {
}
public Vertex(String seq) {
this.sequence = seq;
}
}
The class Edge:
public class Edge {
private Vertex source;
private Vertex destination;
private int weight;
public Edge() {
}
public Edge(Vertex source, Vertex destination, int weight) {
int[][] array = new int[500][500];
//here i need the array to do some calculation
anotherClass.someCalculation(array,source,destination);
this.source = source;
this.destination = destination;
this.weight = weight;
}
}
So as you can see:
I have 5000 vertices and need to create 5000*5000 edges; each edge allocates an array of size 500*500.
For this reason the heap memory runs out at some point. The problem, as I understood from many discussions I read, is that there is no guarantee that the garbage collector will free memory.
So what is the solution to this problem? Normally I don't need the Edge's array after constructing the Edge; the array is only needed during construction.
Another question: how can I minimize the memory utilisation in my program? I tried turning the int array into a char array, but it didn't help.
Many thanks
Your program actually creates 2 * 5000 * 5000 Vertex objects, i.e. a fresh pair on each iteration of the inner (j) loop. You need to create the 5000 vertices first, keep references to them in an array, and reuse them when you create your edges.
I assume the "array" variable is only needed for the calculation. If that is true, use a local variable instead of an instance field. That way you don't keep one instance of that array per Edge (25 million of them); only one exists at a time while the calculation completes, which greatly reduces memory usage.
Redefinition of the Edge class:
public class Edge {
private Vertex source;
private Vertex destination;
private int weight;
//private int[][] array; // don't keep an instance copy of this array
//                       // if it is not needed beyond the calculation
public Edge() {
}
public Edge(Vertex source, Vertex destination, int weight) {
int[][] array = new int[500][500]; // array becomes GC-eligible
                                   // after the constructor completes
//here i need the array to do some calculation
anotherClass.someCalculation(array,source,destination);
this.source = source;
this.destination = destination;
this.weight = weight;
}
}
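Combining both suggestions, the calling loop might look like the sketch below (the Vertex and Edge classes are reduced to minimal stand-ins here, and the real calculation code from the question is omitted):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class Example {
    static class Vertex {
        final String sequence;
        Vertex(String seq) { this.sequence = seq; }
    }

    static class Edge {
        final Vertex source, destination;
        final int weight;
        Edge(Vertex source, Vertex destination, int weight) {
            this.source = source;
            this.destination = destination;
            this.weight = weight;
        }
    }

    static List<Edge> buildEdges(int n) {
        Random random = new Random();   // one shared Random, not one per edge

        // Create each Vertex exactly once and reuse it for every edge.
        Vertex[] vertices = new Vertex[n];
        for (int i = 0; i < n; i++) {
            vertices[i] = new Vertex(Integer.toString(i));
        }

        List<Edge> edges = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == j) continue;
                edges.add(new Edge(vertices[i], vertices[j], random.nextInt()));
            }
        }
        return edges;   // n Vertex objects total instead of roughly 2 * n * n
    }

    public static void main(String[] args) {
        System.out.println(buildEdges(100).size()); // 100 * 99 = 9900 edges
    }
}
```

With n = 5000 this creates 5000 Vertex objects instead of roughly 50 million.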
The problem with your code is the following.
Every time you create an object of the Edge class,
Edge edge = new Edge(new Vertex(i+""), new Vertex(j+""), weight);
the following constructor is invoked:
public Edge(Vertex source, Vertex destination, int weight)
Therefore, for each Edge object created, the following 500 x 500 integer array is allocated:
int[][] array = new int[500][500];
This consumes a lot of memory, because you are creating around 5000 x 5000 Edge objects.
So if you want to store edge weights, try making one global integer array for them.
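A sketch of that idea, assuming vertices are indexed 0..n-1 (the class and method names are illustrative, not from the question):

```java
public class WeightTable {
    // One shared n x n matrix for all edge weights, instead of one
    // 500 x 500 scratch array allocated per Edge construction.
    private final int[][] weights;

    public WeightTable(int n) {
        weights = new int[n][n];
    }

    public void setWeight(int from, int to, int w) {
        weights[from][to] = w;
    }

    public int getWeight(int from, int to) {
        return weights[from][to];
    }
}
```

For n = 5000 the matrix itself is 5000 * 5000 * 4 bytes, about 100 MB; large, but a single allocation rather than 25 million of them.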
It's important to know that an int element is 4 bytes, so a two-dimensional array defined as 500 x 500 is about 1 MB before any other data is stored. You're also creating 5000 x 5000 of them: just the arrays, with no other data, come to over 25 TB.
A "GC overhead limit exceeded" error occurs when the GC spends most of its time garbage collecting with little to show for it. That is to be expected here, given the sheer volume of array allocations the main thread churns through.
You should create and expand arrays as you need it (think of an ArrayList) or only create the arrays when you need them.
If you ABSOLUTELY need all the edges in memory you will either need to increase the heap or learn to serialize the results and load only what you need for calculations.
First of all, sorry if my English is bad; it's not my first language.
I'm working on an Android app project that needs to sort an ArrayList of objects, so I made this method to deal with that.
Let's say I have a Restaurant object that contains this data:
private String name;
private float distance ;
And I sort it using the value of the distance variable, from lowest to highest:
public void sort(RArrayList<RestaurantData> datas) {
RestaurantData tmp = new RestaurantData();
int swapped;
boolean b = true;
while (b) {
swapped = 0;
for (int i = 0; i < datas.size()-1; i++) {
if (datas.get(i).getDistance() > datas.get(i+1).getDistance()) {
tmp = datas.get(i);
datas.set(i, datas.get(i+1));
datas.set(i+1, tmp);
swapped = 1;
System.err.println("Swapped happening");
}
}
if (swapped == 0) {
System.err.println("Swapped end");
break;
}
}
}
But when I try the program, the order of the ArrayList is still random. Is there any problem with my logic for sorting the ArrayList of objects?
Please help... Thank you.
Why not use the Collections.sort method?
Here's how you could do it in your project:
public void sort(RArrayList<RestaurantData> datas) {
    Collections.sort(datas, new Comparator<RestaurantData>() {
        @Override
        public int compare(RestaurantData lhs, RestaurantData rhs) {
            return Float.compare(lhs.getDistance(), rhs.getDistance());
        }
    });
}
The above solution is a bit "destructive" in the sense that it changes the order of the elements in the original list, datas. If that's fine for you, go ahead and use it. Personally I prefer things less destructive, and if you have the memory to spare (meaning your list is small) you could consider this solution, which copies the list before sorting. It also assumes your RArrayList is an implementation of ArrayList or backed by it:
public List<RestaurantData> sort(RArrayList<RestaurantData> datas) {
    // Copy the elements first so the original list is left untouched.
    // (Collections.copy would throw here: it requires the destination to
    // already contain at least as many elements as the source.)
    List<RestaurantData> newList = new ArrayList<RestaurantData>(datas);
    Collections.sort(newList, new Comparator<RestaurantData>() {
        @Override
        public int compare(RestaurantData lhs, RestaurantData rhs) {
            return Float.compare(lhs.getDistance(), rhs.getDistance());
        }
    });
    return newList;
}
Another thing to consider is also to create a single instance of the Comparator used in the method, since this implementation will create one instance per call. Not sure if it's worth it though, because it will also be destroyed quite soon since the scope is local.
Here's the documentation for the Collections api
One last thing: the comparator must return an int that is negative if the elements are in the right order, positive if they are in the wrong order, and 0 if they are equal. Simply subtracting the distances does not work here, because getDistance() presumably returns a float, and the result of the subtraction cannot be returned as an int (subtraction is also overflow-prone even for int fields); Float.compare sidesteps both problems. If your ordering needs are different, implement the comparator to suit them.
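On Java 8 and later, the same ordering can be expressed more compactly with Comparator.comparingDouble (a sketch with a minimal stand-in RestaurantData; the real class from the question has more fields):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortDemo {
    static class RestaurantData {
        final String name;
        final float distance;
        RestaurantData(String name, float distance) {
            this.name = name;
            this.distance = distance;
        }
        float getDistance() { return distance; }
    }

    // comparingDouble widens the float to double (lossless),
    // so no custom compare method is needed
    static void sortByDistance(List<RestaurantData> datas) {
        datas.sort(Comparator.comparingDouble(RestaurantData::getDistance));
    }

    public static void main(String[] args) {
        List<RestaurantData> datas = new ArrayList<>();
        datas.add(new RestaurantData("B", 3.5f));
        datas.add(new RestaurantData("A", 1.2f));
        datas.add(new RestaurantData("C", 2.0f));
        sortByDistance(datas);
        for (RestaurantData r : datas) System.out.println(r.name); // A, C, B
    }
}
```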
I am writing a Java program to solve a simple math problem and I can't find a good way to get both code readability and performance:
Given a List of Node objects and an initial rank value (a double with some mathematical meaning) for each Node,
calculate a new rank for each node from the old rank values using some algorithm, then use the newly calculated ranks as the old values and repeat the calculation several times.
// this code snippet is pseudo code and does not compile
class Node {
private int property1;
....
private int propertyN;
.... //getters and setters
}
List<Node> nodes;
Map<Node,Double> rankold;
Map<Node,Double> ranknew;
void doOneIteration() {
foreach (node in nodes) {
ranknew.put(node,someAlgorithm(rankold));
}
rankold = ranknew;
ranknew = {};
}
void doCalcuation(int times) {
nodes = [...];
rankold = {...};
ranknew = {};
while (times-->0) {
doOneIteration();
}
}
This version is the easiest to read. It separates the rank logic from the Node class, so it is easy to maintain.
The problem is about performance.
Creating the HashMap and Double instances costs extra CPU and memory, and although someAlgorithm() is simple, the number of Nodes is large.
And I wrote another version without using map:
class Node {
private int property1;
....
private int propertyN;
private double rank1;
private double rank2;
.... //getters and setters
}
List<Node> nodes;
boolean useRank1 = true;
void doOneIteration() {
if (useRank1 ) {
someAlgorithmUseRank1(nodes));
} else {
someAlgorithmUseRank2(nodes));
}
useRank1 = !useRank1;
}
This version minimizes memory usage, but it duplicates the code of someAlgorithm(); if I ever change someAlgorithm() there are two blocks of code to edit, which I think is not elegant.
Here's a third code snippet I considered:
boolean useRank1 = true;
class Node {
private double rank1;
private double rank2;
double getRank() { if (useRank1) return rank1; else return rank2;}
void setRank(double r) { if (useRank1) rank2=r; else rank1=r;}
}
List<Node> nodes;
void doOneIteration() {
someAlgorithm(nodes);
useRank1 = !useRank1;
}
The problem is that the if branch in the getter and setter is executed many times unnecessarily. Modern CPUs and JIT compilers have branch prediction and many other runtime optimizations, but I am not sure how well they will work here. Another concern is that this snippet slightly increases the memory use of the Node class, since each Node holds a pointer to the enclosing object that owns useRank1 (although compared to the object header, one pointer is small).
The question: is there a good way to write "good" code for my problem, i.e. efficient in CPU and memory, and easy to read and maintain?
Without using ArrayList, Vector, or any other built-in Java data structure besides plain arrays, I need to write a method that inserts an Animal object at any position and shifts whatever is at that position (and everything after it) to the right.
While doing this I must ensure that the collection hasn't already reached its maximum size.
How do I actually write this method? My version won't compile due to errors, and I cannot figure out the problem. Can someone please help me write this method using my current code, and provide an explanation?
private Animal [] objects;
final int MAX_ANIMALS = 100;
public AnimalObject()
{
objects = new AnimalObject[MAX_ANIMALS];
}
public AnimalObject(Animal[]a)
{
objects = a;
}
public void addAnimal(Animal a,int position)
{
Animal [] newAnimal = new Animal[objects.length];
for(int i =0; i < position; i++)
{
newAnimal[i] = objects[i];
}
newAnimal[position] = a;
System.arraycopy(objects, position, newAnimal, position+1, objects.length - position);
for(int i = position+1; i < newData.length; i++)
{
newAnimal[i] = objects[i-1];
}
I will not attempt to fix your incomplete code; I will, however, offer a possible solution and a piece of advice. Let's start with the advice: do not reinvent the wheel. Unless you are doing this as an exercise, use an ArrayList. It is made precisely for this. If it is an exercise, I am not sure you will gain much by looking for answers on Stack Overflow; better to examine the ArrayList source code :)
As for a possible solution, here is one - I have tried to mimic what you have pasted above (with minor naming tweaks for readability):
public class AnimalsArray {
final int MAX_ANIMALS = 100;
private Animal[] animals;
public AnimalsArray()
{
this.animals = new Animal[MAX_ANIMALS];
}
public AnimalsArray(Animal[]a)
{
this.animals = a;
}
public void addAnimal(Animal animal, int position) {
Animal[] newAnimals = new Animal[animals.length + 1];
for (int i = 0; i < position; i++) {
newAnimals[i] = animals[i];
}
newAnimals[position] = animal;
System.arraycopy(animals, position, newAnimals, position + 1, animals.length - position);
this.animals = newAnimals;
}
}
You will obviously want to add more methods to that (such as retrieval of the array's elements). Keep in mind, however, that the above code is HIGHLY inefficient: the array grows every time an element is added!
When you want to do this right, you should:
Have the initial array be larger than required (the empty constructor does create a maximum-capacity array, but the "fixed" version of your code ignores that and allocates a new array every time an element is added anyway).
When the array needs to grow (i.e. you add an element and the array is not large enough to hold it), decide how much it should grow; an algorithm for that should be designed. When it grows, you can use the Arrays.copyOf method to allocate the larger array.
You can get more ideas on how things should be done by examining ArrayList. Apart from being more efficient it is also generic and this cannot be said about your code (this could be fixed but again - use ArrayList!).
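A sketch of that growth strategy, tracking a logical size separately from the backing array's length and doubling with Arrays.copyOf when full (the class uses String elements and illustrative names rather than the Animal class from the question):

```java
import java.util.Arrays;

public class GrowableList {
    private String[] elements = new String[4]; // small initial capacity
    private int size = 0;                      // logical element count

    public void add(String value, int position) {
        if (size == elements.length) {
            // Double the capacity instead of growing by one on each add.
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        // Shift elements at position..size-1 one slot to the right.
        System.arraycopy(elements, position, elements, position + 1, size - position);
        elements[position] = value;
        size++;
    }

    public String get(int i) { return elements[i]; }

    public int size() { return size; }
}
```

Doubling keeps the amortized copying cost of add at O(1) per element; ArrayList does essentially the same thing, growing by a factor of about 1.5.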
Assume I have a large array of relatively small objects, which I need to iterate frequently.
I would like to optimize my iteration by improving cache performance, so I would like to allocate the objects themselves [and not just the references] contiguously in memory; then I would get fewer cache misses, and the overall performance could be significantly better.
In C++, I could just allocate an array of the objects, and it will allocate them as I wanted, but in java - when allocating an array, I only allocate the reference, and the allocation is being done one object at a time.
I am aware that if I allocate the objects "at once" [one after the other], the jvm is most likely to allocate the objects as contiguous as it can, but it might be not enough if the memory is fragmented.
My questions:
Is there a way to tell the JVM to defragment the memory just before I start allocating my objects? Would that be enough to ensure [as far as possible] that the objects are allocated contiguously?
Is there a different solution to this issue?
New objects are created in the Eden space. The Eden space is never fragmented; it is always empty after a GC.
The problem you have is that when a GC is performed, objects can be rearranged randomly in memory, or even, surprisingly, in the reverse order to the one in which they are referenced.
A workaround is to store the fields as a series of arrays. I call this a column-based table instead of a row-based table.
e.g. Instead of writing
class PointCount {
double x, y;
int count;
}
PointCount[] pc = /* lots of small objects */;
use column-based data types:
class PointCounts {
double[] xs, ys;
int[] counts;
}
or
class PointCounts {
TDoubleArrayList xs, ys;
TIntArrayList counts;
}
The arrays themselves could be in up to three different places, but the data is otherwise always continuous. This can even be marginally more efficient if you perform operations on a subset of fields.
public int totalCount() {
    int sum = 0;
    // counts are contiguous, with nothing between the values
    for (int i : counts) sum += i;
    return sum;
}
A solution I use to avoid GC overhead when holding large amounts of data is to access a direct or memory-mapped ByteBuffer through an interface:
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
public class MyCounters {
public static void main(String... args) {
Runtime rt = Runtime.getRuntime();
long used1 = rt.totalMemory() - rt.freeMemory();
long start = System.nanoTime();
int length = 100 * 1000 * 1000;
PointCount pc = new PointCountImpl(length);
for (int i = 0; i < length; i++) {
pc.index(i);
pc.setX(i);
pc.setY(-i);
pc.setCount(1);
}
for (int i = 0; i < length; i++) {
pc.index(i);
if (pc.getX() != i) throw new AssertionError();
if (pc.getY() != -i) throw new AssertionError();
if (pc.getCount() != 1) throw new AssertionError();
}
long time = System.nanoTime() - start;
long used2 = rt.totalMemory() - rt.freeMemory();
System.out.printf("Creating an array of %,d used %,d bytes of heap and took %.1f seconds to set and get%n",
length, (used2 - used1), time / 1e9);
}
}
interface PointCount {
// set the index of the element referred to.
public void index(int index);
public double getX();
public void setX(double x);
public double getY();
public void setY(double y);
public int getCount();
public void setCount(int count);
public void incrementCount();
}
class PointCountImpl implements PointCount {
static final int X_OFFSET = 0;
static final int Y_OFFSET = X_OFFSET + 8;
static final int COUNT_OFFSET = Y_OFFSET + 8;
static final int LENGTH = COUNT_OFFSET + 4;
final ByteBuffer buffer;
int start = 0;
PointCountImpl(int count) {
this(ByteBuffer.allocateDirect(count * LENGTH).order(ByteOrder.nativeOrder()));
}
PointCountImpl(ByteBuffer buffer) {
this.buffer = buffer;
}
@Override
public void index(int index) {
start = index * LENGTH;
}
@Override
public double getX() {
return buffer.getDouble(start + X_OFFSET);
}
@Override
public void setX(double x) {
buffer.putDouble(start + X_OFFSET, x);
}
@Override
public double getY() {
return buffer.getDouble(start + Y_OFFSET);
}
@Override
public void setY(double y) {
buffer.putDouble(start + Y_OFFSET, y);
}
@Override
public int getCount() {
return buffer.getInt(start + COUNT_OFFSET);
}
@Override
public void setCount(int count) {
buffer.putInt(start + COUNT_OFFSET, count);
}
@Override
public void incrementCount() {
setCount(getCount() + 1);
}
}
run with the -XX:-UseTLAB option (to get accurate memory allocation sizes) prints
Creating an array of 100,000,000 used 12,512 bytes of heap and took 1.8 seconds to set and get
As it's off heap, it has next to no GC impact.
Sadly, there is no way of ensuring objects are created/stay at adjacent memory locations in Java.
However, objects created in sequence will most likely end up adjacent to each other (of course this depends on the actual VM implementation). I'm pretty sure that the writers of the VM are aware that locality is highly desirable and don't go out of their way to scatter objects randomly around.
The Garbage Collector will at some point probably move the objects. If your objects are short-lived, that should not be an issue; for long-lived objects it depends on how the GC moves the survivors. Again, I think it's reasonable to assume that the people writing the GC have given the matter some thought and will perform copies in a way that does not hurt locality more than is unavoidable.
There are obviously no guarantees for any of the above assumptions, but since we can't do anything about it anyway, stop worrying :)
The only thing you can do at the java source level is to sometimes avoid composition of objects - instead you can "inline" the state you would normally put in a composite object:
class MyThing {
int myVar;
// ... more members
// composite object
Rectangle bounds;
}
instead:
class MyThing {
int myVar;
// ... more members
// "inlined" rectangle
int x, y, width, height;
}
Of course this makes the code less readable and duplicates potentially a lot of code.
Ordering class members by access pattern seems to have a slight effect (I noticed a small change in a benchmarked piece of code after I had reordered some declarations), but I've never bothered to verify whether it's real. It would make sense, though, if the VM does not reorder members.
On the same topic it would also be nice (from a performance view) to be able to reinterpret an existing primitive array as another type (e.g. cast int[] to float[]). And while we're at it, why not wish for union members as well? I sure do.
But we'd have to give up a lot of platform and architecture independency in exchange for these possibilities.
Doesn't work that way in Java. Iteration is not a matter of increasing a pointer. There is no performance impact based on where on the heap the objects are physically stored.
If you still want to approach this in a C/C++ way, think of a Java array as an array of pointers to structs. When you loop over the array, it doesn't matter where the actual structs are allocated, you are looping over an array of pointers.
I would abandon this line of reasoning. It's not how Java works and it's also sub-optimization.