No matter the size of the graph or the server I use, any time I attempt to route with the dijkstra_one_to_many algorithm, I overflow my heap. The test environment is an m3.2xlarge with 30 GB of RAM and 2x80 GB SSD drives.
java.lang.OutOfMemoryError: Java heap space
I've tracked the problem down to this block inside com.graphhopper.routing.DijkstraOneToMany, in the findEndNode method:
while (true) {
visitedNodes++;
EdgeIterator iter = outEdgeExplorer.setBaseNode(currNode);
while (iter.next()) {
int adjNode = iter.getAdjNode();
int prevEdgeId = edgeIds[adjNode];
if (!accept(iter, prevEdgeId))
continue;
double tmpWeight = weighting.calcWeight(iter, false, prevEdgeId) + weights[currNode];
if (Double.isInfinite(tmpWeight))
continue;
double w = weights[adjNode];
if (w == Double.MAX_VALUE) {
parents[adjNode] = currNode;
weights[adjNode] = tmpWeight;
heap.insert_(tmpWeight, adjNode);
changedNodes.add(adjNode);
edgeIds[adjNode] = iter.getEdge();
} else if (w > tmpWeight) {
parents[adjNode] = currNode;
weights[adjNode] = tmpWeight;
heap.update_(tmpWeight, adjNode);
changedNodes.add(adjNode);
edgeIds[adjNode] = iter.getEdge();
}
}
if (heap.isEmpty() || isMaxVisitedNodesExceeded() || isWeightLimitExceeded())
return NOT_FOUND;
// calling just peek and not poll is important if the next query is cached
currNode = heap.peek_element();
if (finished())
return currNode;
heap.poll_element();
}
It never seems to find the end node, and the internal data structure (a min-heap?) just grows and grows until I run out of heap space. Why is this happening?
I can post my config.properties as well if that is needed. Thank you Peter for putting together an awesome piece of open source software.
The DijkstraOneToMany class is currently not intended to be (easily) used from the outside, e.g. it is not thread safe. You could switch to a plain Dijkstra, without the special finish condition, to lower your memory requirements for simple cases.
That said ... there can be the following issues:
make sure that you cache the DijkstraOneToMany instances across calls, as the class creates big initial data structures
again: use it from one thread only (e.g. via a ThreadLocal; see the sketch after this list)
"It seems to never find the end node" -> maybe you are using the QueryGraph with it? That will not really work, as we create so-called virtual nodes in the QueryGraph which DijkstraOneToMany does not know about. Instead, try to pick the next tower node, e.g. by avoiding the QueryGraph completely or by reaching it manually via an EdgeIterator.
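A minimal sketch of the ThreadLocal point, assuming a hypothetical createOneToMany() factory that stands in for however you construct the algorithm in your setup (findEndNode is the method from the snippet above):

private final ThreadLocal<DijkstraOneToMany> algoPerThread =
        ThreadLocal.withInitial(this::createOneToMany);

public int routeEndNode(int fromNode, int toNode) {
    // each thread reuses its own cached instance, so the big initial
    // data structures are allocated once per thread instead of per call
    return algoPerThread.get().findEndNode(fromNode, toNode);
}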
Thank you Peter for putting together an awesome piece of open source software.
It was not just me - it is a community effort :) !
Related
I have a problem which is puzzling me. I'm indexing a corpus of 17,000 text files, and while doing this, I'm also storing all the k-grams (length-k parts of words) for each word in a HashMap to be used later:
public void insert( String token ) {
//For example, car should result in "^c", "ca", "ar" and "r$" for a 2-gram index
// Check if token has already been seen. if it has, all the
// k-grams for it have already been added.
if (term2id.get(token) != null) {
return;
}
id2term.put(++lastTermID, token);
term2id.put(token, lastTermID);
// Is the word long enough? For example, "a" can be bigrammed and trigrammed but not four-grammed.
// K must be <= token.length() + 2; e.g. for "ab", K must be <= 4.
List<KGramPostingsEntry> postings = null;
if(K > token.length() + 2) {
return;
}else if(K == token.length() + 2) {
// insert the one K-gram "^<String token>$" into index
String kgram = "^"+token+"$";
postings = index.get(kgram);
SortedSet<String> kgrams = new TreeSet<String>();
kgrams.add(kgram);
term2KGrams.put(token, kgrams);
if (postings == null) {
KGramPostingsEntry newEntry = new KGramPostingsEntry(lastTermID);
ArrayList<KGramPostingsEntry> newList = new ArrayList<KGramPostingsEntry>();
newList.add(newEntry);
index.put("^"+token+"$", newList);
}
// No need to do anything if the posting already exists, so no else clause. There is only one possible term in this case
// Return since we are done
return;
}else {
// We get here if there is more than one k-gram in our term
// insert all k-grams in token into index
int start = 0;
int end = start+K;
//add ^ and $ to token.
String wrappedToken = "^"+token+"$";
int noOfKGrams = wrappedToken.length() - end + 1;
// get K-Grams
String kGram;
int startCurr, endCurr;
SortedSet<String> kgrams = new TreeSet<String>();
for (int i=0; i<noOfKGrams; i++) {
startCurr = start + i;
endCurr = end + i;
kGram = wrappedToken.substring(startCurr, endCurr);
kgrams.add(kGram);
postings = index.get(kGram);
KGramPostingsEntry newEntry = new KGramPostingsEntry(lastTermID);
// if this k-gram has been seen before
if (postings != null) {
// Add this token to the existing postingsList.
// We can be sure that the list doesn't contain the token
// already, else we would previously have terminated the
// execution of this function.
int lastTermInPostings = postings.get(postings.size()-1).tokenID;
if (lastTermID == lastTermInPostings) {
continue;
}
postings.add(newEntry);
index.put(kGram, postings);
}
// if this k-gram has not been seen before
else {
ArrayList<KGramPostingsEntry> newList = new ArrayList<KGramPostingsEntry>();
newList.add(newEntry);
index.put(kGram, newList);
}
}
Clock c = Clock.systemDefaultZone();
long timestart = c.millis();
System.out.println(token);
term2KGrams.put(token, kgrams);
long timestop = c.millis();
System.out.printf("time taken to put: %d\n", timestop-timestart);
System.out.print("put ");
System.out.println(kgrams);
System.out.println();
}
}
The insertion into the HashMap happens on the lines reading term2KGrams.put(token, kgrams); (there are two of them in the code snippet). When indexing, everything works fine until suddenly, at 15,000 indexed files, things go bad. Everything slows down immensely, and the program doesn't finish in a reasonable time, if at all.
To try to understand this problem, I've added some prints at the end of the function. This is the output they generate:
http://soccer.org
time taken to put: 0
put [.or, //s, /so, ://, ^ht, cce, cer, er., htt, occ, org, p:/, r.o, rg$, soc, tp:, ttp]
aysos
time taken to put: 0
put [^ay, ays, os$, sos, yso]
http://www.davisayso.org/contacts.htm
time taken to put: 0
put [.da, .ht, .or, //w, /co, /ww, ://, ^ht, act, avi, ays, con, cts, dav, g/c, htm, htt, isa, nta, o.o, ont, org, p:/, rg/, s.h, say, so., tac, tm$, tp:, ts., ttp, vis, w.d, ww., www, yso]
playsoccer
time taken to put: 0
put [^pl, ays, cce, cer, er$, lay, occ, pla, soc, yso]
This looks fine to me: the puts don't seem to be taking a long time, and the k-grams (in this case trigrams) are correct.
But one can see strange behaviour in the pace at which my computer prints this information. In the beginning, everything prints at super high speed. But at 15,000 files, that speed stops, and instead my computer starts printing a few lines at a time, which of course means that indexing the remaining 2,000 files of the corpus will take an eternity.
Another interesting thing I observed was when I did a keyboard interrupt (Ctrl+C) after it had been printing erratically and slowly, as described, for a while. It gave me this message:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.StringLatin1.newString(StringLatin1.java:549)
sahandzarrinkoub@Sahands-MBP:~/Documents/Programming/Information Retrieval/lab3 2$ sh compile_all.sh
Note: ir/PersistentHashedIndex.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Does this mean I'm out of memory? Is that the issue? If so, that's surprising, because I've been storing quite a lot of things in memory before this, such as a HashMap containing the document IDs of every single word in the corpus, a HashMap containing every single word in which each k-gram appears, etc.
Please let me know what you think and what I can do to fix this problem.
To understand this, you must first understand that Java does not allocate memory dynamically (or, at least, not indefinitely). The JVM is by default configured with a minimum heap size and a maximum heap size. When the maximum heap size would be exceeded by some allocation, you get an OutOfMemoryError.
You can change the minimum and maximum heap size for your execution with the VM parameters -Xms and -Xmx, respectively. An example of an execution with at least 2 but at most 4 GB would be:
java -Xms2g -Xmx4g ...
You can find more options on the man page for java.
Before changing the heap size, however, take a close look at your system resources, especially whether your system starts swapping. If your system swaps, a larger heap size may let the program run longer, but with equally bad performance. The only options then would be to optimize your program to use less memory, or to upgrade the RAM of your machine.
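If you want to confirm from inside the program how much heap the JVM actually has and is using, a quick sanity check along these lines can help (the values are approximate):

Runtime rt = Runtime.getRuntime();
// maxMemory() reflects -Xmx; total - free is the currently occupied heap
System.out.println("max heap: " + rt.maxMemory() / (1024 * 1024) + " MB");
System.out.println("used now: " + (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024) + " MB");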
I'm working on a programming practice site that asked me to implement a method that merges two sorted arrays. This is my solution:
public static int[] merge(int[] arrLeft, int[] arrRight){
int[] merged = new int[arrRight.length + arrLeft.length];
Queue<Integer> leftQueue = new LinkedList<>();
Queue<Integer> rightQueue = new LinkedList<>();
for(int i = 0; i < arrLeft.length ; i ++){
leftQueue.add(arrLeft[i]);
}
for(int i = 0; i < arrRight.length; i ++){
rightQueue.add(arrRight[i]);
}
int index = 0;
while (!leftQueue.isEmpty() || !rightQueue.isEmpty()){
int largerLeft = leftQueue.isEmpty() ? Integer.MAX_VALUE : leftQueue.peek();
int largerRight = rightQueue.isEmpty() ? Integer.MAX_VALUE : rightQueue.peek();
if(largerLeft > largerRight){
merged[index] = largerRight;
rightQueue.poll();
} else{
merged[index] = largerLeft;
leftQueue.poll();
}
index ++;
}
return merged;
}
But this is the official solution:
public static int[] merge(int[] arrLeft, int[] arrRight){
// Grab the lengths of the left and right arrays
int lenLeft = arrLeft.length;
int lenRight = arrRight.length;
// Create a new output array with the size = sum of the lengths of left and right
// arrays
int[] arrMerged = new int[lenLeft+lenRight];
// Maintain 3 indices, one for the left array, one for the right and one for
// the merged array
int indLeft = 0, indRight = 0, indMerged = 0;
// While neither array is empty, run a while loop to merge
// the smaller of the two elements, starting at the leftmost position of
// both arrays
while(indLeft < lenLeft && indRight < lenRight){
if(arrLeft[indLeft] < arrRight[indRight])
arrMerged[indMerged++] = arrLeft[indLeft++];
else
arrMerged[indMerged++] = arrRight[indRight++];
}
// Another while loop for when the left array still has elements left
while(indLeft < lenLeft){
arrMerged[indMerged++] = arrLeft[indLeft++];
}
// Another while loop for when the right array still has elements left
while(indRight < lenRight){
arrMerged[indMerged++] = arrRight[indRight++];
}
return arrMerged;
}
Apparently, none of the other users' solutions on the site made use of a queue either. I'm wondering if using a Queue is less efficient. Could I be penalized for using a queue in an interview, for example?
As the question already states that the left and right input arrays are sorted, this gives you a hint that you should be able to solve the problem without requiring a data structure other than an array for the output.
In a real interview, it is likely that the interviewer will ask you to talk through your thought process while you are coding the solution. They may state that they want the solution implemented with certain constraints. It is very important to make sure that the problem is well defined before you start your coding. Ask as many questions as you can think of to constrain the problem as much as possible before starting.
When you are done implementing your solution, you could mention the time and space complexity of your implementation and suggest an alternative, more efficient solution.
For example, when describing your implementation you could talk about the following:
There is overhead when creating the queues
The big O notation / time and space complexity of your solution
You are unnecessarily iterating over every element of the left and right input array to create the queues before you do any merging
etc...
These types of interview questions are common when applying for positions at companies like Google, Microsoft, Amazon, and some tech startups. To prepare for such questions, I recommend you work through problems in books such as Cracking the Coding Interview. The book covers how to approach such problems, and the interview process for these kinds of companies.
Sorry to say but your solution with queues is horrible.
You are copying all elements to auxiliary dynamic data structures (which can be highly costly because of memory allocations), then back to the destination array.
A big "disadvantage" of merging is that it requires twice the storage space as it cannot be done in-place (or at least no the straightforward way). But you are spoiling things to a much larger extent by adding extra copies and overhead, unnecessarily.
The true solution is to copy directly from source to destination, leading to simpler and much more efficient code.
Also note that using a sentinel value (Integer.MAX_VALUE) when one of the queues is exhausted only seems like a good idea: it adds extra comparisons whose outcome you know in advance. It is much better to split the work into three loops, as in the reference code.
Lastly, your solution can fail when the data happens to contain Integer.MAX_VALUE.
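A hypothetical call that triggers it: once the left queue is empty while Integer.MAX_VALUE is still in the right queue, both sides of the comparison are MAX_VALUE, the else branch writes the sentinel and polls the empty left queue, the right queue is never drained, and index eventually runs off the end of merged.

int[] result = merge(new int[]{}, new int[]{Integer.MAX_VALUE});
// the loop never removes MAX_VALUE from the right queue and
// eventually throws ArrayIndexOutOfBoundsException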
I'm a student, and my team and I have to make a simulation of students' behaviour on a campus (making "groups of friends", walking around, etc.). To find the path a student has to take, I used the A* algorithm (as I found out that it's one of the fastest pathfinding algorithms). Unfortunately, our simulation doesn't run smoothly (it takes 1-2 seconds between successive iterations). I wanted to optimize the algorithm, but I have no idea what more I can do. Can you guys help me out and tell me whether it's possible to optimize my A* implementation? Here is the code:
public LinkedList<Field> getPath(Field start, Field exit) {
LinkedList<Field> foundPath = new LinkedList<Field>();
LinkedList<Field> opensList= new LinkedList<Field>();
LinkedList<Field> closedList= new LinkedList<Field>();
Hashtable<Field, Integer> gscore = new Hashtable<Field, Integer>();
Hashtable<Field, Field> cameFrom = new Hashtable<Field, Field>();
Field x = new Field();
gscore.put(start, 0);
opensList.add(start);
while(!opensList.isEmpty()){
int min = -1;
//searching for minimal F score
for(Field f : opensList){
if(min==-1){
min = gscore.get(f)+getH(f,exit);
x = f;
}else{
int currf = gscore.get(f)+getH(f,exit);
if(min > currf){
min = currf;
x = f;
}
}
}
if(x == exit){
//path reconstruction
Field curr = exit;
while(curr != start){
foundPath.addFirst(curr);
curr = cameFrom.get(curr);
}
return foundPath;
}
opensList.remove(x);
closedList.add(x);
for(Field y : x.getNeighbourhood()){
if(!(y.getType()==FieldTypes.PAVEMENT ||y.getType() == FieldTypes.GRASS) || closedList.contains(y) || !(y.getStudent()==null))
{
continue;
}
int tentGScore = gscore.get(x) + getDist(x,y);
boolean distIsBetter = false;
if(!opensList.contains(y)){
opensList.add(y);
distIsBetter = true;
}else if(tentGScore < gscore.get(y)){
distIsBetter = true;
}
if(distIsBetter){
cameFrom.put(y, x);
gscore.put(y, tentGScore);
}
}
}
return foundPath;
}
private int getH(Field start, Field end){
int x;
int y;
x = start.getX()-end.getX();
y = start.getY() - end.getY();
if(x<0){
x = x* (-1);
}
if(y<0){
y = y * (-1);
}
return x+y;
}
private int getDist(Field start, Field end){
int ret = 0;
if(end.getType() == FieldTypes.PAVEMENT){
ret = 8;
}else if(start.getX() == end.getX() || start.getY() == end.getY()){
ret = 10;
}else{
ret = 14;
}
return ret;
}
//EDIT
This is what I got from jProfiler:
So getH is a bottleneck, yes? Maybe remembering the H score of each field would be a good idea?
A linked list is not a good data structure for the open set. You have to find the node with the smallest F in it, and you can either search through the list in O(n) or insert in sorted position in O(n); either way it's O(n). With a heap it's only O(log n). Updating the G score would remain O(n) (since you have to find the node first), unless you also maintained a hash table from nodes to their indexes in the heap.
A linked list is also not a good data structure for the closed set, where you need a fast "contains" check, which is O(n) on a linked list. You should use a HashSet for that.
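A rough sketch of both changes, reusing Field, getH and getDist from the question (not drop-in code: java.util.PriorityQueue has no decrease-key operation, so the usual workaround is to insert a fresh entry whenever a better G score is found and to skip entries whose node is already closed):

// imports: java.util.*
public LinkedList<Field> getPath(Field start, Field exit) {
    Map<Field, Integer> gscore = new HashMap<>();
    Map<Field, Field> cameFrom = new HashMap<>();
    Set<Field> closed = new HashSet<>();              // O(1) contains
    // entries are (node, F = G + H), ordered by F
    PriorityQueue<Map.Entry<Field, Integer>> open =
            new PriorityQueue<>((a, b) -> Integer.compare(a.getValue(), b.getValue()));

    gscore.put(start, 0);
    open.add(new AbstractMap.SimpleEntry<>(start, getH(start, exit)));
    while (!open.isEmpty()) {
        Field x = open.poll().getKey();               // O(log n), no linear scan
        if (closed.contains(x))
            continue;                                 // skip stale duplicate entry
        if (x.equals(exit))
            break;                                    // reconstruct path via cameFrom
        closed.add(x);
        for (Field y : x.getNeighbourhood()) {
            // the same walkability/occupancy checks as in the question go here
            if (closed.contains(y))
                continue;
            int tentative = gscore.get(x) + getDist(x, y);
            Integer known = gscore.get(y);
            if (known == null || tentative < known) {
                gscore.put(y, tentative);
                cameFrom.put(y, x);
                open.add(new AbstractMap.SimpleEntry<>(y, tentative + getH(y, exit)));
            }
        }
    }
    LinkedList<Field> foundPath = new LinkedList<Field>();
    // walk cameFrom from exit back to start, as in the question's reconstruction
    return foundPath;
}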
You can also optimize by using a different algorithm; the following page illustrates and compares many different algorithms and heuristics:
A*
IDA*
Dijkstra
JumpPoint
...
http://qiao.github.io/PathFinding.js/visual/
From your implementation, it seems that you are using a naive A* algorithm. Consider the following:
A* is implemented with a priority queue, similarly to BFS.
The heuristic function is evaluated at each node to rank its fitness as the next node to visit.
As a node is visited, its unvisited neighbours are added to the queue with their heuristic values as keys.
Continue until every value left in the queue is worse than the value calculated for the goal state.
Find the bottlenecks of your implementation using a profiler; jProfiler, for example, is easy to use.
Use threads in the areas where the algorithm can run in parallel.
Tune your JVM to run faster.
Allocate more RAM.
a) As mentioned, you should use a heap in A*: either a basic binary heap or a pairing heap, which should theoretically be faster.
b) On larger maps, it is inevitable that the algorithm needs some time to run (i.e., when you request a path, it will simply take a while). What you can do is use some local navigation (e.g., "run directly towards the target") while the real path is being computed.
c) If you have a reasonable number of locations (e.g., in a navmesh) and some time at the start of your program, why not use the Floyd-Warshall algorithm? With that, you can look up where to go next in O(1).
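A minimal Floyd-Warshall sketch, under the assumption that you can number your walkable Fields 0..n-1 and build an n x n cost matrix up front (INF is Integer.MAX_VALUE / 2 so that adding two "unreachable" costs cannot overflow):

static final int INF = Integer.MAX_VALUE / 2;

/** dist[i][j] = direct edge cost from location i to j, or INF; dist[i][i] = 0. */
static int[][] floydWarshallNextHop(int[][] dist) {
    int n = dist.length;
    int[][] next = new int[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            next[i][j] = dist[i][j] < INF ? j : -1;   // direct edge or unknown
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j]) {
                    dist[i][j] = dist[i][k] + dist[k][j];
                    next[i][j] = next[i][k];          // going via k: take the first step towards k
                }
    return next; // next[i][j] is the first step from i towards j, an O(1) lookup
}

The O(n^3) precomputation only pays off if n is modest and paths are requested very often, which is exactly the campus-simulation case.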
I built a new pathfinding algorithm called Fast* (or Fastaer). It is a BFS-like A*, but faster and more efficient than A*, with about 90% of A*'s accuracy. Please see this link for info and a demo:
https://drbendanilloportfolio.wordpress.com/2015/08/14/fastaer-pathfinder/
It has a fast greedy line tracer to make paths straighter.
The demo file has it all. Check the Task Manager while using the demo for performance metrics. So far, in profiling runs of this build, the maximum surviving generation is 4 with low to no GC time.
Suppose you want to find the middle node of a linked list as efficiently as possible. The most typical "best" answer given is to maintain two pointers, middle and current, and to increment the middle pointer when the number of elements encountered is divisible by 2. Hence, we can find the middle in one pass. Efficient, right? Better than brute force, which involves one pass to the end, then one more pass until we reach size/2.
BUT... not so fast. Why is the first method faster than the "brute force" way? In the first method, we're incrementing the middle pointer approximately size/2 times. But in the brute-force way, in our second pass, we're traversing the list until we reach the (size/2)th node. So aren't these two methods the same? Why is the first better than the second?
//finding middle element of LinkedList in single pass
LinkedList.Node current = head;
int length = 0;
LinkedList.Node middle = head;
while(current.next() != null){
length++;
if(length%2 ==0){
middle = middle.next();
}
current = current.next();
}
if(length%2 == 1){
middle = middle.next();
}
If we modify the code to be:
while(current.next() != null){
current = current.next();
middle = middle.next();
if(current.next() != null){
current = current.next();
}
}
Now there are fewer assignments, since length does not have to be incremented, and I believe this will give an identical result.
At the end of the day both solutions are O(N) so it is a micro-optimization.
As @Oleg Mikheev suggested, why can't we use Floyd's cycle-finding algorithm to find the middle element, as follows:
private int findMiddleElement() {
if (head == null)
return -1; // return -1 for empty linked list
Node temp = head;
Node oneHop, twoHop;
oneHop = twoHop = temp;
while (twoHop != null && twoHop.next != null) {
oneHop = oneHop.next;
twoHop = twoHop.next.next;
}
return oneHop.data;
}
The first answer has multiple advantages:
Since the two methods are of the same complexity O(N), any analysis on the efficiency needs to be careful, maybe involving the specific implementation and cost model. However, for the most naive implementation, the first method can save some loop variable increments.
It saves you one variable's worth of space: the two pointers vs. the length, the counter, and one pointer. Also, what if it is a huge list and the length overflows?
However, if you consider a specific cost model, the second method might be much better: if the elements are adjacent in memory and the list is large enough that the cache can only hold one stretch of contiguous memory at a time, the first method's two simultaneous traversal points might incur extra memory-access cost. At the end of the day, the two methods are mostly equivalent. Of course, the technique used in the first method is flashier, and the thought process might be useful in other contexts.
public void middle(){
node slow=start.next;
node fast=start.next;
while(fast!=null && fast.next!=null) // guard fast itself, else even-length lists throw a NullPointerException
{
slow=slow.next;
fast=fast.next.next;
}
System.out.println(slow.data);
}
10->9->8->7->6->5->4->3->2->1->
5
This is a classic job interview question.
They don't want you to just come up with any O(n) algorithm, because both approaches have O(n) complexity. Most people will say there's no way to know where the middle is without traversing the list once (so traversing once to find the length and a second time to find the middle counts as two passes to the interviewer). They want you to think outside the box and figure out the approach you mentioned, with the two pointers.
So the complexity is the same, but the way of thinking is different, and the people interviewing you want to see that.
I've implemented a simple B-tree which maps longs to ints. Now I wanted to estimate its memory usage with the following method (this applies to a 32-bit JVM only):
class BTreeEntry {
int entrySize;
long keys[];
int values[];
BTreeEntry children[];
boolean isLeaf;
...
/** @return used bytes */
long capacity() {
long cap = keys.length * (8 + 4) + 3 * 12 + 4 + 1; // 8 bytes per long key + 4 per int value, 3 array headers of 12 bytes, 4 for entrySize, 1 for isLeaf
if (!isLeaf) {
cap += children.length * 4;
for (int i = 0; i < children.length; i++) {
if (children[i] != null)
cap += children[i].capacity();
}
}
return cap;
}
}
/** @return memory usage in MB */
public int memoryUsage() {
return Math.round(rootEntry.capacity() / (1 << 20));
}
But when I tried it, e.g. for 7 million entries, the memoryUsage method reported much higher values than the -Xmx setting would allow! E.g. it says 1040 (MB) while I set -Xmx300! Is the JVM somehow able to optimize the memory layout, e.g. for empty arrays, or what could be my mistake?
Update 1: OK, introducing the isLeaf boolean reduces memory usage a lot, but it is still unclear why I observed higher values than -Xmx allowed. (You can still try this out by using isLeaf == false in all constructors.)
Update 2: Hmmh, something is very wrong. When increasing the number of entries per leaf, one would assume that memory usage decreases (when compacting in both cases), because larger arrays mean less reference overhead (and the B-tree has a smaller height). But the memoryUsage method reports an increased value if I use 500 instead of 100 entries per leaf.
Ohh sh... a bit of fresh air solved this issue ;)
When an entry is full, it gets split. In my original split method, checkSplitEntry (where I wanted to avoid wasting memory), I made a big memory-wasting mistake:
// left child: just copy pointer and decrease size to index
BTreeEntry newLeftChild = this;
newLeftChild.entrySize = splitIndex;
The problem here is that the old children pointers are still reachable, and so in my memoryUsage method I'm counting some children twice (especially when I did not compact!). Without this mistake, all should be fine, and my B-tree will be even more memory efficient, as the garbage collector can do its work!
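For anyone hitting the same thing, the fix amounts to something like this (a sketch against the fields shown above; the exact loop bounds are illustrative and depend on how the split distributes the entries):

// reuse `this` as the left child as before ...
BTreeEntry newLeftChild = this;
newLeftChild.entrySize = splitIndex;
// ... but also null out the slots that now belong to the right child,
// so they can be garbage collected and are not counted twice by capacity()
for (int i = splitIndex + 1; i < children.length; i++)
    children[i] = null;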