StackOverflow while adding elements to a BST - java

private Node put(Node x, Float longitude, Float latitude, String place, String address) {
    if (x == null)
        return new Node(longitude, latitude, place, address, 1);
    int cmpX = longitude.compareTo(x.longitude);
    int cmpY = latitude.compareTo(x.latitude);
    if (cmpX < 0 | (cmpX == 0 && cmpY < 0)) {
        if (x.left == null) { x.left = new Node(longitude, latitude, place, address, x.N); }
        else { x.left = put(x.left, longitude, latitude, place, address); }
    } else if (cmpX >= 0) {
        if (x.right == null) { x.right = new Node(longitude, latitude, place, address, x.N); }
        else { x.right = put(x.right, longitude, latitude, place, address); }
    }
    x.N = 1 + size(x.left) + size(x.right);
    return x;
}
I have this code that I'm trying to use to insert into a BST. It works for the first 3000 or so elements before causing a StackOverflowError. How do I prevent this from happening?

The reason you encountered a StackOverflowError is that you inserted your items in an order that was already sorted. Let's see what happens in this case. For simplicity, I'll use integers, even though you have more complicated objects.
After inserting 1:
Root(1)
After inserting 2:
Root(1)
    \
    (2)
After inserting 3:
Root(1)
    \
    (2)
        \
        (3)
After inserting n:
Root(1)
    \
    (2)
        \
        ...
            \
            (n)
You implemented put as a recursive method. Because of that, when you have added 3000 elements and you attempt to add the 3001st element, you need to recurse 3000 times, so there are 3000 frames of put on the call stack, which is about where you say it overflows.
You can convert put into an iterative method, using a while loop instead of recursion. This will eliminate the StackOverflowError, but it won't solve the root (so to speak) of the problem -- that your BST looks like a linked list.
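For illustration, here is a minimal iterative sketch of put; it keeps the same comparison order and the same Node fields (longitude, latitude, place, address, left, right, N) as your code, but treat it as a starting point rather than a drop-in, tested replacement:

private Node put(Node x, Float longitude, Float latitude, String place, String address) {
    Node inserted = new Node(longitude, latitude, place, address, 1);
    if (x == null) return inserted;
    Node current = x;
    while (true) {
        current.N++;  // every node on the path gains one node in its subtree
        int cmpX = longitude.compareTo(current.longitude);
        int cmpY = latitude.compareTo(current.latitude);
        boolean goLeft = cmpX < 0 || (cmpX == 0 && cmpY < 0);
        if (goLeft) {
            if (current.left == null) { current.left = inserted; break; }
            current = current.left;
        } else {
            if (current.right == null) { current.right = inserted; break; }
            current = current.right;
        }
    }
    return x;
}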
You can submit your elements in a random order. It may look like this:
         Root(1123)
        /          \
    (799)        (2800)
    /   \        /    \
 (64)  (999)  (1599) (2901)
It may not be properly balanced, but most likely it will not devolve into the linked list situation you have from inserting in sorted order.
You can perform rotations about a specific node when one branch gets too large compared to the other branch. If you're feeling adventurous, you can implement a red-black tree, a BST that uses rotations to keep the tree balanced "enough".
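If you go that route, a left rotation in this size-augmented style could look roughly like the sketch below; the field names mirror the Node in the question, and the rest is an assumption, not code from your project:

// Hypothetical left rotation around x, assuming x.right != null and the
// same Node fields (left, right, N) and size() helper as in the question.
private Node rotateLeft(Node x) {
    Node r = x.right;          // r moves up to take x's place
    x.right = r.left;          // x adopts r's left subtree
    r.left = x;
    r.N = x.N;                 // r now roots the same set of nodes x did
    x.N = 1 + size(x.left) + size(x.right);
    return r;                  // the caller re-links r where x used to be
}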

Related

Binary Tree In Order creation using Arithmetic Expression Queue

I'm trying to implement code that takes a queue, titled 'input,' that has an arithmetic expression.
The expression looks like this:
5 - a * 6 + b
This expression is stored into a queue, and I have a method that checks the type of whatever I'm popping from my queue. The method is called getIt() and it returns an integer that tells whether the item is an operand or an operator: 1 means operand; 0 means operator. So if I call queue.pop(), which should pop the value '5', the getIt() call on that popped value would return 1.
I have an add method that takes in a queue and is supposed to add the values of the queue into a binary tree using the in order system.
BinNode<Term> current = new BinNode<>();
BinNode<Term> parent = new BinNode<>();
current = null;
parent = null;
for (int i = 0; i < input.size(); i++) {
    if (input.pop().getType() == 1) {
        current = new BinNode<Term>(input.pop());
    } else if (input.pop().getType() == 0) {
        parent = new BinNode<Term>(input.pop());
        parent.setLeft(current);
    }
}
In this code, I create two nodes called current and parent. I created a loop that iterates through the queue and checks whether a popped value is an operand; if it is, I set the current node to that popped value. If it's an operator, I set the parent node to it and set its left child to current. My issue here is how I finish the tree. If the goal is for the tree to appear similar to
     *
   /   \
  -     +
 / \   / \
5   a 6   b
I don't know how I can get my code to reflect this.

Weighted Quick Union Find

I am taking an algorithms course where they go over weighted quick union find. I am confused about why we are concerned about the size of a tree as opposed to the depth?
When I tried writing out the code, my code looked different than the solution provided.
From my understanding, the size of the tree (the total number of nodes in the tree) is not as important as the depth of the tree when it comes to the run time of the union function (lg n), because it is the depth that determines how many lookups are needed to get to the root of a node?
Thanks
My code:
public void union(int p, int q) {
    int root_p = root(p);
    int root_q = root(q);
    // If the two trees are not already connected, union them
    if (root_p != root_q) {
        // The two trees aren't connected, check which is deeper
        // Attach the one that is more shallow to the deeper one
        if (depth[root_p] > depth[root_q]) {
            // p is deeper, point q's root to p
            id[root_q] = root_p;
        } else if (depth[root_q] > depth[root_p]) {
            // q is deeper, point p's root to q
            id[root_p] = root_q;
        } else {
            // They are of equal depth, point q's root to p and increment p's depth by 1
            id[root_q] = root_p;
            depth[root_p] += 1;
        }
    }
}
Solution code provided:
public void union(int p, int q) {
    int rootP = find(p);
    int rootQ = find(q);
    if (rootP == rootQ) return;
    // make smaller root point to larger one
    if (size[rootP] < size[rootQ]) {
        parent[rootP] = rootQ;
        size[rootQ] += size[rootP];
    }
    else {
        parent[rootQ] = rootP;
        size[rootP] += size[rootQ];
    }
    count--;
}
You are correct that the depth (actually height) is more directly related to the run time, but using either one will result in O(log N) run time for union and find.
The proof is easy -- given that when we begin (when all sets are disjoint) every root with height h has at least 2^(h-1) nodes, this invariant is maintained by union and find operations. Therefore, if a node has size n, then its height will be at most floor(log2(n)) + 1.
So either one will do. BUT, very soon you will learn about path compression, which makes it difficult to keep track of the height of roots, but the size will still be available. At that point you will be able to use rank, which is kind of like height, or continue to use the size. Again either one will do, but I find the size easier to reason about so I always use that.
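For reference, here is a minimal sketch of union by rank with path compression (by halving); the parent[], rank[], and count fields are assumptions chosen to mirror the style of the solution code above:

// Assumed fields: int[] parent initialized so parent[i] = i,
// int[] rank initialized to all zeros, and int count = number of elements.
private int find(int p) {
    while (parent[p] != p) {
        parent[p] = parent[parent[p]];  // path compression by halving
        p = parent[p];
    }
    return p;
}

public void union(int p, int q) {
    int rootP = find(p);
    int rootQ = find(q);
    if (rootP == rootQ) return;
    // attach the root with smaller rank under the root with larger rank
    if (rank[rootP] < rank[rootQ]) {
        parent[rootP] = rootQ;
    } else if (rank[rootP] > rank[rootQ]) {
        parent[rootQ] = rootP;
    } else {
        parent[rootQ] = rootP;
        rank[rootP]++;  // ranks were equal, so the merged tree's rank grows by one
    }
    count--;
}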

Pointer issue with Binary Search tree when trying to print

I'm writing a method to print a binary search tree in order. I figured out a way to do this, but it requires either deleting or nullifying nodes as they are printed. Below is my code:
public String printKeysInOrder() {
    String output = "";
    if (isEmpty()) return "()";
    else {
        int i = 0;
        while (i != size()) {
            Node x = root;
            int loopBreak = 0;
            while (loopBreak != 1) {
                if (x.left != null) x = x.left;
                else if (x.right != null) {
                    output = output + " " + x.val;
                    x.key = null;
                    x = x.right;
                    i++;
                }
                else {
                    output = output + " " + x.val;
                    x.key = null;
                    loopBreak = 1;
                }
            }
            i++;
        }
    }
    return output;
}
for the tree:
     _7_
    /   \
  _3_    8
 /   \
1     6
 \   /
  2 4
     \
      5
it should print "1 2 3 4 5 6 7 8"
The code works in a way that it favors moving left through the tree until it can no longer go left; it then stores that node's value in the string output, sets the node's key to null (so that future iterations of the loop do not go down that subtree), and moves right if possible, or iterates back around the loop.
However, I'm having trouble making the node count as null: when the code is executed (via a JUnit test), it doesn't recognize the null key and goes through that subtree anyway. Can anyone help me out or tell me how to make it so the x.left and x.right pointers on future iterations recognize the node as null?
You don't need to nullify or delete nodes; you need a traversal algorithm.
The in-order traversal provided here should work without major modification:
http://www.javabeat.net/binary-search-tree-traversal-java/
Another object-oriented approach to this is a Visitor supplied to an in-order traversable, which allows you to supply the action performed at each node, whether it be printing, collecting, mapping, or something else.
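As a rough sketch of that idea, a plain recursive in-order traversal builds the output without nulling anything; it assumes the same root field and Node fields (left, right, val) used in your code:

// Minimal sketch: visit left subtree, then the node, then the right subtree.
public String printKeysInOrder() {
    StringBuilder sb = new StringBuilder();
    inOrder(root, sb);
    return sb.length() == 0 ? "()" : sb.toString().trim();
}

private void inOrder(Node x, StringBuilder sb) {
    if (x == null) return;
    inOrder(x.left, sb);   // everything smaller comes first
    sb.append(x.val).append(' ');
    inOrder(x.right, sb);  // then everything larger
}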

How to quickly insert an element into array with duplicates after all of the equal elements?

I have an ArrayList which contains game objects sorted by their 'Z' (float) position from lower to higher. I'm not sure if ArrayList is the best choice for this, but I have come up with the following solution to find the insertion index in better than linear time (worst case):
GameObject go = new GameObject();
int index = 0;
int start = 0, end = displayList.size(); // displayList is the ArrayList
while (end - start > 0)
{
    index = (start + end) / 2;
    if (go.depthZ >= displayList.get(index).depthZ)
        start = index + 1;
    else if (go.depthZ < displayList.get(index).depthZ)
        end = index - 1;
}
while (index > 0 && go.depthZ < displayList.get(index).depthZ)
    index--;
while (index < displayList.size() && go.depthZ >= displayList.get(index).depthZ)
    index++;
The catch is that the element has to be inserted at a specific place in the chain of elements with an equal value of depthZ: at the end of this chain. That's why I need the two additional while loops after the binary search, which I assume aren't too expensive because the binary search gives me some approximation of this place.
Still, I'm wondering if there's some better solution or some known algorithm for such a problem which I haven't heard of? Maybe using a different data structure than ArrayList? At the moment I ignore the worst-case insertion O(n) (inserting at the beginning or middle) because using a normal List I wouldn't be able to find an index to insert at using the method above.
You should try to use a balanced search tree (a red-black tree, for example) instead of an array. First you can try TreeMap, which uses a red-black tree internally, to see if it satisfies your requirements. Possible implementation:
Map<Float, List<Object>> map = new TreeMap<Float, List<Object>>() {
    @Override
    public List<Object> get(Object key) {
        List<Object> list = super.get(key);
        if (list == null) {
            list = new ArrayList<Object>();
            put((Float) key, list);
        }
        return list;
    }
};
Example of usage:
map.get(0.5f).add("hello");
map.get(0.5f).add("world");
map.get(0.6f).add("!");
System.out.println(map);
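And if you then need everything back sorted by depthZ, a small illustrative sketch (TreeMap iterates its keys in ascending order, so the lists come out grouped from lowest to highest key):

// Iterate in ascending key order; objects within a list keep insertion order.
for (Map.Entry<Float, List<Object>> entry : map.entrySet()) {
    System.out.println(entry.getKey() + " -> " + entry.getValue());
}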
One way to do it would be to do a halving search, where the first search is halfway through your list (list.size()/2), then for the next one you do half of that, and so on. With this exponential method, instead of having to do 4096 searches when you have 4096 objects, you only need 12 searches.
sorry for the complete disregard for technical terms, I am not the best at terms :P
Unless I overlook something, your approach is essentially correct (but there's an error, see below), in the sense that your first while tries to compute the insert index such that the element will be placed after all lower OR EQUAL Z values: there's correctly an equal sign in your first test (updating "start" if it yields TRUE).
Then, of course, there's no need to worry anymore about its position among equals. However, your follow-up whiles destroy this nice situation: the test in the first follow-up while always yields TRUE (one time), so you move back, and then you need the second follow-up while to undo that. So you should remove BOTH follow-up whiles and you're done...
However, there's a little problem with your first while, such that it doesn't always do exactly what is intended. I guess the faulty outcomes triggered you to implement the follow-up whiles to "repair" that.
Here's the issue in your while. Suppose you have a try index (start+end)/2 that points to a larger Z, but the element just before it has an equal Z. You then get into your second test (else if) and set "end" to the position where that equal Z resides, and you can finally wind up with precisely that position.
The remedy is simple: in your else-if branch, assign "end = index" (without the -1). Final remark: the test in the else-if is unnecessary; a plain else is sufficient.
So, all in all you get
GameObject go = new GameObject();
int index = 0;
int start = 0, end = displayList.size(); // displayList is the ArrayList
while (end - start > 0)
{
    index = (start + end) / 2;
    if (go.depthZ >= displayList.get(index).depthZ)
        start = index + 1;
    else
        end = index;
}
(I hope I haven't overlooked something trivial...)
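For completeness, once that loop exits, start and end have converged on the insert position, so the insertion itself would just be (a sketch using the same displayList as above):

// start == end here: the slot just past the last element with an equal depthZ
displayList.add(start, go);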
Add 1 to the least significant byte of the key (with carry); binary search for that insert position; and insert it there.
Your binary search has to be so constructed as to end at the leftmost of a sequence of duplicates, but this is trivial given an understanding of the various Binary search algorithms.

Efficient algorithm to find all the paths from A to Z?

With a set of random inputs like this (20k lines):
A B
U Z
B A
A C
Z A
K Z
A Q
D A
U K
P U
U P
B Y
Y R
Y U
C R
R Q
A D
Q Z
Find all the paths from A to Z.
A - B - Y - R - Q - Z
A - B - Y - U - Z
A - C - R - Q - Z
A - Q - Z
A - B - Y - U - K - Z
A location cannot appear more than once in the path, hence A - B - Y - U - P - U - Z is not valid.
Locations are named AAA to ZZZ (presented here as A - Z for simplicity) and the input is random, in such a way that there may or may not be a location ABC, all locations may be XXX (unlikely), or there may not be a possible path at all because the locations are "isolated".
Initially I'd thought that this is a variation of the unweighted shortest path problem, but I find it rather different and I'm not sure how does the algorithm there apply here.
My current solution goes like this:
1. Pre-process the list such that we have a hashmap which points a location (left) to a list of locations (right).
2. Create a hashmap to keep track of "visited locations". Create a list to store "found paths".
3. Store X (the starting location) in the "visited locations" hashmap.
4. Search for X in the first hashmap (location A will give us (B, C, Q) in O(1) time).
5. For each found location (B, C, Q), check if it is the final destination (Z). If so, store it in the "found paths" list. Otherwise, if it doesn't already exist in the "visited locations" hashmap, recur to step 3 with that location as "X". (actual code below)
With this current solution, it takes forever to map all (not shortest) possible routes from "BKI" to "SIN" for this provided data.
I was wondering if there's a more effective (time-wise) way of doing it. Does anyone know of a better algorithm to find all the paths from an arbitrary position A to an arbitrary position Z ?
Actual Code for current solution:
import java.util.*;
import java.io.*;

public class Test {
    private static HashMap<String, List<String>> left_map_rights;

    public static void main(String args[]) throws Exception {
        left_map_rights = new HashMap<>();
        BufferedReader r = new BufferedReader(new FileReader("routes.text"));
        String line;
        HashMap<String, Void> lines = new HashMap<>();
        while ((line = r.readLine()) != null) {
            if (lines.containsKey(line)) { // ensure no duplicate lines
                continue;
            }
            lines.put(line, null);
            int space_location = line.indexOf(' ');
            String left = line.substring(0, space_location);
            String right = line.substring(space_location + 1);
            if (left.equals(right)) { // rejects entries whereby left = right
                continue;
            }
            List<String> rights = left_map_rights.get(left);
            if (rights == null) {
                rights = new ArrayList<String>();
                left_map_rights.put(left, rights);
            }
            rights.add(right);
        }
        r.close();
        System.out.println("start");
        List<List<String>> routes = GetAllRoutes("BKI", "SIN");
        System.out.println("end");
        for (List<String> route : routes) {
            System.out.println(route);
        }
    }

    public static List<List<String>> GetAllRoutes(String start, String end) {
        List<List<String>> routes = new ArrayList<>();
        List<String> rights = left_map_rights.get(start);
        if (rights != null) {
            for (String right : rights) {
                List<String> route = new ArrayList<>();
                route.add(start);
                route.add(right);
                Chain(routes, route, right, end);
            }
        }
        return routes;
    }

    public static void Chain(List<List<String>> routes, List<String> route, String right_most_currently, String end) {
        if (right_most_currently.equals(end)) {
            routes.add(route);
            return;
        }
        List<String> rights = left_map_rights.get(right_most_currently);
        if (rights != null) {
            for (String right : rights) {
                if (!route.contains(right)) {
                    List<String> new_route = new ArrayList<String>(route);
                    new_route.add(right);
                    Chain(routes, new_route, right, end);
                }
            }
        }
    }
}
As I understand your question, Dijkstra's algorithm cannot be applied as is, since the shortest-path problem by definition finds a single path out of the set of all possible paths. Your task is to find all paths per se.
Many optimizations of Dijkstra's algorithm involve cutting off search trees with higher costs. You won't be able to cut off those parts in your search, as you need all findings.
And I assume you mean all paths excluding circles.
Algorithm:
Pump the network into a 2-dim 26x26 array of boolean/integer, fromTo[i][j]. Set a 1/true for an existing link.
Starting from the first node, trace all following nodes (search links for 1/true). Keep visited nodes in some structure (array/list). Since the maximal depth seems to be 26, this should be possible via recursion.
And as @soulcheck has written below, you may think about cutting off paths you have already visited. You may keep a list of paths towards the destination in each element of the array. Adjust the breaking condition accordingly.
Break when:
- visiting the end node (store the result)
- visiting a node that has been visited before (circle)
- visiting a node for which you have already found all paths to the destination, and merge your current path with all the existing ones from that node
Performance wise I'd vote against using hashmaps and lists and prefer static structures.
Hmm, while re-reading the question, I realized that the names of the nodes cannot be limited to A-Z. You are writing something about 20k lines; with 26 letters, a fully connected A-Z network would require far fewer links. Maybe you can skip recursion and static structures :)
Ok, with valid names from AAA to ZZZ an array would become far too large. So you'd better create a dynamic structure for the network as well. Counter question: regarding performance, what is the best data structure for a less populated array as my algorithm would require? I'd vote for a 2-dim ArrayList. Anyone?
What you're proposing is a scheme for DFS, only with backtracking. It's correct, unless you want to permit cyclic paths (you didn't specify whether you do).
There are two gotchas, though.
You have to keep an eye on nodes you have already visited on the current path (to eliminate cycles).
You have to know how to select the next node when backtracking, so that you don't descend into the same subtree of the graph when you have already visited it on the current path.
The pseudocode is more or less as follows:
getPaths(A, current_path):
    if (A is destination node): return [current_path]
    for B = next-not-visited-neighbor(A):
        if (not B already on current path)
            result = result + getPaths(B, current_path + B)
    return result

list_of_paths = getPaths(A, [A])
which is almost what you said.
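For illustration, a rough Java rendering of that pseudocode, assuming the same left_map_rights adjacency map your code already builds; the method and variable names here are just placeholders:

private static void getPaths(String current, String end,
                             List<String> path, Set<String> onPath,
                             List<List<String>> results) {
    if (current.equals(end)) {
        results.add(new ArrayList<>(path));   // found a complete path, keep a copy
        return;
    }
    List<String> neighbors = left_map_rights.get(current);
    if (neighbors == null) return;
    for (String next : neighbors) {
        if (onPath.contains(next)) continue;  // already on this path: skip to avoid cycles
        path.add(next);
        onPath.add(next);
        getPaths(next, end, path, onPath, results);
        path.remove(path.size() - 1);         // backtrack
        onPath.remove(next);
    }
}

// Illustrative usage:
// List<List<String>> results = new ArrayList<>();
// getPaths("BKI", "SIN",
//          new ArrayList<>(Arrays.asList("BKI")),
//          new HashSet<>(Arrays.asList("BKI")), results);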
Be careful though, as finding all paths in complete graph is pretty time and memory consuming.
edit
For clarification, the algorithm has Ω(n!) time complexity in worst case, as it has to list all paths from one vertex to another in complete graph of size n, and there are at least (n-2)! paths of form <A, permutations of all nodes except A and Z, Z>. No way to make it better if only listing the result would take as much.
Your data is essentially an adjacency list which allows you to construct a tree rooted at the node corresponding to A. In order to obtain all the paths between A & Z, you can run any tree traversal algorithm.
Of course, when you're building the tree you have to ensure that you don't introduce cycles.
I would proceed recursively where I would build a list of all possible paths between all pairs of nodes.
I would start by building, for all pairs (X, Y), the list L_2(X, Y) which is the list of paths of length 2 that go from X to Y; that's trivial to build since that's the input list you are given.
Then I would build the lists L_3(X, Y), recursively, using the known lists L_2(X, Z) and L_2(Z, Y), looping over Z. For example, for (C, Q), you have to try all Z in L_2(C, Z) and L_2(Z, Q) and in this case Z can only be R and you get L_3(C, Q) = {C -> R -> Q}. For other pairs, you might have an empty L_3(X, Y), or there could be many paths of length 3 from X to Y.
However you have to be careful here when building the paths here since some of them must be rejected because they have cycles. If a path has twice the same node, it is rejected.
Then you build L_4(X, Y) for all pairs by combining all paths L_2(X, Z) and L_3(Z, Y) while looping over all possible values for Z. You still remove paths with cycles.
And so on... until you get to L_17576(X, Y).
One worry with this method is that you might run out of memory to store those lists. Note however that after having computed the L_4's, you can get rid of the L_3's, etc. Of course you don't want to delete L_3(A, Z) since those paths are valid paths from A to Z.
Implementation detail: you could put L_3(X, Y) in a 17576 x 17576 array, where the element at (X, Y) is some structure that stores all paths between (X, Y). However, if most elements are empty (no paths), you could instead use a HashMap<Pair, Set<Path>>, where Pair is just some object that stores (X, Y). It's not clear to me if most elements of L_3(X, Y) are empty, and if so, if it is also the case for L_4334(X, Y).
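For what it's worth, here is a rough sketch of one combination step under those assumptions: paths are stored as List<String> of node names, and each level L_k is a map keyed by an "X Y" string. Every name here is illustrative, not taken from the question's code:

// Builds L_k from L_2 (the edges) and L_(k-1), rejecting paths with repeated nodes.
static Map<String, Set<List<String>>> nextLevel(
        Map<String, Set<List<String>>> l2,
        Map<String, Set<List<String>>> lPrev) {
    Map<String, Set<List<String>>> lNext = new HashMap<>();
    for (String edgeKey : l2.keySet()) {
        String[] xz = edgeKey.split(" ");          // an edge X -> Z
        String x = xz[0], z = xz[1];
        for (Map.Entry<String, Set<List<String>>> tails : lPrev.entrySet()) {
            String[] zy = tails.getKey().split(" ");
            if (!zy[0].equals(z)) continue;        // only paths that start at Z
            String y = zy[1];
            for (List<String> tail : tails.getValue()) {
                if (tail.contains(x)) continue;    // would repeat X: reject the cycle
                List<String> path = new ArrayList<>();
                path.add(x);
                path.addAll(tail);                 // tail already begins with Z
                lNext.computeIfAbsent(x + " " + y, k -> new HashSet<>()).add(path);
            }
        }
    }
    return lNext;
}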
Thanks to @Lie Ryan for pointing out this identical question on mathoverflow. My solution is basically the one by MRA; Huang claims it's not valid, but by removing the paths with duplicate nodes, I think my solution is fine.
I guess my solution needs less computations than the brute force approach, however it requires more memory. So much so that I'm not even sure it is possible on a computer with a reasonable amount of memory.
