Debugging BFS tree traversal algorithm - java

I'm working alone on this project and could use another set of eyes to look at this to see what I am doing wrong. The first loop runs infinitely.
public void bfs(String start)
{
    //Initial Case
    add_queue.add(start);
    graph.visit(start);
    Iterator<String> neighbors;
    String neighbor;
    while(!add_queue.empty())
    {
        neighbors = graph.neighbors(start);
        neighbor = neighbors.next();
        graph.visit(neighbor);
        add_queue.add(neighbor);
        while(neighbors.hasNext())
        {
            neighbor = neighbors.next();
            if(!graph.isVisited(neighbor)) //If vertex is not visited it is new and is added to the queue
            {
                add_queue.add(neighbor);
                graph.visit(neighbor);
            }
        }
        start = add_queue.remove();
        remove_queue.add(start); //transfers vertex from add_queue to remove queue so that the order that the vertices were traversed is stored in memory
    }
}

I think you are adding the first vertex of neighbours without checking if it's already visited.. here:
neighbor = neighbors.next(); <- you get first
graph.visit(neighbor); <- you visit
add_queue.add(neighbor); <- you add it without any check
while(neighbors.hasNext())
{
neighbor = neighbors.next();
if(!graph.isVisited(neighbor)) <- you do check for the others
{
add_queue.add(neighbor);
graph.visit(neighbor);
}
}
This means that you will never empty that queue: it starts with a size of 1, and on each iteration you remove one element but add at least one (you never add none).

What's add_queue's definition of empty()?
It could be a bad naming issue, but it sounds like empty() does something, not just checks whether it is empty (which would be probably called isEmpty()).
Also, it looks like you always add at least 1 to add_queue in each outer loop (right before the inner while), but only remove one item from add_queue per iteration.

A few places to investigate:
Check to make sure that graph.isVisited() is actually recognizing when a node has been visited via graph.visit().
Is graph.neighbors(start) truly returning start's neighbors? And not including start in this list?

Your code is a little unclear. What exactly does graph.neighbors return?
In general to do a BFS you want to add the children of the current node to the queue, not the neighbors of it. Since it's all going into a queue this will ensure that you visit each node in the tree in the correct order. Assuming that it's a tree and not a general graph, this will also ensure that you don't visit a node more than once, allowing you to remove the checks to isVisited.
So, get the next node out of the queue, add all of its children to the queue, visit the node, and repeat, until the queue is empty.
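For comparison, here is a minimal sketch of the loop described above. It does not use the asker's graph and queue classes; the adjacency map, the visited set, and the names BfsSketch and bfs are assumptions made for illustration. The point is that every neighbor, including the first one, goes through the same visited check, and that a vertex is dequeued before it is expanded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class BfsSketch {
    // Returns the vertices reachable from start in the order BFS visits them.
    // The adjacency map stands in for graph.neighbors(...) (an assumption).
    static List<String> bfs(Map<String, List<String>> adjacency, String start) {
        List<String> order = new ArrayList<>();   // plays the role of remove_queue
        Set<String> visited = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>(); // plays the role of add_queue

        visited.add(start);
        queue.add(start);

        while (!queue.isEmpty()) {
            String current = queue.remove();      // dequeue first, then expand
            order.add(current);
            for (String neighbor : adjacency.getOrDefault(current, List.of())) {
                // Set.add returns false when the vertex was already seen,
                // so no neighbor is enqueued without a check.
                if (visited.add(neighbor)) {
                    queue.add(neighbor);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = Map.of(
                "A", List.of("B", "C"),
                "B", List.of("A", "D"),
                "C", List.of("A", "D"),
                "D", List.of("B", "C"));
        System.out.println(bfs(g, "A"));
    }
}
```

Because each vertex enters the queue at most once, the queue shrinks to empty and the loop terminates even when the graph has cycles.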


Does a partial traversal of a linked-list count as "one pass" of the list?

I've been going through algorithm challenges on LeetCode and just completed "Remove Nth Node From End of List".
Many of the top answers claimed to have found a "one pass" solution and I've included a Java example below.
Please could someone explain why "while(n-->0) h2=h2.next;" doesn't count as an extra pass of the linked list and, therefore, make this a "two pass" solution?
public ListNode RemoveNthFromEnd(ListNode head, int n) {
    ListNode h1=head, h2=head;
    while(n-->0) h2=h2.next;
    if(h2==null)return head.next; // The head needs to be removed, do it.
    h2=h2.next;
    while(h2!=null){
        h1=h1.next;
        h2=h2.next;
    }
    h1.next=h1.next.next; // the node after h1 needs to be removed
    return head;
}
I've looked in the comments to this and other solutions and couldn't find an answer. Equally, a general Google search didn't yield an explanation.
Thanks in advance.
No, it's not one-pass. One-pass is defined with respect to a sequential I/O mechanism (canonically a tape) and means that each piece of data is read at most once, in order. Analogizing the linked list to the tape here, this algorithm is not one-pass because in general, some node will have its next field read once by h2=h2.next (in either loop) and again by h1=h1.next in the second loop.
The algorithm is not single pass, but not because of the first loop.
The first loop performs a partial pass on n elements.
The second loop performs two simultaneous partial passes on l-n elements (that on h2 being complementary to that in the first loop). In total, 2l-n lookups of next fields.
A single-pass solution can be implemented with the help of a FIFO queue of length n, but this is "hiding" a partial pass.
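A rough sketch of the FIFO-queue variant mentioned above, assuming 1 <= n <= list length. The ListNode class and the name RemoveNthSketch are written here for illustration, not taken from LeetCode. Each next field is read exactly once while walking the list; the queue remembers the last n+1 nodes, which is where the second partial pass is "hidden".

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ListNode {
    int val;
    ListNode next;
    ListNode(int val) { this.val = val; }
}

class RemoveNthSketch {
    // Assumes 1 <= n <= length of the list.
    static ListNode removeNthFromEnd(ListNode head, int n) {
        Deque<ListNode> window = new ArrayDeque<>();
        for (ListNode cur = head; cur != null; cur = cur.next) {
            window.addLast(cur);
            if (window.size() > n + 1) {
                window.removeFirst();   // keep only the last n+1 nodes
            }
        }
        if (window.size() == n) {
            // The list has exactly n nodes, so the head is the one to remove.
            window.removeFirst();
            return window.peekFirst();  // new head, or null for a one-node list
        }
        ListNode before = window.removeFirst(); // node just before the target
        window.removeFirst();                   // the target itself
        before.next = window.peekFirst();       // successor, or null if the target was the tail
        return head;
    }
}
```

The extra memory is O(n) for the queue, which is the price paid for never following a link twice.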

How to construct an unordered binary tree from text file

Dear friends, I am an intermediate Java user and I am stuck on the following problem. I want to construct an unordered binary tree (or a general tree whose nodes have at most two child nodes) from a multi-line text file (say 40 lines). The text file is divided into two halves, say 20:20 lines. Then for each half a specific value (say a hash) is calculated and stored in the root node. So each node contains four elements: two pointers to the two children (left and right) and two hashes of the two halves of the original file. Next, for each half (20 lines) the process is repeated until each leaf holds a single line of text. Let the node have
public class BinaryTree {
private BinaryTreeNode leftNode, rightNode;
private String leftHash,rightHash;
}
I need help writing the tree construction and searching functions. Searching is performed by entering a line. A hash code is then created for this query line and compared against the two hashes saved at each node. If the hash of the query line is close to leftHash, leftNode is accessed; if it is close to rightHash, rightNode is accessed. The process continues until an exact hash is found.
I just need the tree construction and search technique. The hash comparison etc. are not a problem.
You'll need to start by reading the file into a string.
The first character in the string could be used as the root. Root + 1 would be the left, root + 2 would be the right
Consider left node of the root (Root + 1), you could also consider this as Root + N. Meaning that the right node would be Root + N + 1.
You can now recursively solve this problem by establishing which node you are currently on, and setting the left and right nodes respectively.
So let's think about it.
You have the root node, left node, and right node established. At this point you have used 3 letters/numbers (it really doesn't matter if it is unordered). The next step would be to move down one level and start filling the left: you have the root, you need left and right nodes. Then move to the right node, do the left and right node of this, and so on and so forth.
Think about that for a little bit and see where you get.
Cheers,
Mike
EDIT:
To search,
Searching a binary tree is also a recursive theme. (I thought you previously said the tree was unordered, which may change how the tree is laid out if it is supposed to be ordered.)
If it is unordered, you can simply recurse the tree in a manner such that
A.) Check root node
B.) Check left node
C.) Continue checking left nodes until either there is a match, or no more left nodes to check
D.) Recurse back 1, check right node
E.) Check left nodes
F.) Recurse back, check right node
This theme will continue until eventually you have checked ALL left nodes first, and then the right nodes. The KEY to this, is at any point you have a root node, go left first, then right. (I forget what traversal type this is, but there are others if you wish to implement them over this, i personally think this is the easiest to remember).
You will then repeat for right child of Root node.
If at any time you get a match, exit.
Remember this is recursive, so make sure you think your way through this step by step. It is recursive by definition, in that you will always do steps x,y,z for each part of the tree.
To beat a dead horse, let's look at just 3 nodes to start.
(simplified)
First the root,
if(root == (what you're looking for))
{
    return root;
}
else if(root.leftNode == (what you're looking for))
{
    return root.leftNode;
}
else if(root.rightNode == (what you're looking for))
{
    return root.rightNode;
}
else
{
    System.out.println("Value not found");
}
If you have 5 nodes, that would be root would have a left and right, and the root.leftNode would have a left and right... You would repeat the steps above on root.leftNode also, then search root.rightNode
If you have 7 nodes, you would search ALL of root.leftNode and then recurse back to search root.rightNode.
I hope this helps,
pictures work much better in my opinion when talking about traversing trees.
Perhaps look here for a better visual
http://www.newthinktank.com/2013/03/binary-tree-in-java/
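To tie the thread together, here is one possible sketch of the halving construction described in the question, followed by the left-first recursive search described in the answer. Everything named here (HashTreeSketch, Node, build, find, and the placeholder hash built on String.hashCode) is an assumption for illustration; the asker's real hash and "closeness" comparison are not shown anywhere in the thread, so the search below simply checks every leaf.

```java
import java.util.List;

class HashTreeSketch {
    static class Node {
        Node left, right;        // children; both null at a leaf
        int leftHash, rightHash; // hashes of the two halves (inner nodes only)
        String line;             // the single line stored at a leaf, else null
    }

    // Placeholder hash; swap in whatever hash the application really uses.
    static int hash(List<String> lines) {
        return String.join("\n", lines).hashCode();
    }

    // Recursively splits the (non-empty) list in half until one line remains.
    static Node build(List<String> lines) {
        Node node = new Node();
        if (lines.size() == 1) {
            node.line = lines.get(0);
            return node;
        }
        int mid = lines.size() / 2;
        List<String> leftHalf = lines.subList(0, mid);
        List<String> rightHalf = lines.subList(mid, lines.size());
        node.leftHash = hash(leftHalf);
        node.rightHash = hash(rightHalf);
        node.left = build(leftHalf);
        node.right = build(rightHalf);
        return node;
    }

    // Left-first recursive search: check this node, then the whole left
    // subtree, then the whole right subtree. Returns null if nothing matches.
    static Node find(Node node, String query) {
        if (node == null) return null;
        if (query.equals(node.line)) return node;
        Node hit = find(node.left, query);
        return hit != null ? hit : find(node.right, query);
    }
}
```

The lines themselves could come from java.nio.file.Files.readAllLines. If the hash comparison can reliably pick a side, find could compare the query's hash with leftHash and rightHash and descend into only one child instead of both.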

Graph traversal - finding and returning the shortest distance

I am using a Breadth first search in a program that is trying to find and return the shortest path between two nodes on an unweighted digraph.
My program works like the Wikipedia page pseudocode
The algorithm uses a queue data structure to store intermediate results as it traverses the graph, as follows:
Enqueue the root node
Dequeue a node and examine it
If the element sought is found in this node, quit the search and return a result.
Otherwise enqueue any successors (the direct child nodes) that have not yet been discovered.
If the queue is empty, every node on the graph has been examined – quit the search and return "not found".
If the queue is not empty, repeat from Step 2.
So I have been thinking of how to track number of steps made but I am having trouble with the limitations of java (I am not very knowledgeable of how java works). I originally was thinking that I could create some queue made up of a data type I made that stores steps and nodes, and as it traverses the graph it keeps track of the steps. If ever the goal is reached just simply return the steps.
I don't know how to make this work in java so I had to get rid of that idea and I moved on to using that wonky Queue = new LinkedList implementation of a queue. So basically I think it is a normal integer queue, I couldn't get my data type I made to work with it.
So now I have to find a more basic approach so I tried to use a simple counter, this doesn't work because the traversal algorithm searches down many paths before reaching the shortest one so I had an idea. I added a second queue that tracked steps, and I added a couple counters. Any time a node is added to the first queue I add to the counter, meaning I know that I am inspecting new nodes so I am not a distance further away. Once all those have been inspected I can then increase the step counter and any time a node is added to the first queue I add the step value to the step queue. The step queue is managed just like the node queue so that when the goal node is found the corresponding step should be the one to be dequeued out.
This doesn't work though and I was having a lot of problems with it, I am actually not sure why.
I deleted most of my code in panic and frustration but I will start to try and recreate it and post it here if anyone needs me to.
Were any of my ideas close and how can I make them work? I am sure there is a standard and simple way of doing this as well that I am not clever enough to see.
Code would help. What data structure are you using to store the partial or candidate solutions? You say you're using a queue to store nodes to be examined, but really the objects stored in the queue should wrap some structure (e.g. List) that indicates the nodes traversed to get to the node to be examined. So, instead of simple Nodes being stored in the queue, some more complex object would be needed to make available the information necessary to know the complete path taken to that point. A simple node would only have information about itself and its children. But if you're examining node X, you also need to know how you arrived at node X. Just knowing node X isn't enough, and the only way (I know of) to know the path taken to node X is to store the path in the object that represents a "partial solution" or "candidate solution". If this is done, then finding the length of the path is trivial, because it's just the length of this list (or whichever data structure is chosen). Hope I'm making some sense here. If not, post code and I'll take a look.
EDIT
These bits of code help show what I mean (they're by no means complete):
public class Solution {
List<Node> path;
}
Queue<Solution> q;
NOT
Queue<Node> q;
EDIT 2
If all you need is the length of the path, and not the path, per se, then try something like this:
public class Solution {
Node node; // whatever represents a node in your algorithm.
int len; // the length of the path to this node.
}
// Your queue:
LinkedList<Solution> q;
With this, before enqueuing a candidate solution (node), you do something like:
Solution sol = new Solution();
sol.node = childNodeToEnqueue;
sol.len = parentNode.len + 1;
q.add(sol);
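Put together, the idea could look roughly like the sketch below. It assumes vertices are plain ints and that the digraph is given as a map from each vertex to its successors; StepCountingBfs, shortestLength and the discovered set are names made up for this example.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class StepCountingBfs {
    // A queue entry: a node plus the number of steps taken to reach it.
    static class Solution {
        final int node;
        final int len;
        Solution(int node, int len) {
            this.node = node;
            this.len = len;
        }
    }

    // Returns the number of edges on a shortest path, or -1 if dst is unreachable.
    static int shortestLength(Map<Integer, List<Integer>> successors, int src, int dst) {
        Queue<Solution> q = new ArrayDeque<>();
        Set<Integer> discovered = new HashSet<>();
        q.add(new Solution(src, 0));
        discovered.add(src);
        while (!q.isEmpty()) {
            Solution cur = q.remove();
            if (cur.node == dst) {
                return cur.len;
            }
            for (int next : successors.getOrDefault(cur.node, List.of())) {
                if (discovered.add(next)) {           // only enqueue undiscovered nodes
                    q.add(new Solution(next, cur.len + 1));
                }
            }
        }
        return -1;
    }
}
```

Since BFS dequeues nodes in order of increasing distance, the len attached to the first entry that matches dst is the shortest distance, so no separate step queue or counters are needed.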
The easiest solution in order to track distance during a traversal is to add a simple array (or a map if your vertices are not indexed by integers).
Here is pseudo code algorithm:
shortest_path(g, src, dst):
    q = new empty queue
    distances = int array of length order of g
    for i = 0 to order: distances[i] = -1
    distances[src] = 0
    enqueue src in q
    while q is not empty:
        cur = pop next element in q
        if cur is dst: return distances[dst]
        foreach s in successors of cur in g:
            if distances[s] == -1:
                distances[s] = distances[cur] + 1
                enqueue s in q
    return not found
Note: order of a graph is the number of vertices
You don't need special data structures; the queue can just contain the vertices' ids (probably integers). In Java, the LinkedList class implements the Queue interface, so it's a good candidate for your queue. For the distances array, if your vertices are identified by integers an integer array is enough, otherwise you need a kind of map.
You can also separate the vertex tainting (the -1 in my algo) using a separate boolean array or a set, but it's not really necessary and will waste some space.
If you want the path, you can also do that with a simple parent array: for each vertex you store its parent in the traversal, just add parent[s] = cur when you enqueue the successor. Then retrieving the path (in reverse order) is as simple as this:
path = new empty stack
cur = dst
while cur != src:
    push cur in path
    cur = parent[cur]
push src in path
And there you are …
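A possible Java rendering of the two pieces of pseudocode above, combined into one method. It assumes vertices are numbered 0..n-1 and that successors.get(v) lists the vertices reachable from v by one edge; the class and method names are made up for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

class BfsShortestPath {
    // Returns the vertices of a shortest path from src to dst,
    // or an empty list if dst cannot be reached.
    static List<Integer> shortestPath(List<List<Integer>> successors, int src, int dst) {
        int n = successors.size();          // the order of the graph
        int[] distances = new int[n];
        int[] parent = new int[n];
        Arrays.fill(distances, -1);         // -1 marks "not discovered yet"
        Arrays.fill(parent, -1);
        distances[src] = 0;

        Queue<Integer> q = new ArrayDeque<>();
        q.add(src);
        while (!q.isEmpty()) {
            int cur = q.remove();
            if (cur == dst) {
                break;
            }
            for (int s : successors.get(cur)) {
                if (distances[s] == -1) {
                    distances[s] = distances[cur] + 1;
                    parent[s] = cur;        // remember how s was reached
                    q.add(s);
                }
            }
        }
        if (distances[dst] == -1) {
            return Collections.emptyList();
        }
        // Walk the parent links back from dst, then reverse.
        List<Integer> path = new ArrayList<>();
        for (int cur = dst; cur != src; cur = parent[cur]) {
            path.add(cur);
        }
        path.add(src);
        Collections.reverse(path);
        return path;
    }
}
```

The distance in edges is the size of the returned list minus one, so the same method answers both "how far" and "which way".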

how is the linked list actually being changed

i was wondering if anyone could help me with this question. i believe i understand the code and logic for the most part. i can trace code and it makes sense, but one thing i don't get is....how does the LinkedListNode previous actually change the LinkedListNode n that is passed in?
it seems to me that the function loops through n, and if the element is not yet found puts it into a hashtable. but when it is found again, it uses this newly created LinkedListNode previous to skip over the duplicate and link to the following element, which is n.next.
How does that actually disconnect LinkedListNode n? It seems like previous would be the LinkedListNode that has no duplicates, but since nothing gets returned in this function, n must be the one that changes. I guess I'm not seeing how n actually gets changed.
Clear and thorough help would be much appreciated. Thank you = )
public static void deleteDups(LinkedListNode n){
    Hashtable table = new Hashtable();
    LinkedListNode previous = null;
    while(n != null){
        if(table.containsKey(n.data))
            previous.next = n.next;
        else{
            table.put(n.data, true);
            previous = n;
        }
        n = n.next;
    }
}
doesn't the line...
LinkedListNode previous = null;
create a new LinkedListNode?
so this is my logic of how i'm doing it...
lets say the argument, n, gets passed in as
5 -> 6 -> 5 -> 7
when the code first runs, previous is null. it goes into the else statement, and previous is now 5? and then the line n = n.next makes n 6? now the hashtable has 5, and it loops again and goes into the else. prev is now 6 and the hashtable has 6. then n becomes 5. it loops again but this time it goes into the if, and prev is now 7. and n will become 7. i see that prev skipped over 5, but ...how is prev unlinking n? it seems like prev is the LinkedListNode that contains no duplicates
how does the LinkedListNode previous actually change the
LinkedListNode n that is passed in?
Look at the line
n = n.next;
This line causes the passed node n to change - effectively moving it one node forward with each iteration.
it uses this newly created LinkedListNode previous to skip over the duplicate and link to the following element
No, no node is newly created here. The node previous always points to a node in the existing LinkedList of which the passed node n is one of the nodes. ( may be the starting node ). What makes you think it is newly created ?
It looks like you are confused in your understanding of how nodes and references ( and so a LinkedList as a whole ) works in Java. Because all modifications occur on the Node's data , and not the reference itself , ( ugh.. thats not entirely correct ), the original LinkedList passed to the method does get modified indeed after the method returns. You will need to analyse the LinkedList structure and workings in details to understand how this works. I suggest first get clarity about what pass by value and pass by references are in Java.
edit :
Your analysis of the run is correct , however your confusion still remains because you are not conceptually clear on certain things.
Towards the end of your analysis , you ask "..how is prev unlinking n? it seems like prev is the LinkedListNode that contains no duplicates "
This is a mess - first , you need to differentiate between a LinkedListNode and a LinkedList itself. prev and n are two instances of LinkedListNode, not LinkedList itself. In your example , the LinkedList is unnamed ( we dont have a name to refer to it ). This is the original list - there is no other list.
Second , in your illustration , the numbers you show are only one part of the node , called the node data. The other part, that you have missed out, is the next LinkedListNode reference that is implicit in every node. The link that you draw -> is actually the next reference in each node. When you say that prev skips 5 , what actually happens is that the next of node with data 6 is made to point to node with data 7.
At start :
5|next -> 6|next -> 5|next -> 7|next->NULL
After 5 is skipped :
5|next -> 6|next -> 7|next->NULL
As you see , the linkedlist is changed ! It does not matter if it was changed using prev or n, the change remains in the list.
if(table.containsKey(n.data))
previous.next = n.next;
Is the section which does the deletion. It assigns the reference of the prior node's next field to the node after the current node, in effect unlinking the current node.
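A small runnable sketch may make this concrete. The LinkedListNode class below is a guess at the one in the question (an int data field and a next reference), and a HashSet replaces the raw Hashtable; both are assumptions. The caller keeps its own reference to the head, and after the call that same head leads to a shorter chain, because previous and n were only ever references to the caller's nodes.

```java
import java.util.HashSet;
import java.util.Set;

class LinkedListNode {
    int data;
    LinkedListNode next;
    LinkedListNode(int data) { this.data = data; }
}

class DeleteDupsSketch {
    static void deleteDups(LinkedListNode n) {
        Set<Integer> seen = new HashSet<>();
        LinkedListNode previous = null;  // a reference variable; no node is created
        while (n != null) {
            if (!seen.add(n.data)) {
                previous.next = n.next;  // rewires a node that the caller's list shares
            } else {
                previous = n;
            }
            n = n.next;                  // only moves the local reference
        }
    }

    static String render(LinkedListNode head) {
        StringBuilder sb = new StringBuilder();
        for (LinkedListNode cur = head; cur != null; cur = cur.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(cur.data);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LinkedListNode head = new LinkedListNode(5);
        head.next = new LinkedListNode(6);
        head.next.next = new LinkedListNode(5);
        head.next.next.next = new LinkedListNode(7);
        deleteDups(head);
        System.out.println(render(head)); // prints "5 -> 6 -> 7"
    }
}
```

Reassigning n inside the method never affects the caller's head variable; only the assignment to previous.next changes an object that both sides can see.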

extra space for recursive depth-first search to store paths

I am using depth-first search to identify paths in a directed weighted graph, while revisiting nodes that belong to a cycle, and setting cutoff conditions based on total distance traveled, or stops from the source node.
As I understand, with recursion an explicit stack structure is not required for depth first search, so I was wondering if I could further simplify my code below by somehow doing without the explicit stack:
public class DFSonWeightedDirectedGraph {
    private static final String START = "A";
    private static final String END = "E";
    private int pathLength = 0;
    private int stops = 0;
    public static void main(String[] args) {
        //this is a directed weighted graph
        WeightedDirectedGraph graph = new WeightedDirectedGraph();
        graph.addEdge("A", "B", 15);
        graph.addEdge("A", "D", 15);
        graph.addEdge("A", "E", 27);
        //(...) more edges added
        Stack<String> visited = new Stack<String>();
        visited.push(START);
        new DFSonWeightedDirectedGraph().depthFirst(graph, visited);
    }
    private void depthFirst(WeightedDirectedGraph graph, Stack<String> visited) {
        Collection<Map.Entry<String, Integer>> tree_of_children
            = graph.get_tree_of_children(visited.peek());
        for (Map.Entry<String, Integer> child : tree_of_children) {
            if(pathLength + child.getValue()>= 20){
                continue;
            }
            visited.push(child.getKey());
            pathLength += child.getValue();
            stops += 1;
            if (child.getKey().equals(END)) {
                printPath(visited);
            }
            depthFirst(graph, visited);
            visited.pop();
            pathLength -= child.getValue();
            stops -= 1;
        }
    }
    private void printPath(Stack<String> visited) {
        for (String node : visited) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println("[path length: "+pathLength +
            " stops made: " + stops +"]");
    }
}
However, other recursive implementations without an explicit stack structure usually take into account already visited nodes, by coloring them white, gray or black. So, in my case where revisiting is allowed, and the path needs to be recorded, is an explicit stack absolutely required? Thanks for any suggestions of simpler alternatives.
If you have to save the path, you need a data structure for this. Your stack is OK; you could replace it with another data structure, but not get rid of it.
If it would be OK to directly print the path (and not record it), you do not need a stack. Then you can change the method signature to just get the graph and the actual node (and perhaps the actual path length and the "stops").
Just add an extra field to the node structure, that is a "visited" field. This will be fastest. You do have to unmark all nodes afterwards (or before you do the search).
Or, just hash the id of the node in a hashtable. This will be faster to check than a stack. If you don't have an id for the node, it is a good idea to create one, to help with debugging, output, etc.
You do need extra space, but adding a boolean field to each node will require the least space, since it will be 1 bit per node, vs. 1 pointer per node for a stack.
You don't really need a distance cut-off, since you are searching a finite graph and you only visit each node once, so you will visit at most N nodes in an N-node graph. You would need a depth cutoff if you were searching an infinite space, such as when doing a state-space search (an example is a prolog interpreter searching for a proof).
You don't need the visited nodes. Just pass your current child node to the recursive method instead of the visited nodes parameter, and use the return value for carrying the path.
If you can process the path element by element, i.e. rewrite printPath() so that it can be called once per element, just the key type is required as the return type. If you want to receive the whole path, you need a list of key values as the return type.
Actually, you're relatively close to the solution. Just use the call stack of the recursive method calls to represent the path.
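One way this could look, as a sketch only: the recursion takes the current node and the remaining distance budget, and each call returns the paths found below it, prepending its own node on the way back up. The nested-map graph representation and the names PathsByReturnValue and paths are assumptions, and edge weights are assumed to be positive so that the budget guarantees termination even though revisiting is allowed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class PathsByReturnValue {
    // graph.get(u) maps each successor of u to the (positive) edge weight.
    // Returns every path from node to end whose total weight is below budget.
    static List<LinkedList<String>> paths(Map<String, Map<String, Integer>> graph,
                                          String node, String end, int budget) {
        List<LinkedList<String>> found = new ArrayList<>();
        for (Map.Entry<String, Integer> edge
                : graph.getOrDefault(node, Map.of()).entrySet()) {
            int remaining = budget - edge.getValue();
            if (remaining <= 0) {
                continue;               // same cutoff as pathLength + weight >= limit
            }
            String child = edge.getKey();
            if (child.equals(end)) {
                LinkedList<String> path = new LinkedList<>();
                path.add(child);        // a path that stops at end
                found.add(path);
            }
            // Keep exploring past the child, since revisits are allowed.
            found.addAll(paths(graph, child, end, remaining));
        }
        // The call stack carries the path: each frame prepends its own node.
        for (LinkedList<String> path : found) {
            path.addFirst(node);
        }
        return found;
    }
}
```

Calling paths(graph, "A", "E", 20) would correspond to the question's START, END and cutoff. The lists still take memory, so this moves the stack rather than eliminating the storage; it only avoids passing a mutable stack down the recursion.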
Edit: This answer is completely off topic and was posted based on misinterpreting the question.
There are several things wrong with your DFS implementation. Yes, it visits all nodes in a depth-first manner and it does eventually manage to find a path between START and END, but it does not attempt to check for already visited nodes and keeps a stack for no real reason. The only reason you don't fall into infinite recursion on cycles is because you limit the maximum path length, and you will still take a long time on graphs that have multiple distinct paths between all pairs of vertices.
The only thing you are using the stack for is to pass the node to be visited next to the dfs function. You can simply get rid of the stack and pass the node directly.
So, instead of
private void depthFirst(WeightedDirectedGraph graph, Stack<String> visited) {
...
visited.push(child);
...
depthFirst(graph, visited);
You can simply write this as
private void depthFirst(WeightedDirectedGraph graph, String node) {
...
//visited.push(child); <-- No longer needed
...
depthFirst(graph, child);
You are using a data structure (stack) that you have named 'visited' and yet you do not use that to store/mark which nodes have been already visited to avoid revisiting.
You can modify your existing code to have a Set called visited (make it a global/class variable or pass it along recursive calls as you did with your stack) where you keep all nodes already visited and only call depthFirst() on those nodes that are not already in that Set.
That should make your code look something like this
private void depthFirst(WeightedDirectedGraph graph, String node, Set<String> visited) {
visited.add(node); // mark current node as visited
...
//visited.push(child); <-- No longer needed
...
if (!visited.contains(child)){ // don't visit nodes we have worked on already
depthFirst(graph, child, visited);
}
So far my answer has been to try to modify your code to make it work. But it appears to me that you need to get a better grasp of what a DFS actually is and how it really works. Reading up the relevant chapter on any good Algorithm/Graph Theory book would help you greatly. I would recommend CLRS (it has a very nice chapter on simple graph traversals), but any good book should do. A simple and correct recursive DFS can be implemented in a much simpler manner using arrays, without having to resort to stacks or sets.
Edit:
I did not mention how you could retrieve the path after replacing the stack. This can be easily done by using a Map that stores the parent of each node as it is explored. The path (if any is found) can be obtained using a recursive printPath(String node) function, that prints the node passed to it and calls itself again on its parent.
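A compact sketch of that last suggestion, under the answer's own assumption that each node is visited only once (which, as the first edit notes, is not what the question asked for). The adjacency map and the names DfsWithParents, depthFirst and pathTo are made up for the example, and pathTo recurses to the parent before emitting the node so that the path comes out in start-to-end order.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DfsWithParents {
    // Visits every node reachable from node once, recording who discovered whom.
    static void depthFirst(Map<String, List<String>> graph, String node,
                           Set<String> visited, Map<String, String> parent) {
        visited.add(node);                        // mark current node as visited
        for (String child : graph.getOrDefault(node, List.of())) {
            if (!visited.contains(child)) {       // don't revisit nodes
                parent.put(child, node);
                depthFirst(graph, child, visited, parent);
            }
        }
    }

    // Follows parent links back to the start node (which has no parent entry).
    static String pathTo(String node, Map<String, String> parent) {
        String p = parent.get(node);
        return p == null ? node : pathTo(p, parent) + " " + node;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "A", List.of("B", "D"),
                "B", List.of("C"),
                "C", List.of("A", "E"),
                "D", List.of("E"));
        Set<String> visited = new HashSet<>();
        Map<String, String> parent = new HashMap<>();
        depthFirst(graph, "A", visited, parent);
        if (visited.contains("E")) {
            System.out.println(pathTo("E", parent));
        }
    }
}
```

The path recovered this way is the one DFS happened to take, not necessarily the shortest or the only one.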
