Get all possible links between two strings [duplicate]

I am working on an implementation of Dijkstra's Algorithm to retrieve the shortest path between interconnected nodes on a network of routes. I have the implementation working. It returns all the shortest paths to all the nodes when I pass the start node into the algorithm.
My question:
How does one go about retrieving all possible paths from Node A to, say, Node G or even all possible paths from Node A and back to Node A?

Finding all possible paths is a hard problem, since there can be an exponential number of simple paths. Even finding the kth shortest path [or the longest path] is NP-hard.
One possible way to find all paths [or all paths up to a certain length] from s to t is BFS without keeping a visited set; for the weighted version, you might want to use uniform-cost search.
Note also that in every graph that has cycles [i.e. is not a DAG] there may be an infinite number of paths from s to t.
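The BFS idea above can be sketched in Java roughly as follows. This is only an illustration, not the answerer's code: the class name BoundedPathEnumerator, the method pathsUpTo, and the adjacency-list representation (a Map from node name to its successors) are assumptions made for the example. Because no visited set is kept, nodes may repeat, so the maxEdges bound is what guarantees termination on a cyclic graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical helper: BFS over partial paths instead of over nodes.
class BoundedPathEnumerator {
    // Returns every path from s to t that uses at most maxEdges edges,
    // in order of increasing edge count. Nodes may appear more than once.
    static List<List<String>> pathsUpTo(Map<String, List<String>> graph,
                                        String s, String t, int maxEdges) {
        List<List<String>> found = new ArrayList<>();
        Deque<List<String>> queue = new ArrayDeque<>();
        queue.add(List.of(s));
        while (!queue.isEmpty()) {
            List<String> path = queue.poll();
            String last = path.get(path.size() - 1);
            if (last.equals(t)) {
                found.add(path);
            }
            if (path.size() - 1 == maxEdges) {
                continue; // length bound reached, do not extend this path
            }
            for (String next : graph.getOrDefault(last, List.of())) {
                List<String> longer = new ArrayList<>(path);
                longer.add(next);
                queue.add(longer);
            }
        }
        return found;
    }
}
```

The queue holds whole paths, so memory grows with the number of partial paths; for large bounds a depth-first variant may be the more practical choice.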

I've implemented a version where it basically finds all possible paths from one node to the other, but it doesn't count any possible 'cycles' (the graph I'm using is cyclical). So basically, no one node will appear twice within the same path. And if the graph were acyclical, then I suppose you could say it seems to find all the possible paths between the two nodes. It seems to be working just fine, and for my graph size of ~150, it runs almost instantly on my machine, though I'm sure the running time must be something like exponential and so it'll start to get slow quickly as the graph gets bigger.
Here is some Java code that demonstrates what I'd implemented. I'm sure there must be more efficient or elegant ways to do it as well.
Stack<Object> connectionPath = new Stack<>();
List<Stack<Object>> connectionPaths = new ArrayList<>();
// Before the first call, push the start node (the 'node' argument below) onto connectionPath.
void findAllPaths(Object node, Object targetNode) {
    for (Object nextNode : nextNodes(node)) {
        if (nextNode.equals(targetNode)) {
            // Reached the target: store a copy of the current path, including the target itself.
            Stack<Object> temp = new Stack<>();
            temp.addAll(connectionPath);
            temp.add(nextNode);
            connectionPaths.add(temp);
        } else if (!connectionPath.contains(nextNode)) {
            // Only extend the path with nodes that are not already on it (no cycles).
            connectionPath.push(nextNode);
            findAllPaths(nextNode, targetNode);
            connectionPath.pop();
        }
    }
}

I'm gonna give you a (somewhat small) version (although comprehensible, I think) of a scientific proof that you cannot do this under a feasible amount of time.
What I'm gonna prove is that the time complexity to enumerate all simple paths between two selected and distinct nodes (say, s and t) in an arbitrary graph G is not polynomial. Notice that, as we only care about the amount of paths between these nodes, the edge costs are unimportant.
Sure, if the graph has some well-chosen properties, this can be easy. I'm considering the general case though.
Suppose that we have a polynomial algorithm that lists all simple paths between s and t.
If G is connected, the list is nonempty. If G is not and s and t are in different components, it's really easy to list all paths between them, because there are none! If they are in the same component, we can pretend that the whole graph consists only of that component. So let's assume G is indeed connected.
The number of listed paths must then be polynomial, otherwise the algorithm couldn't return them all. If it enumerates all of them, the longest one is in the list, and a simple scan over the list tells me which one it is.
If G has a Hamiltonian path from s to t, that path is a longest simple path and it traverses all vertices of G. So checking whether the longest listed path visits every vertex decides the Hamiltonian Path problem with a polynomial procedure! But this is a well-known NP-hard problem.
We can then conclude that this polynomial algorithm we thought we had is very unlikely to exist, unless P = NP.

The following functions (modified BFS with a recursive path-finding function between two nodes) will do the job for an acyclic graph:
from collections import defaultdict
# modified BFS
def find_all_parents(G, s):
    Q = [s]
    parents = defaultdict(set)
    while len(Q) != 0:
        v = Q.pop(0)
        for w in G.get(v, []):
            parents[w].add(v)
            Q.append(w)
    return parents
# recursive path-finding function (assumes that there exists a path in G from a to b)
def find_all_paths(parents, a, b):
    return [a] if a == b else [y + b for x in list(parents[b]) for y in find_all_paths(parents, a, x)]
For example, with the following graph (DAG) G given by
G = {'A':['B','C'], 'B':['D'], 'C':['D', 'F'], 'D':['E', 'F'], 'E':['F']}
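if we want to find all paths between the nodes 'A' and 'F' (using the above-defined functions as find_all_paths(find_all_parents(G, 'A'), 'A', 'F')), it will return the following five paths, each as a string (the order may vary, because the parents are stored in sets): 'ABDF', 'ACDF', 'ABDEF', 'ACDEF' and 'ACF'.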

All paths from s to t can be found and printed with a modification of DFS. Dynamic programming can also be used to find the count of all possible paths in a DAG. The pseudocode for the count looks like this:
AllPaths(G(V,E),s,t)
    C[0...n-1] //array of integers; C[i] stores the path count from 's' to the vertex at index i
    TopologicallySort(G(V,E)) //here suppose 's' is at index i0 and 't' is at index i1
    for i<-0 to n-1
        if i<i0
            C[i]<-0 //there is no path from 's' to a vertex ordered to its left after the topological sort
        else if i==i0
            C[i]<-1
        else
            C[i]<-0
            for each j with an edge (j,i) in E
                C[i]<- C[i]+C[j]
    return C[i1]
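The counting idea in the pseudocode can be sketched in Java as below. This is a sketch under assumptions rather than the answerer's implementation: the class name DagPathCounter, the method countPaths, and the Map-based adjacency list are made up for the example, the topological order is produced with Kahn's algorithm, and the input is assumed to be a DAG (on a cyclic graph the nodes on a cycle are simply never processed).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts s-to-t paths in a DAG without listing them.
class DagPathCounter {
    static long countPaths(Map<String, List<String>> dag, String s, String t) {
        // In-degree of every node that appears as a key or as a successor.
        Map<String, Integer> inDegree = new HashMap<>();
        for (Map.Entry<String, List<String>> e : dag.entrySet()) {
            inDegree.putIfAbsent(e.getKey(), 0);
            for (String w : e.getValue()) {
                inDegree.merge(w, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        Map<String, Long> count = new HashMap<>();
        count.put(s, 1L); // exactly one (empty) path from s to itself
        // Process nodes in topological order, pushing counts forward along edges.
        while (!ready.isEmpty()) {
            String v = ready.poll();
            long c = count.getOrDefault(v, 0L);
            for (String w : dag.getOrDefault(v, List.of())) {
                count.merge(w, c, Long::sum);
                if (inDegree.merge(w, -1, Integer::sum) == 0) {
                    ready.add(w);
                }
            }
        }
        return count.getOrDefault(t, 0L);
    }
}
```

Counting runs in time linear in the number of nodes and edges, even when the number of paths itself is exponential, which is why it is so much cheaper than enumerating the paths.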

If you actually care about ordering your paths from shortest to longest, it is far better to use a modified A* or Dijkstra's algorithm. With a slight modification, the algorithm will return as many of the possible paths as you want, shortest first. So if what you really want is all possible paths ordered from shortest to longest, this is the way to go.
If you want an A*-based implementation capable of returning all paths ordered from the shortest to the longest, the following will accomplish that. It has several advantages. First, it is efficient at sorting from shortest to longest. It also computes each additional path only when needed, so if you stop early because you don't need every single path, you save some processing time. It reuses data from earlier paths each time it calculates the next path. Finally, if you find some desired path you can abort early, saving some computation time. Overall, this should be an efficient choice if you care about sorting by path length.
import java.util.*;
public class AstarSearch {
private final Map<Integer, Set<Neighbor>> adjacency;
private final int destination;
private final NavigableSet<Step> pending = new TreeSet<>();
public AstarSearch(Map<Integer, Set<Neighbor>> adjacency, int source, int destination) {
this.adjacency = adjacency;
this.destination = destination;
this.pending.add(new Step(source, null, 0));
}
public List<Integer> nextShortestPath() {
Step current = this.pending.pollFirst();
while( current != null) {
if( current.getId() == this.destination )
return current.generatePath();
for (Neighbor neighbor : this.adjacency.get(current.id)) {
if(!current.seen(neighbor.getId())) {
final Step nextStep = new Step(neighbor.getId(), current, current.cost + neighbor.cost + predictCost(neighbor.id, this.destination));
this.pending.add(nextStep);
}
}
current = this.pending.pollFirst();
}
return null;
}
protected int predictCost(int source, int destination) {
return 0; //Behaves identical to Dijkstra's algorithm, override to make it A*
}
private static class Step implements Comparable<Step> {
final int id;
final Step parent;
final int cost;
public Step(int id, Step parent, int cost) {
this.id = id;
this.parent = parent;
this.cost = cost;
}
public int getId() {
return id;
}
public Step getParent() {
return parent;
}
public int getCost() {
return cost;
}
public boolean seen(int node) {
if(this.id == node)
return true;
else if(parent == null)
return false;
else
return this.parent.seen(node);
}
public List<Integer> generatePath() {
final List<Integer> path;
if(this.parent != null)
path = this.parent.generatePath();
else
path = new ArrayList<>();
path.add(this.id);
return path;
}
@Override
public int compareTo(Step step) {
if(step == null)
return 1;
if( this.cost != step.cost)
return Integer.compare(this.cost, step.cost);
if( this.id != step.id )
return Integer.compare(this.id, step.id);
if( this.parent != null )
return this.parent.compareTo(step.parent);
if(step.parent == null)
return 0;
return -1;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Step step = (Step) o;
return id == step.id &&
cost == step.cost &&
Objects.equals(parent, step.parent);
}
@Override
public int hashCode() {
return Objects.hash(id, parent, cost);
}
}
/*******************************************************
* Everything below here just sets up your adjacency *
* It will just be helpful for you to be able to test *
* It isnt part of the actual A* search algorithm *
********************************************************/
private static class Neighbor {
final int id;
final int cost;
public Neighbor(int id, int cost) {
this.id = id;
this.cost = cost;
}
public int getId() {
return id;
}
public int getCost() {
return cost;
}
}
public static void main(String[] args) {
final Map<Integer, Set<Neighbor>> adjacency = createAdjacency();
final AstarSearch search = new AstarSearch(adjacency, 1, 4);
System.out.println("printing all paths from shortest to longest...");
List<Integer> path = search.nextShortestPath();
while(path != null) {
System.out.println(path);
path = search.nextShortestPath();
}
}
private static Map<Integer, Set<Neighbor>> createAdjacency() {
final Map<Integer, Set<Neighbor>> adjacency = new HashMap<>();
//This sets up the adjacencies. In this case all adjacencies have a cost of 1, but they dont need to.
addAdjacency(adjacency, 1,2,1,5,1); //{1 | 2,5}
addAdjacency(adjacency, 2,1,1,3,1,4,1,5,1); //{2 | 1,3,4,5}
addAdjacency(adjacency, 3,2,1,5,1); //{3 | 2,5}
addAdjacency(adjacency, 4,2,1); //{4 | 2}
addAdjacency(adjacency, 5,1,1,2,1,3,1); //{5 | 1,2,3}
return Collections.unmodifiableMap(adjacency);
}
private static void addAdjacency(Map<Integer, Set<Neighbor>> adjacency, int source, Integer... dests) {
if( dests.length % 2 != 0)
throw new IllegalArgumentException("dests must have an equal number of arguments, each pair is the id and cost for that traversal");
final Set<Neighbor> destinations = new HashSet<>();
for(int i = 0; i < dests.length; i+=2)
destinations.add(new Neighbor(dests[i], dests[i+1]));
adjacency.put(source, Collections.unmodifiableSet(destinations));
}
}
The output from the above code is the following:
[1, 2, 4]
[1, 5, 2, 4]
[1, 5, 3, 2, 4]
Notice that each time you call nextShortestPath() it generates the next shortest path for you on demand. It only calculates the extra steps needed and doesn't traverse any old paths twice. Moreover, if you decide you don't need all the paths and end execution early, you've saved yourself considerable computation time. You only compute up to the number of paths you need and no more.
Finally, it should be noted that the A* and Dijkstra algorithms do have some minor limitations, though I don't think they would affect you. Namely, they will not work correctly on a graph that has negative weights.
Here is a link to JDoodle where you can run the code yourself in the browser and see it working. You can also change around the graph to show it works on other graphs as well: http://jdoodle.com/a/ukx

find_paths[s, t, d, k]
This question is now a bit old... but I'll throw my hat into the ring.
I personally find an algorithm of the form find_paths[s, t, d, k] useful, where:
s is the starting node
t is the target node
d is the maximum depth to search
k is the number of paths to find
Using your programming language's form of infinity for d and k will give you all paths§.
§ obviously if you are using a directed graph and you want all undirected paths between s and t you will have to run this both ways:
find_paths[s, t, d, k] <join> find_paths[t, s, d, k]
Helper Function
I personally like recursion, although it can be difficult sometimes. Anyway, first let's define our helper function:
def find_paths_recursion(graph, current, goal, current_depth, max_depth, num_paths, current_path, paths_found):
    if current_depth > max_depth:
        return
    current_path.append(current)
    if current == goal:
        if len(paths_found) < num_paths:
            paths_found.append(list(current_path))  # store a copy of the path
    else:
        for successor in graph[current]:
            find_paths_recursion(graph, successor, goal, current_depth + 1, max_depth, num_paths, current_path, paths_found)
    current_path.pop()  # backtrack
Main Function
With that out of the way, the core function is trivial:
def find_paths[s, t, d, k]:
    paths_found = [] # PASSING THIS BY REFERENCE
    find_paths_recursion(graph, s, t, 0, d, k, [], paths_found)
    return paths_found
First, let's notice a few things:
the above pseudo-code is a mash-up of languages - but most strongly resembling python (since I was just coding in it). A strict copy-paste will not work.
[] is an uninitialized list, replace this with the equivalent for your programming language of choice
paths_found is passed by reference. It is clear that the recursion function doesn't return anything. Handle this appropriately.
here graph is assuming some form of hashed structure. There are a plethora of ways to implement a graph. Either way, graph[vertex] gets you a list of adjacent vertices in a directed graph - adjust accordingly.
this assumes you have pre-processed to remove "buckles" (self-loops), cycles and multi-edges
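For readers working in Java, the find_paths[s, t, d, k] idea could look roughly like the sketch below. The class name LimitedPathFinder and the Map-based adjacency list are assumptions made for this example. It also deviates from the pseudo-code in one respect, which is my own choice: instead of requiring cycles to be removed beforehand, it skips any node that is already on the current path, so it can run on cyclic graphs and only returns simple paths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: depth-limited DFS that stops after maxPaths results.
class LimitedPathFinder {
    static List<List<String>> findPaths(Map<String, List<String>> graph,
                                        String s, String t, int maxDepth, int maxPaths) {
        List<List<String>> found = new ArrayList<>();
        search(graph, s, t, maxDepth, maxPaths, new ArrayList<>(), found);
        return found;
    }

    private static void search(Map<String, List<String>> graph, String current, String goal,
                               int depthLeft, int maxPaths,
                               List<String> path, List<List<String>> found) {
        if (found.size() >= maxPaths || path.contains(current)) {
            return; // enough paths collected, or node already on the current path
        }
        path.add(current);
        if (current.equals(goal)) {
            found.add(new ArrayList<>(path)); // store a copy, the path list is reused
        } else if (depthLeft > 0) {
            for (String next : graph.getOrDefault(current, List.of())) {
                search(graph, next, goal, depthLeft - 1, maxPaths, path, found);
            }
        }
        path.remove(path.size() - 1); // backtrack
    }
}
```

Passing Integer.MAX_VALUE for both maxDepth and maxPaths plays the role of "infinity" for d and k.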

You usually don't want to, because there is an exponential number of them in nontrivial graphs; if you really want to get all (simple) paths, or all (simple) cycles, you just find one (by walking the graph), then backtrack to another.

I think what you want is some form of the Ford–Fulkerson algorithm; its Edmonds–Karp variant is based on BFS. It's used to calculate the max flow of a network by repeatedly finding augmenting paths between two nodes.
http://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm

There's a nice article which may answer your question /only it prints the paths instead of collecting them/.
Please note that you can experiment with the C++/Python samples in the online IDE.
http://www.geeksforgeeks.org/find-paths-given-source-destination/

I suppose you want to find 'simple' paths (a path is simple if no node appears in it more than once, except maybe the 1st and the last one).
Since the problem is NP-hard, you might want to do a variant of depth-first search.
Basically, generate all possible paths from A and check whether they end up in G.

Related

Java Nodes how to store nodes as pairs of data

I want to make a program where you can create nodes and connect them, with a graphical interface so you can see the graphs. My problem is that I need to store the edges between nodes (vertices). Some people at my university told me to use maps.
I've tried to store the nodes with a map structure and making my own pair class:
import java.util.Stack;
public class Node extends Main {
public String data;
int ID;
Boolean IsConnected = false;
public Node(String data, int ID, int Connection ) {
this.data = data;
this.ID = ID;
}
public void Connect(Node N1, Node N2) {
}
public boolean IsConnected(Node N1, Node N2) {
if (map.containsValue(N1) && map.containsKey(N2)) {
System.out.println("connected");
return true;
}
System.out.println("not connected");
return false;
}
}

This addresses one way to logically represent edges of a graph. It does not address how to represent a graph visually.
I'm going to assume the number of nodes needed will be known early. If that is not the case, this answer will need to be changed.
A common way to represent edges of a graph is to use an adjacency matrix. Given n Nodes, the simplest adjacency matrix for a directed graph is a 2D array of boolean:
boolean [][] adjacent = new boolean [n][n];
This requires each Node to be associated with an index. One way to do that is to use an array:
Node [] myNodes = new Node [n];
Then, finding out if there is an edge between two nodes is simple:
public boolean areConnected (int a, int b) {
return adjacent [a][b];
}
And, they are easy to connect:
public void Connect (int from, int to) {
adjacent [from][to] = true;
}
If you need to search to find the index of a Node, you can add this code:
public int indexOf (Node node) {
    for (int i = 0; i < myNodes.length; ++i) {
        if (node.equals (myNodes [i])) {
            return i;
        }
    }
    return -1; // not found
}
public boolean areConnected (Node a, Node b) {
int locA = indexOf (a);
int locB = indexOf (b);
if (locA < 0 || locB < 0) {
return false;
}
return areConnected (locA, locB);
}
NOTE: Please seriously consider overriding equals (Object other) and hashCode() methods. Whether you use an adjacency matrix as described here, or a Collection, the default .equals (Object other) will be equivalent to this == other. That is, it will test if they refer to the same Object. If you want to know if the contents of Node Objects are the same, you need to override equals. If you override equals, you should also override hashCode().
Note: You will want to decide how to guard against the possibility that an invalid index or node will be passed to areConnected or Connect. You might, for example, want to throw an exception.
Note: This assumes a directed graph.
Comment: The beginning of this answer has this statement: "I'm going to assume the number of nodes needed will be known early." "Early" means "before the matrix is constructed". One way to represent an adjacency matrix in which the number of nodes is flexible is to use a List<List<Boolean>>. That can represent a 2D List.
An adjacency matrix is not restricted to boolean. For example, if each Node represented a city, the adjacency matrix might be an int [][] , where each entry represents the number of highway miles between cities. A double might represent the geographic distance. A BigDecimal [][] could represent the price of airfare between the cities. It could be an array of Objects of a class you create.
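To illustrate the List<List<Boolean>> idea for a flexible number of nodes, here is a minimal sketch. The class name GrowableAdjacencyMatrix and the addNode method are assumptions made for this example; the connect and areConnected names mirror the methods above. Like the array version, it represents a directed graph and does no index validation beyond the IndexOutOfBoundsException that List already throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: an adjacency matrix that grows one node at a time.
class GrowableAdjacencyMatrix {
    private final List<List<Boolean>> adjacent = new ArrayList<>();

    // Adds a node with no edges and returns its index.
    int addNode() {
        for (List<Boolean> row : adjacent) {
            row.add(false); // new column in every existing row
        }
        List<Boolean> newRow = new ArrayList<>();
        for (int i = 0; i <= adjacent.size(); i++) {
            newRow.add(false); // one entry per node, including the new one
        }
        adjacent.add(newRow);
        return adjacent.size() - 1;
    }

    void connect(int from, int to) {
        adjacent.get(from).set(to, true);
    }

    boolean areConnected(int from, int to) {
        return adjacent.get(from).get(to);
    }
}
```

Adding a node costs time proportional to the current number of nodes, so for large or sparse graphs a map from each node to its set of neighbours may be the better fit.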

How to implement Dijkstra's Algorithm to find the shortest path in Java

I am absolutely confused on what to do. I'm trying to code off of the pseudo code that wikipedia has on Dijkstra's with priority queues, but I'm having a tough time making the adjustments to fit what i need to find. This is my (incomplete) code so far, and any help would be very much appreciated.
public int doDijkstras (String startVertexName, String endVertexName, ArrayList< String > shortestPath) {
PriorityQueue<QEntry> q = new PriorityQueue<QEntry>();
int cost = 0;
int newCost;
QEntry pred = null;
for (String s : this.getVertices()) {
if (!s.equals(startVertexName)) {
cost = Integer.MAX_VALUE;
pred = null;
}
q.add(new QEntry(s, cost, pred, adjacencyMap.get(s)));
}
while (!q.isEmpty()) {
QEntry curr = getMin(q);
for (String s : curr.adj.keySet()) {
newCost = curr.cost + this.getCost(curr.name, s);
QEntry v = this.getVert(q, s);
if (newCost < v.cost) {
v.cost = newCost;
v.pred = curr;
if (!q.contains(curr)) {
q.add(curr);
}
}
}
}
}
private QEntry getMin(PriorityQueue<QEntry> q) {
QEntry min = q.peek();
for (QEntry temp : q) {
if (min.cost > temp.cost) {
min = temp;
}
}
return min;
}
private QEntry getVert(PriorityQueue<QEntry> q, String s) {
for (QEntry temp : q) {
if (temp.name.equals(s)) {
return temp;
}
}
return null;
}
class QEntry {
String name;
int cost;
QEntry pred;
TreeMap<String, Integer> adj;
public QEntry(String name, int cost, QEntry pred, TreeMap<String, Integer> adj) {
this.name = name;
this.cost = cost;
this.adj = adj;
this.pred = pred;
}
}

You are overlooking an important part of the algorithm: when to stop.
The pseudocode on Wikipedia is for the variation on Dijkstra's algorithm that computes the shortest path from the start node to every node connected to it. Commentary immediately following the big pseudocode block explains how to modify the algorithm to find only the path to a specific target, and after that is a shorter block explaining how to extract paths.
In English, though, as you're processing your priority queue, you need to watch for the target element being the one selected. When (if ever) it is, you know that no shorter path to it can be discovered than the one having the cost recorded in the target's queue entry, and represented (in reverse order) by that entry and its chain of predecessors. You fill the path list by walking the chain of predecessors, and you return the value that was recorded in the target queue entry.
Note, however, that in your code, in the event that the start and target vertexes are not connected in the graph (including if the target is not in the graph at all), you will eventually drain the queue and fall out the bottom of the while loop without ever reaching the target. You have to decide what to do with the path list and what to return in that case.
Note, too, that your code appears to have several errors, among them:
In the event that the start vertex name is not the first one in the iteration order of this.getVertices(), its queue entry will not be initialized with cost 0, and will not likely be the first element chosen from the queue.
If the specified start vertex is not in the graph at all then your code will run, and may emit a path, but its output in that case is bogus.
Your queue elements (type QEntry) do not have a natural order; to create a PriorityQueue whose elements have such a type, you must provide a Comparator that defines their relative priorities.
You are using your priority queue as a plain list. That in itself will not make your code produce wrong results, but it does increase its asymptotic complexity.
Be aware, however, that if you use the standard PriorityQueue as a priority queue, then you must never modify an enqueued object in a way that could change its order relative to any other enqueued object; instead, remove it from the queue first, modify it, then enqueue it again.
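The points above can be put together in a sketch like the following. It is not a drop-in replacement for the code in the question: the class name DijkstraSketch, the nested-map graph representation, and the choice to return -1 for an unreachable target are all assumptions made for this example. Instead of modifying entries that are already in the queue, it adds a fresh entry whenever a cheaper cost is found and skips stale entries when they are polled.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical helper: Dijkstra that stops as soon as the target is selected.
class DijkstraSketch {
    private static final class QEntry {
        final String name;
        final int cost;
        QEntry(String name, int cost) { this.name = name; this.cost = cost; }
    }

    // Fills pathOut with the cheapest path and returns its cost, or -1 if end is unreachable.
    static int shortestPath(Map<String, Map<String, Integer>> graph,
                            String start, String end, List<String> pathOut) {
        Map<String, Integer> best = new HashMap<>();
        Map<String, String> pred = new HashMap<>();
        Set<String> done = new HashSet<>();
        PriorityQueue<QEntry> queue =
                new PriorityQueue<>(Comparator.comparingInt((QEntry e) -> e.cost));
        best.put(start, 0);
        queue.add(new QEntry(start, 0));
        while (!queue.isEmpty()) {
            QEntry curr = queue.poll();
            if (!done.add(curr.name)) {
                continue; // stale entry, this vertex was already finalized
            }
            if (curr.name.equals(end)) {
                // Walk the chain of predecessors back to the start, then reverse it.
                for (String v = end; v != null; v = pred.get(v)) {
                    pathOut.add(v);
                }
                Collections.reverse(pathOut);
                return curr.cost;
            }
            Map<String, Integer> edges = graph.getOrDefault(curr.name, Map.of());
            for (Map.Entry<String, Integer> edge : edges.entrySet()) {
                int newCost = curr.cost + edge.getValue();
                if (newCost < best.getOrDefault(edge.getKey(), Integer.MAX_VALUE)) {
                    best.put(edge.getKey(), newCost);
                    pred.put(edge.getKey(), curr.name);
                    queue.add(new QEntry(edge.getKey(), newCost)); // re-insert, never mutate
                }
            }
        }
        return -1; // queue drained without reaching the target
    }
}
```

Whether to signal an unreachable target with -1, an exception, or an Optional is a design decision left to the caller's needs.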

Heuristic for A*-Algorithm with irregular distances between nodes

I am currently working on an implementation of the A* Algorithm with irregular distances between two nodes. The graph containing the nodes is a directed and weighted graph. Every node is connected to at least one other node, there may also be symmetrical connections with different distances. A node is nothing more than a label and doesn't contain any special information
What I need is a heuristic to determine the shortest path from any node A to another node B as accurate as possible. I tried to use a heuristic that returns the distance to the nearest neighbor of a node, but of course that wasn't as effective as no heuristic at all (= Dijkstra).
My implementation of the A* Algorithm consists mainly of 2 classes, the class for the algorithm itself (AStar) and one for the nodes (Node). The code is heavily based on the Wikipedia pseudocode.
Source code of AStar.java
public class AStar {
private AStar() {}
private static Node[] reconstructPath(Map<Node, Node> paths, Node current) {
List<Node> path = new ArrayList<Node>();
path.add(0, current);
while (paths.containsKey(current)) {
current = paths.get(current);
path.add(0, current);
}
return path.toArray(new Node[0]);
}
public static Node[] calculate(Node start, Node target, IHeuristic heuristic) {
List<Node> closed = new ArrayList<Node>();
PriorityQueue<Node> open = new PriorityQueue<Node>();
Map<Node, Double> g_score = new HashMap<Node, Double>();
Map<Node, Double> f_score = new HashMap<Node, Double>();
Map<Node, Node> paths = new HashMap<Node, Node>();
g_score.put(start, 0d);
f_score.put(start, g_score.get(start) + heuristic.estimateDistance(start, target));
open.set(start, f_score.get(start));
while (!open.isEmpty()) {
Node current = null;
// find the node with lowest f_score value
double min_f_score = Double.POSITIVE_INFINITY;
for (Entry<Node, Double> entry : f_score.entrySet()) {
if (!closed.contains(entry.getKey()) && entry.getValue() < min_f_score) {
min_f_score = entry.getValue();
current = entry.getKey();
}
}
if (current.equals(target)) return reconstructPath(paths, target);
open.remove(current);
closed.add(current);
for (Node neighbor : current.getAdjacentNodes()) {
if (closed.contains(neighbor)) {
continue;
}
double tentative_g_score = g_score.get(current) + current.getDistance(neighbor);
if (!open.contains(neighbor) || tentative_g_score < g_score.get(neighbor)) {
paths.put(neighbor, current);
g_score.put(neighbor, tentative_g_score);
f_score.put(neighbor, g_score.get(neighbor) + heuristic.estimateDistance(neighbor, target));
if (!open.contains(neighbor)) {
open.set(neighbor, f_score.get(neighbor));
}
}
}
}
throw new RuntimeException("no path between " + start + " and " + target);
}
}
Source code of Node.java
public class Node {
private Map<Node, Double> distances = new HashMap<Node, Double>();
public final String name;
public Node(String name) {
this.name = name;
}
public Set<Node> getAdjacentNodes() {
return Collections.unmodifiableSet(distances.keySet());
}
public double getDistance(Node node) {
return distances.get(node);
}
public void setDistance(Node node, double distance) {
distances.put(node, distance);
}
@Override
public String toString() {
return (name == null ? "Node#" + Integer.toHexString(hashCode()) : name);
}
}
Source code of PriorityQueue.java
public class PriorityQueue<T> {
transient ArrayList<PriorityEntry<T>> elements = null;
private static final int DEFAULT_SIZE = 10;
public PriorityQueue() {
elements = new ArrayList<PriorityEntry<T>>(DEFAULT_SIZE);
}
public PriorityQueue(int initialCapacity) {
elements = new ArrayList<PriorityEntry<T>>(initialCapacity);
}
public boolean push(T element, double priority) {
PriorityEntry<T> entry = new PriorityEntry<T>(element, priority);
if (elements.contains(entry)) return false;
elements.add(entry);
elements.sort(null);
return true;
}
public void set(T element, double priority) {
PriorityEntry<T> entry = new PriorityEntry<T>(element, priority);
int index = elements.indexOf(entry);
if (index >= 0) {
elements.get(index).setPriority(priority);
} else {
elements.add(entry);
}
elements.sort(null);
}
public T peek() {
return size() <= 0 ? null : elements.get(0).getValue();
}
public T pop() {
return size() <= 0 ? null : elements.remove(0).getValue();
}
public boolean remove(T element) {
return elements.remove(new PriorityEntry<T>(element, 0));
}
public int size() {
return elements.size();
}
public boolean isEmpty() {
return elements.isEmpty();
}
public boolean contains(T element) {
return elements.contains(new PriorityEntry<T>(element, 0));
}
private class PriorityEntry<E> implements Comparable<PriorityEntry<? extends T>> {
private final E value;
private double priority = Double.MIN_VALUE;
public PriorityEntry(E value, double priority) {
this.value = value;
this.priority = priority;
}
public E getValue() {
return value;
}
public double getPriority() {
return priority;
}
public void setPriority(double priority) {
this.priority = priority;
}
@Override
@SuppressWarnings("unchecked")
public boolean equals(Object o) {
if (!(o instanceof PriorityEntry)) return false;
PriorityEntry<?> entry = (PriorityEntry<?>) o;
return value.equals(entry.value);
}
@Override
public int compareTo(PriorityEntry<? extends T> entry) {
return Double.compare(getPriority(), entry.getPriority());
}
}
}

Before trying to define a heuristic function for your problem, consider that in many cases a poor (or incorrect) estimate of the cost to the goal is as self-defeating as not using a heuristic at all.
In the case of a graph with weighted arcs, you need to consider whether the nodes carry some information from which heuristic values can be derived (for example, if your nodes are cities, a good estimate can be the length of the straight line between them; if your nodes are arrays, a similarity measure between them). If your nodes are only labels and there is no information you can use to obtain heuristic values, maybe the best solution is not to use a heuristic at all. This is not the worst scenario for most problems of this type. It is better to use a Dijkstra search (which is the same as A* with heuristic = 0), letting the algorithm expand nodes based on the cost from the start, than to use a bad, inconsistent heuristic, because in that situation you might expand unnecessary nodes that only seem promising because of a bad estimate of the cost to the goal.
I don't know how big your graphs are, but for most problems there is no significant difference in computation time between using a heuristic and not using one at all, especially in the case of a bad heuristic.
I can see that you have your own implementation of A*. I recommend you take a look at a heuristic search library like Hipster. This library allows you to define your graph and test different search algorithms to find the one that best fits your problem. Some code examples describe exactly your case: search in weighted directed graphs. It might be useful for your problem.
I hope my answer helps. Regards,

Without going into other possible problems I would like to point out the main problem - you lack the plane. In case of shortest distance problem between cities, you have
node - city
weight - numerical value describing cost to get from city a to city b
plane - describes environment ex: city position (in your square grid)
From the plane you can extrapolate a meaningful heuristic. For example, you can make an assumption based on city positions, such as: the city with the lowest straight-line distance should be examined first.
If you do not have a plane you do not have any means to predict a meaningful heuristic. A* will still work, but it hardly differs from exhaustive search. You can create plane from the weight, but it .
Ex:
Iterate over weights and find the weight quantiles 20/60/20 - now you have a relative plane
- good weight threshold (lowest 20%)
- average weight threshold (middle 60%)
- bad weight threshold (highest 20%)
With priority queue your algorithm would pick good moves, then average and finally bad ones.
If you want, you can have more than 3 segments.
Just as a reminder: A* returns good enough result fast. Using exhaustive search you can find the best solution, but it would become exponentially slower if problem size grows.

To add to @kiheru's comment above: your solution will only be as good as the heuristic provided.
If the heuristic used in the following line (heuristic.estimateDistance) has too narrow a scope, the algorithm will quickly reach a local minimum. Alternatively, if the heuristic isn't admissible, the algorithm may return a path that is not the shortest one.
f_score.put(start, g_score.get(start) + heuristic.estimateDistance(start, target));
Take a close look at your heuristic and confirm it's admissible. If it is admissible, it may need to be improved in order to provide a more accurate estimate.
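On a small test graph, admissibility can be checked by brute force: compute the true shortest distances and confirm the heuristic never exceeds them. The sketch below does this with Floyd–Warshall. The class name AdmissibilityCheck and the matrix representation (weights[i][j] is the edge cost from i to j, Double.POSITIVE_INFINITY for no edge, and h[i][j] is the heuristic estimate from i to j) are assumptions made for the example, and it assumes non-negative weights.

```java
// Hypothetical helper: brute-force admissibility test for small graphs.
class AdmissibilityCheck {
    // Floyd-Warshall: returns the matrix of true shortest distances.
    static double[][] allPairsShortest(double[][] weights) {
        int n = weights.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            d[i] = weights[i].clone();
            d[i][i] = 0.0;
        }
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    if (d[i][k] + d[k][j] < d[i][j]) {
                        d[i][j] = d[i][k] + d[k][j];
                    }
                }
            }
        }
        return d;
    }

    // A heuristic is admissible if it never overestimates the true distance.
    static boolean isAdmissible(double[][] weights, double[][] h) {
        double[][] d = allPairsShortest(weights);
        for (int i = 0; i < d.length; i++) {
            for (int j = 0; j < d.length; j++) {
                if (h[i][j] > d[i][j]) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

This takes cubic time in the number of nodes, so it is meant for validating a heuristic on test data, not for use inside the search itself.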

In the case of your node class, it seems to have an X and Y. If these represent the node's position in 2D space, maybe you could use a heuristic based on the straight-line distance between the nodes, calculated from the X and Y values.
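If such coordinates exist, a straight-line heuristic could be as small as the sketch below. The class name StraightLineHeuristic and the x/y parameters are hypothetical, since the Node class shown in the question only has a name. The estimate is only admissible if every edge weight is at least the straight-line distance between its endpoints.

```java
// Hypothetical helper: Euclidean distance between two positions in the plane.
class StraightLineHeuristic {
    static double estimate(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }
}
```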

Building a linked list as the basis of a graph Null pointer errors

I am trying to build a graph based on a linked list, where I build the linked list of nodes, and each node points to a linked list of edges. I build the graph based on an input file.
My input file will be on the following scheme:
Number of Nodes in graph
SourceNode1 EndNode1
SourceNode2 EndNode2
....
For example:
4 //Number of nodes
1 2 //An edge between 1 and 2
1 3 //An edge between 1 and 3
2 4 //An edge between 2 and 4
An assumption is that the nodes in the graph will be numbered 1 through the number of nodes and that no node will have more than 1 "parent" (though a node might have more than 1 "child").
My problem is trying to build the linked list containing the nodes. Each node has 3 fields: the edges coming from that node, the node value (1, 2, 3, etc.), and the next node (because it is a linked list of nodes). I attempt to parse in the number of nodes, create a first node manually, and attach the rest of the nodes in an iterative fashion.
Note: The parent field is for some external analysis unrelated to this question. You can ignore it.
Node class:
public class Node {
private Edge firstEdge;
private Node parent;
private Node nextNode;
private int element;
//Constructor
public Node() {
parent = null;
firstEdge = null;
nextNode = null;
}
//Accessor and Modifier Methods
public void setElement(int e) {element = e;}
public Node getNextNode() {return nextNode;}
public Edge getFirstEdge() {return firstEdge;}
public void setFirstEdge(Edge a) {firstEdge = a;}
public void setNextNode(Node a) {nextNode = a;}
public int getElement() {return element;}
public Node getParent() {return parent;}
public void setParent(Node p) {parent = p;}
//Checks for a non-null parent
public boolean hasParent() { return parent == null; }
//checks iff node has next edge
public boolean hasFirstEdge() { return firstEdge == null; }
//checks if a node has a next node
public boolean hasNextNode() { return nextNode == null; }
}
Edge class:
public class Edge {
//Instance Variables
private Node nextNode;
private Edge nextEdge;
//Constructor
public Edge() {
nextNode = null;
nextEdge = null;
}
//Accessor and Modifier Methods
public void setNextNode(Node a) {nextNode = a;}
public void setNextEdge(Edge a) {nextEdge = a;}
public Node getNextNode() {return nextNode;}
public Edge getNextEdge() {return nextEdge;}
public boolean hasNextEdge() {
return nextEdge == null;
}
}
Driver class:
import java.util.Scanner;
import java.io.*;
public class Driver {
public static void main(String[] args)throws FileNotFoundException{
//Get text file for building the graph
Scanner console = new Scanner(System.in);
System.out.print("Please enter the text file name: ");
String fileName = console.nextLine();
Scanner in = new Scanner(new File(fileName));
//in contains the file reading scanner
int numNodes = in.nextInt(); //first line of the text file
Node first = new Node(); //first is head of the list
first.setElement(1);
int i = 2; //counter
//Build the nodes list; I get problems in this loop
while (i <= numNodes) {
Node head = new Node(); //Tracker node
head = first; //head is the first node of the list
/*Loop to end of the list*/
while(head.hasNextNode()) {
//Null check; without it, I get NullPointerExceptions.
//If it is not needed, or there is a better way, please inform me.
if (head.getNextNode() == null) {
break;
}
head = head.getNextNode(); //get to the end of the list
}
//Next node to add
Node newNode = new Node();
newNode.setElement(i); //Because of the 1, 2, 3 nature of the graph
head.setNextNode(newNode); //Set the last element as the next node
i++;
}
//Manually check if graph is made (check if the nodes are linked correctly)
System.out.println("First elem (expect 1): " + first.getElement());
System.out.println("Second elem (expect 2): " + first.getNextNode().getElement()); //It prints 4 here for some reason
System.out.println("Third elem (expect 3): " + first.getNextNode().getNextNode().getElement()); //Getting a NullPointerException
System.out.println("Fourth elem (expect 4): " + first.getNextNode().getNextNode().getNextNode().getElement());
System.out.println("Expecting null: " + first.getNextNode().getNextNode().getNextNode().getNextNode().getElement());
}
}
When I'm checking if the graph is built, I get problems. I am manually checking it (for this small graph, it's possible), and simply printing out the first node and the value of the subsequent nodes. I am expecting 1, 2, 3, 4, and null (for the element past 4, because it does not exist). The first node is fine; it prints 1. Calling first.getNextNode().getElement() prints 4, for some odd reason. And calling the node after that gives a NullPointerException. Could someone help me solve this problem?
Note: I haven't added the edges yet. I am just trying to get the core of the linked list of nodes built.
This is my first post on Stack Overflow. I apologize if it is vague, ambiguous, overly detailed, lacking in information, or is a duplicate question. I could not find the answer anywhere else. All input is welcome and appreciated.

Much of your naming is very confusing and is in serious need of clarifying refactoring. Edge's nextNode should be called destinationNode or something along those lines, to make it clear you are dereferencing from an Edge object instead of from another Node.
Now, let's delve into the actual implementation.
Node head = new Node(); //Tracker node
head = first; //head is the first node of the list
What's going on here? It looks like you set your local variable head to be a brand new Node; that's great. Except the very next line, you discard it and set it to the value of your first variable.
Then you traverse all the way to the end of the list with a while loop and create another new Node (this one you actually use). Normally, if you want to add something to the end of a list, you should use a doubly linked list, or at least keep pointers to both the first and the last elements: first always stays the same, and when you add a new node you simply say newNode = new Node(); last.nextNode = newNode; last = newNode; and then configure the new element from there. The way you are doing it, constructing a singly linked list with N elements takes O(N^2) time, which is hardly ideal.
I also have some preferential criticism about the construction of your minor classes... if you are allowing values to be freely get and set to any value with public setters and getters without taking any action whatsoever when they change, you get the exact same functionality from simply marking those fields public and doing away with the getters and setters entirely. If you have any plans to add more functionality in the future it's fine the way it is, but if they are just going to be dumb linked list elements whose actual uses are implemented elsewhere then you are better off treating the class more like a struct.
Here's a good way to build a singly-linked Node list the way you're looking to:
int numNodes = in.nextInt(); //first line of the text file
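// sentinel value indicating the beginning of the list
Node header = new Node();
header.setElement(-1);
header.setNextNode(header); // an empty ring points back at the header
// last node in the list; starts at the header because the list is still empty
Node last = header;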
// this loop constructs a singly linked ring from the header
for (int i = 1; i <= numNodes; i++) {
Node newNode = new Node();
newNode.setElement(i);
newNode.setNextNode(header);
last.setNextNode(newNode);
last = newNode;
}
// do your debug outputs here
// for instance, this loop always outputs every node in the list:
for (Node n = header.getNextNode(); n != header; n = n.getNextNode()) {
System.out.println("Node " + n.getElement());
}
Note that the use of header as a sentinel value guarantees that for any Node that's already been built, getNextNode() will never return null.
Again, all this code can be made much more readable by making the fields in your Node and Edge classes public and scrapping the getters and setters. header.getNextNode().getNextNode().getElement() can become header.nextNode.nextNode.element, and so forth.
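One further thing that may be worth checking, although it is not raised above: in the question's Node class, hasNextNode() returns nextNode == null, so it reports true exactly when there is no next node (hasParent() and hasFirstEdge() follow the same pattern). With that predicate, the walk to the end of the list never leaves first after the first pass, so every new node overwrites first's next pointer. That would explain why the second element prints 4 and the third lookup throws. Below is a minimal sketch of the original loop with only the check inverted; ListNode and ListBuilder are hypothetical, trimmed-down names rather than the asker's classes.

```java
// Hypothetical, trimmed-down version of the question's Node class.
class ListNode {
    int element;
    ListNode nextNode;

    // true when a next node exists (the question's version tests == null)
    boolean hasNextNode() {
        return nextNode != null;
    }
}

class ListBuilder {
    // Builds 1 -> 2 -> ... -> numNodes and returns the head. Assumes numNodes >= 1.
    static ListNode build(int numNodes) {
        ListNode first = new ListNode();
        first.element = 1;
        for (int i = 2; i <= numNodes; i++) {
            ListNode tail = first;
            while (tail.hasNextNode()) {
                tail = tail.nextNode; // walk to the current end of the list
            }
            ListNode newNode = new ListNode();
            newNode.element = i;
            tail.nextNode = newNode;
        }
        return first;
    }
}
```

This keeps the quadratic walk from the question purely to isolate the predicate fix; tracking the last node avoids the repeated traversal.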
Stage 2
Now that that's out of the way, we have the question of how useful this type of structure will actually be for your application. My biggest concern in looking at this is the fact that, when applying edges between nodes on your graph, you will need to access arbitrarily indexed Nodes to attach edges to them... and while every Node already knows what its element index is, getting the Nth node takes N steps because your entire set of nodes is in a linked list.
Remember, the main advantage of using a linked list is the ability to remove arbitrary elements in O(1) time as you step through the list. If you are only building a list and aren't going to ever remove anything from it, arrays are often faster -- especially if you ever need to access arbitrary elements.
What if you don't need to guarantee they're in any particular order or access them by their index, but you need to be able to add, access-by-ID, and remove them very quickly for larger data sets? A hash-based collection may be the thing for you: HashSet if you only need membership tests, or HashMap keyed by ID if you need to fetch a node by its ID. What if you still need to be able to access them all in the order they were added? LinkedHashSet (or LinkedHashMap) is your best friend. With this, you could even give the nodes string names with no real slowdown.
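A rough sketch of the hash-based option, assuming nodes are identified by a string name. It uses LinkedHashMap rather than LinkedHashSet, because a map lets you fetch a node by its ID directly. NamedNode and NodeRegistry are hypothetical names for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical node type identified by a string name.
class NamedNode {
    final String name;

    NamedNode(String name) {
        this.name = name;
    }
}

class NodeRegistry {
    // LinkedHashMap iterates in insertion order and looks up by key in expected O(1).
    private final Map<String, NamedNode> nodes = new LinkedHashMap<>();

    // Returns the existing node for this name, creating it on first use.
    NamedNode getOrCreate(String name) {
        return nodes.computeIfAbsent(name, NamedNode::new);
    }

    NamedNode get(String name) {
        return nodes.get(name); // null if no such node
    }

    boolean remove(String name) {
        return nodes.remove(name) != null;
    }

    Iterable<NamedNode> inInsertionOrder() {
        return nodes.values();
    }
}
```

When reading an edge list from a file, calling getOrCreate for both endpoints of each line means nodes never have to be declared up front.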
As for the edges, I feel you are already doing fine: it's probably best to implement the outgoing edges for each Node in a singly linked list, assuming you will rarely be removing edges or will have a small number of edges per node and will always access them all together. To add a new edge, simply say newEdge = new Edge(); newEdge.nextEdge = firstEdge; firstEdge = newEdge; and you're done, having added the new edge to the beginning of the list. (Singly linked lists are easiest to use as stacks rather than queues.)
For extra fancy-points, implement Iterable<Edge> with your Node class and make a little Iterator class so you can use extended-for to visit every edge and make your life even easier!
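The two suggestions above (prepending edges and implementing Iterable<Edge>) could look roughly like this. GNode and GEdge are hypothetical, simplified stand-ins with public-style fields, and destinationNode follows the renaming proposed earlier.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical simplified classes; field names follow the suggestions above.
class GEdge {
    GNode destinationNode;
    GEdge nextEdge;
}

class GNode implements Iterable<GEdge> {
    int element;
    GEdge firstEdge;

    GNode(int element) {
        this.element = element;
    }

    // Prepends a new edge to the outgoing-edge list in O(1).
    void addEdgeTo(GNode destination) {
        GEdge newEdge = new GEdge();
        newEdge.destinationNode = destination;
        newEdge.nextEdge = firstEdge;
        firstEdge = newEdge;
    }

    @Override
    public Iterator<GEdge> iterator() {
        return new Iterator<GEdge>() {
            private GEdge current = firstEdge;

            @Override
            public boolean hasNext() {
                return current != null;
            }

            @Override
            public GEdge next() {
                if (current == null) throw new NoSuchElementException();
                GEdge result = current;
                current = current.nextEdge;
                return result;
            }
        };
    }
}
```

With this in place, for (GEdge e : node) visits the outgoing edges most-recently-added first, which is the stack-like order mentioned above.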

As @Widdershins says, the terms used make the algorithm hard to understand.
I would recommend two things in order to refactor your code:
Review the terminology (maybe this helps: http://en.wikipedia.org/wiki/Glossary_of_graph_theory). I know that it sounds like a silly recommendation, but using proper terms will help you a lot to review the object model.
Use a better representation. In your code a Node fills multiple roles, which makes the code hard to follow.
A good representation will depend a lot on the kind of problems that you are trying to solve. For example, an adjacency list or an adjacency matrix is useful for applying some graph-theory algorithms.
But if you only want to practice object-oriented design, it is useful to start with the basics.
Take the definition of a Graph in mathematics: G = (V, E)... a graph is a pair of a set of nodes and a set of edges between those nodes, and translate it to code:
(the example uses fields for brevity)
class DirectedGraph {
final Set<Node> nodes = new HashSet<Node>();
final Set<Edge> edges = new HashSet<Edge>();
}
Now you need to extend this definition. You can do it step by step. I did the same and ended up with this representation:
class DirectedGraph {
final Set<Node> nodes = new HashSet<Node>();
final Set<Edge> edges = new HashSet<Edge>();
public Node addNode(Object value) {
Node newNode = new Node(value);
nodes.add(newNode);
return newNode;
}
public Edge addEdge(Node src, Node dst) {
Edge newEdge = new Edge(src, dst);
edges.add(newEdge);
return newEdge;
}
private void assertValidNode(Node n) {
if (n.graph != this)
throw new IllegalArgumentException("Node " + n + " not part of the graph");
}
public Set<Node> successorsOf(Node n) {
assertValidNode(n);
Set<Node> result = new HashSet<Node>();
for (Edge e : edges) {
if (e.src == n) { result.add(e.dst); }
}
return result;
}
class Node {
final DirectedGraph graph = DirectedGraph.this;
final Object value;
Node(Object v) {
this.value = v;
}
public String toString() { return value.toString(); }
public Set<Node> successors() {
return graph.successorsOf(this);
}
// useful shortcut
public Node connectTo(Node... nodes) {
for (Node dst : nodes) {
graph.addEdge(this, dst);
}
return this;
}
}
class Edge {
final DirectedGraph graph = DirectedGraph.this;
final Node src; final Node dst;
Edge(Node src, Node dst) {
graph.assertValidNode(src);
graph.assertValidNode(dst);
this.src = src; this.dst = dst;
}
public String toString() { return src.toString() + " -> " + dst.toString(); }
}
}
DirectedGraph g = new DirectedGraph();
DirectedGraph.Node one = g.addNode(1);
DirectedGraph.Node two = g.addNode(2);
DirectedGraph.Node three = g.addNode(3);
DirectedGraph.Node four = g.addNode(4);
one.connectTo(two, three);
two.connectTo(four);
System.out.println(g.edges);
System.out.println(one.successors());
System.out.println(two.successors());
This strategy of representing the domain model in a "1 to 1" mapping has always helped me to "discover" the object model. Then you can improve the implementation for your specific needs (e.g. the running time of successorsOf can be improved by using an adjacency list).
Note that in this representation a Node and an Edge can only exist as a part of a graph. This restriction is not deduced directly from the math representation... but helps to maintain the constraints of a proper graph.
Note: you can extract the inner classes by constructing the Node and Edge with a parent graph reference.
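As an illustration of the adjacency-list improvement mentioned above, here is a minimal sketch that stores each node's successors directly instead of scanning the whole edge set. AdjacencyGraph is a hypothetical, simplified variant (node values act as their own identity), not a drop-in replacement for DirectedGraph.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical variant that stores successors per node instead of scanning every edge.
class AdjacencyGraph<T> {
    private final Map<T, Set<T>> successors = new HashMap<>();

    void addNode(T value) {
        successors.computeIfAbsent(value, k -> new LinkedHashSet<>());
    }

    // Adds a directed edge src -> dst, registering both endpoints if needed.
    void addEdge(T src, T dst) {
        addNode(src);
        addNode(dst);
        successors.get(src).add(dst);
    }

    // Expected O(1) lookup, versus O(|E|) for a scan over the edge set.
    Set<T> successorsOf(T node) {
        Set<T> result = successors.get(node);
        if (result == null) {
            throw new IllegalArgumentException("Node " + node + " not part of the graph");
        }
        return Collections.unmodifiableSet(result);
    }
}
```

The trade-off is that edges are no longer first-class objects; if edges need to carry data such as weights, the inner sets could hold Edge instances instead of node values.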

Returning only the vertices in the actual shortest path

I know the title is a bit messy, but I don't know how to explain it better.
What I'm trying to do:
Using a graph found in a text file, find and print the shortest path (minimum amount of vertices) from vertex A to vertex B.
Note: using breadth-first search, not Dijkstra's.
What I've got:
A working algorithm that applies BFS on the graph, but no good way of actually printing out the shortest path.
I'm having a hard time distinguishing a vertex in the shortest path from one that is simply run through the algorithm, but not in the shortest path.
For example: Find the shortest path between 0 and 4.
0 connects to 1,2 and 3. 1 connects to 4.
My path turns out to be [0,1,2,3,4] instead of [0,1,4].
I haven't been able to find any threads asking the same question, or any walk-through of BFS that includes this, so I'm not sure if I'm making this out to be way harder than it is?
Edit: code for those who may be interested (not sure at all if I'm avoiding circles?)
Edit 2: Changed the way I store my path to a Stack.
public String findPath(int v, int w) {
Queue<Integer> q = new LinkedList<Integer>();
boolean[] visited = new boolean[g.numVertices()];
q.add(v);
Stack<Integer> path = new Stack<Integer>();
while(q.peek() != null) {
runBFS(q.poll(),w,visited,q,path);
}
return path.toString();
}
private void runBFS(int v, int w, boolean[] visited, Queue<Integer> q, Stack<Integer> path) {
if(visited[v]) {
}
else if(v == w) {
path.add(v);
q.clear();
}
else {
path.add(v);
visited[v] = true;
VertexIterator vi = g.adjacentVertices(v);
while(vi.hasNext()) {
q.add(vi.next());
}
}
}
Some explanation of variables and methods:
v = vertex of origin
w = target vertex
g = graph
vi = a normal iterator that iterates over the neighbours of v
Thanks for reading!

You will need a specific path field for each vertex. That way you can keep track of the paths you've chosen, and hence of the shortest path found. I will use a String array, just like you used the boolean array for storing visited vertices.
public String findPath(int v, int w) {
Queue<Integer> q = new LinkedList<Integer>();
boolean[] visited = new boolean[g.numVertices()];
String[] pathTo = new String[g.numVertices()];
q.add(v);
pathTo[v] = v+" ";
while(q.peek() != null) {
if(runBFS(q.poll(),w,visited,q,pathTo))
break;
}
return pathTo[w];
}
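private boolean runBFS(int v, int w, boolean[] visited, Queue<Integer> q, String[] pathTo) {
if(visited[v]) {
// already expanded; nothing to do
}
else if(v == w) {
return true;
}
else {
visited[v] = true;
VertexIterator vi = g.adjacentVertices(v);
while(vi.hasNext()) {
int nextVertex = vi.next();
// record a path only the first time a vertex is reached,
// so a shorter path found earlier is never overwritten
if(pathTo[nextVertex] == null) {
pathTo[nextVertex] = pathTo[v] + nextVertex + " ";
q.add(nextVertex);
}
}
}
return false;
}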

A more compact (space-wise) solution, which doesn't use O(n^2) storage space, is to have each node store only which node it came from. This can be done by changing the visited list to an integer array (int[] visited).
step 1: initialize visited list, so that every element is '-1', or "unvisited"
step 2: mark the first node as visited by itself visited[v] = v;
Do a BFS (like you do, with the following modifications:)
when moving from v -> v_next:
if(visited[v_next] == -1)
{
visited[v_next] = v;
q.add(v_next);
}
// else skip it, it's already been visited
This way, if w is reachable, visited[w] will store which node it came from. From that node, you can backtrack all the way to v and finally print the nodes in the opposite order. (This can be done using either a stack or a recursive print method.)
Hope that makes sense. :)
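The steps above can be sketched as a self-contained method. Since the asker's graph class and VertexIterator aren't shown in full, this assumes the graph is given as an adjacency list (a List of neighbour lists, vertices numbered 0 to n-1); ParentArrayBfs is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

class ParentArrayBfs {
    // adjacency.get(u) lists the neighbours of u; vertices are 0..n-1 (an assumed representation).
    static List<Integer> shortestPath(List<List<Integer>> adjacency, int v, int w) {
        int[] visited = new int[adjacency.size()];
        Arrays.fill(visited, -1);      // -1 means "unvisited"
        visited[v] = v;                // the start is visited "by itself"
        Queue<Integer> q = new ArrayDeque<>();
        q.add(v);
        while (!q.isEmpty()) {
            int current = q.poll();
            if (current == w) break;   // target dequeued; its predecessor chain is complete
            for (int next : adjacency.get(current)) {
                if (visited[next] == -1) {
                    visited[next] = current;
                    q.add(next);
                }
            }
        }
        if (visited[w] == -1) return Collections.emptyList(); // w not reachable
        List<Integer> path = new ArrayList<>();
        for (int at = w; at != v; at = visited[at]) {
            path.add(at);              // walk back from w towards v
        }
        path.add(v);
        Collections.reverse(path);
        return path;
    }
}
```

For the example in the question (0 connects to 1, 2 and 3; 1 connects to 4), the backtracking visits 4, then 1, then 0, and reversing gives the path 0, 1, 4.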

When you store a vertex in the BFS queue, you also need to store a copy of the path through which it has been reached, so that it will be available once that vertex is dequeued. As it is now, your code does not keep any kind of path information on the queued vertices - it only keeps a list of the nodes it visits.
You could, for example, use a separate queue that will be processed in parallel, in which you will store the current path, and then restore it once you dequeue the next vertex to search.
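A minimal sketch of that idea follows. As a simplification, it stores the whole path as the queue entry itself rather than keeping a second queue in parallel; the graph is again assumed to be an adjacency list with vertices numbered 0 to n-1, and PathQueueBfs is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

class PathQueueBfs {
    // Each queue entry is the full path used to reach its last vertex.
    static List<Integer> shortestPath(List<List<Integer>> adjacency, int v, int w) {
        boolean[] visited = new boolean[adjacency.size()];
        Queue<List<Integer>> q = new ArrayDeque<>();
        List<Integer> start = new ArrayList<>();
        start.add(v);
        q.add(start);
        visited[v] = true;
        while (!q.isEmpty()) {
            List<Integer> path = q.poll();
            int last = path.get(path.size() - 1);
            if (last == w) return path;
            for (int next : adjacency.get(last)) {
                if (!visited[next]) {
                    visited[next] = true;
                    List<Integer> extended = new ArrayList<>(path); // copy, then extend
                    extended.add(next);
                    q.add(extended);
                }
            }
        }
        return Collections.emptyList(); // w not reachable from v
    }
}
```

Copying the path for every queued vertex is simple but costs extra memory, since queued paths can add up to O(n^2) entries in total; the predecessor-array approach avoids that.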

You need to push your current node onto a stack, and only print the whole stack out once you reach the destination.
