Longest path in directed acyclic graph in jgrapht library - java

I want to find the longest path in directed (acyclic) graph. Let's say that I know starting node - sink. The path should begin from this point.
I was thinking that I can set weights of edges to -1. There is many methods of finding all shortest path but you have to pass ending point. Is it possible to get the shortest path (no matter the end node)?
DirectedAcyclicGraph graph = new DirectedAcyclicGraph<Integer, DefaultEdge>(DefaultEdge.class);
graph.addVertex(1);
graph.addVertex(2);
graph.addVertex(3);
graph.addVertex(4);
graph.addVertex(5);
graph.addVertex(6);
graph.addVertex(7);
graph.addVertex(8);
try {
graph.addDagEdge(1, 2);
graph.addDagEdge(2, 3);
graph.addDagEdge(3, 4);
graph.addDagEdge(5, 6);
graph.addDagEdge(2, 7);
graph.addDagEdge(7, 8);
} catch(Exception e) {
}
//????????????????
Let's say that I'd like to find longest path for node nr 1 (sink). So this algoritm shoud give me 1-2-3-4-5-6.

I was looking for an answer to a similar question to calculate parallel build groupings in Jenkins from a DAG of Git repos. To solve it, I applied the algorithm described here and here. The code below is written in Groovy, so you'll have to convert to Java. The result is a Map of vertices to their respective max depths. From that you can get the single largest value. If instead you want to know the max depth from a specific vertex in the graph, you can first prune the graph into a subgraph rooted by your desired source vertex and then run the method below on the subgraph.
def calcDepths(g) {
Map<String, Integer> vertexToDepthMap = new HashMap<>()
Iterator<String> iterator = new TopologicalOrderIterator<String, DefaultEdge>(g)
iterator.each { v ->
Set<String> predecessors = Graphs.predecessorListOf(g, v).toSet()
Integer maxPredecessorDepth = -1
predecessors.each { predecessor ->
maxPredecessorDepth = Math.max(maxPredecessorDepth, vertexToDepthMap.get(predecessor))
}
vertexToDepthMap.put(v, maxPredecessorDepth + 1)
}
return vertexToDepthMap
}

You can use AllDirectedPaths algorithm to resolve all paths, sort the results by path length in reverse order and get the first:
AllDirectedPaths<String, DefaultEdge> paths = new AllDirectedPaths<String, DefaultEdge>(graph);
GraphPath<String, DefaultEdge> longestPath = paths.getAllPaths(source, target, true, null)
.stream()
.sorted((GraphPath<String, DefaultEdge> path1, GraphPath<String, DefaultEdge> path2)-> new Integer(path2.getLength()).compareTo(path1.getLength()))
.findFirst().get();
System.out.println(longestPath.getLength() + " " + longestPath);

Related

Neo4j. Dijkstra. Find paths including a set of nodes (graphalgo)

Now I'm testing dijkstra (org.neo4j.graphalgo.impl.shortestpath.*). The code you can see below:
Dijkstra<Double> dijkstra = new Dijkstra<>(0.0,
startNode,
endNode,
CommonEvaluators.doubleCostEvaluator("weight"),
new DoubleAdder(),
new DoubleComparator(),
Direction.BOTH,
RelationshipTypes.rel);
How I can define nodes which must be included in path? Any ideas?
You will have to brute force all possible paths. As long you dont have to many nodes you want to include you can use this approach. It is basically the traveling sales man problem.
what you can do:
Find all permutations of nodes (so you order the nodes in all possible ways)
Start a shortest path for every permutations going from the first to the second node and so on
Compare the weigth of all paths and take the shortest one
Keep in mind to keep the amount of nodes as small as possible because the TSP is an NP-hard problem. So dont go to close to like 10 nodes to include.
I already requested this feature on github also with some code.
Here is my request.
As Yoshi said, this is a TSP. In your solution, you might run into trouble when going through all possible paths as the amount increases quickly on big graphs.
My idea to possibly improve performance would be to run a Dijkstra between each of the nodes you want to have included in your final path and then create a new graph, just with those nodes and the newly calculated distances as weights in it and run the Dijkstra algorithm and the analyzis of wether all nodes are included in a given path on this (considerably) smaller graph.
So I've found the simple solution. Just made any changes. Not sure that it's correct but it works.
Dijkstra<Double> dijkstra = new Dijkstra<>(0.0,
startPoint,
endPoint,
new CostEvaluator<Double>() {
#Override
public Double getCost(final Relationship relationship, Direction direction) {
if (listIncAll.contains(relationship.getStartNode()) || listIncAll.contains(relationship.getEndNode())) {
return 0.0;
}else {
return relationship.getProperty("weight");
}
}
},
new DoubleAdder(),
new DoubleComparator(),
Direction.BOTH,
RelationshipTypes.rel);
Map<List<PropertyContainer>, Double> pathMap = new HashMap<>();
for (List<PropertyContainer> path : dijkstra.getPaths()) {
if (path.containsAll(listIncAll)) {
double fullWeight = 0;
Iterator<PropertyContainer> i = path.iterator();
while (i.hasNext()){
i.next();
if (i.hasNext()) {
Relationship r = (Relationship) i.next();
fullWeight += r.getProperty("weight");
}
}
pathMap.put(path,fullWeight);
}
}
System.out.println(pathMap.entrySet().stream().sorted((k1, k2) -> -k2.getValue().
compareTo(k1.getValue())).findFirst().get().getKey());
So using CostEvaluator I've found paths which have at least one point from a list of points which must be included. To get paths which have all points I've just added "if (path.containsAll(listIncAll))". In last sout you'll see the found path.

How to modify this Held-Karp algorithm to search for Hamiltonian path instead of cycle?

I implemented Held-Karp in Java following Wikipedia and it gives the correct solution for total distance of a cycle, however I need it to give me the path (it doesn't end on the same vertex where is started). I can get path if I take out the edge with largest weight from the cycle, but there is a possibility that 2 different cycles have same total distance, but different maximum weight, therefore one of the cycles is wrong.
Here is my implementation:
//recursion is called with tspSet = [0, {set of all other vertices}]
private static TSPSet recursion (TSPSet tspSet) {
int end = tspSet.endVertex;
HashSet<Integer> set = tspSet.verticesBefore;
if (set.isEmpty()) {
TSPSet ret = new TSPSet(end, new HashSet<>());
ret.secondVertex = -1;
ret.totalDistance = matrix[end][0];
return ret;
}
int min = Integer.MAX_VALUE;
int minVertex = -1;
HashSet<Integer> copy;
for (int current: set) {
copy = new HashSet<>(set);
copy.remove(current);
TSPSet candidate = new TSPSet(current, copy);
int distance = matrix[end][current] + recursion(candidate).totalDistance;
if (distance < min) {
min = distance;
minVertex = current;
}
}
tspSet.secondVertex = minVertex;
tspSet.totalDistance = min;
return tspSet;
}
class TSPSet {
int endVertex;
int secondVertex;
int totalDistance;
HashSet<Integer> verticesBefore;
public TSPSet(int endVertex, HashSet<Integer> vertices) {
this.endVertex = endVertex;
this.secondVertex = -1;
this.verticesBefore = vertices;
}
}
You can slightly alter the dynamic programming state.
Let the path start in a node S. Let f(subset, end) be the optimal cost of the path that goes through all the vertices in the subset and ends in the end vertex (S and end must always be in the subset). A transition is just adding a new vertex V not the subset by using the end->V edge.
If you need a path that ends T, the answer is f(all vertices, T).
A side note: what you're doing now is not a dynamic programming. It's an exhaustive search as you do not memoize answers for subsets and end up checking all possibilities (which results in O(N! * Poly(N)) time complexity).
Problem with current approach
Consider this graph:
The shortest path visiting all vertices (exactly once each) is of length 3, but the shortest cycle is 1+100+200+300, which is 301 even if you remove the maximum weight edge.
In other words, it is not correct to construct the shortest path by deleting an edge from the shortest cycle.
Suggested approach
An alternative approach to convert your cycle algorithm into a path algorithm is to add a new node to the graph which has a zero cost edge to all of the other nodes.
Any path in the original graph corresponds to a cycle in this graph (the start and end points of the path are the nodes that the extra node connects to.

Connecting nodes in a matrix for a graph

Hi I'm working on this little project which requires me to build a matrix in Java which resembles a chess board. I'm supposed to get the Knight to get from a point to another(In the way Knight moves). So I need to find the shortest way to get there in the end.
My problem is, I can't get to connect the edges to get to that point. I can find out if the vertex is a valid move but I can't seem to find a way to create nodes to get to that point. For Example,
0 XXXXX
1 XXXOX
2 XXXXX
3 XXKXX
4 XXXXX
5 XXXXX
I need to create nodes that connect K to O to find out shortest distance later.
PS. I'll be okay with just hints of how to get there or just some tips. Don't really need the exact code. Thank you very much!
I know it's a bad representation of matrix up there but spare me the critique please
A classic Breadth-First-Search is probably the simplest approach:
class Location {
int x;
int y;
List<Location> adjacent() {
// TODO return list of locations reachable in a single step
}
}
List<Location> findShortestPath(Location start, Location destination) {
Location[][] previous = new Location[8][8];
Deque<Location> queue = new ArrayDeque<>();
queue.add(start);
do {
Location loc = queue.poll();
for (Location n : loc.neighbors()) {
if (previous[n.x][n.y] == null) {
previous[n.x][n.y] = loc;
queue.add(n);
if (n.x == destination.x && n.y == destination.y) {
// we've found a way, let's reconstruct the list of steps
List<Location> path = new ArrayList<>();
for (Location l = n; l != start; l = previous[l.x][l.y]) {
path.add(l);
}
path.reverse();
return path;
}
}
}
} while (!queue.isEmpty());
return null; // no path exists
}
This code enumerates all paths from the start location. Therefore, if there is a path to destination, it will find it. In addition, because paths are enumerated in order or ascending length, the first such path will be a shortest one.
The chess board can be implemented by a 2d array. Each cell in the matrix can be considered to be a node (or vertex) in the graph. Edge is composed of two nodes (in this case two cells) one being the from or source [ lets call it Nod A] and other being the to or neighbor or destination node [ Lets call it node B].
Edge exits if there is a possibility of moving from node A to node B.
You can use Dijkstra's algorithm.
http://krishnalearnings.blogspot.in/2015/07/implementation-in-java-for-dijkstras.html
For Node with the Knight's position you can see the possibilities of the cells where Knight can move to and add in the Min Heap. The weight of each edge is constant. You just need to update the cost of the Node.

A* Pathfinding Algorithm -- finding a path, but not optimal best path

I'm working on an A* algorithm. This is the code for the pathfinding method. For reference, this is the board I am working with: http://i.imgur.com/xaAzNSw.png?1 Each color tile represents a different heuristic value. For some unknown reason, it finds a path every single time, just not always the correct path. Here is the code for the pathfinding method. If anyone needs any clarifications, I'd be happy to provide them.
public List<Point> findPath(Point start, Point end) {
//declarations and instantiations
List<PathState> closedList = new ArrayList<PathState>(); //the nodes already considered
List<PathState> openList = new ArrayList<PathState>(); //nodes to be considered
openList.add(new PathState(start, end, tm)); //add starting point
PathState current = openList.get(0);
while(!current.isGoal()){
//sort open list to find the pathstate with the best hscore(sorts by hscore)
Collections.sort(openList);
current = openList.get(openList.size() - 1);
closedList.add(current);
openList.remove(current);
//get the valid children of current node
List<PathState> children = current.getChildren();;
if(!current.isGoal()){
for(int i = 0; i < children.size(); i++){
if(!closedList.contains(children.get(i))){
if(openList.contains(children.get(i))){
if(openList.get(openList.indexOf(children.get(i))).getgScore() > children.get(i).getgScore()){
//child is already on the open list, but this node has lower g value
//change g value and parent of node on open list
openList.get(openList.indexOf(children.get(i))).setG(children.get(i).getgScore());
openList.get(openList.indexOf(children.get(i))).changeParent(current);
}
}else{
//child not in closed list
openList.add(children.get(i));
//repaint the terrain panel with shades
tp.addAstarState(children.get(i));
tp.repaint();
try {
Thread.sleep(25);
} catch(Exception e) {
e.printStackTrace();
}
}
}
}
}
}
//returns the path from winning node to start for output
return current.getPath();
}
A* is basically Djikstra's algorithm but you direct your search to the destination by combining your distance function with an estimate of the remaining distance.
At node x, the cost or "score" is (distance_so_far) for djikstra
at A*, it is (distance_so_far + estimate_of_remaining_distance)
To make sure that A* finds the shortest path, the estimate_of_remaining_distance must be a true lower bound. That means it must always be less than the actual remaining distance. That means, you can never overestimate this distance otherwise it becomes an inexact heuristic in which case it won't necessarily find the shortest path.
This is a good reference: http://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html
It links to this reference which explains more about heuristic functions: http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html

Efficient algorithm to find all the paths from A to Z?

With a set of random inputs like this (20k lines):
A B
U Z
B A
A C
Z A
K Z
A Q
D A
U K
P U
U P
B Y
Y R
Y U
C R
R Q
A D
Q Z
Find all the paths from A to Z.
A - B - Y - R - Q - Z
A - B - Y - U - Z
A - C - R - Q - Z
A - Q - Z
A - B - Y - U - K - Z
A location cannot appear more than once in the path, hence A - B - Y - U - P - U - Z is not valid.
Locations are named AAA to ZZZ (presented here as A - Z for simplicity) and the input is random in such a way that there may or may not be a location ABC, all locations may be XXX (unlikely), or there may not be a possible path at all locations are "isolated".
Initially I'd thought that this is a variation of the unweighted shortest path problem, but I find it rather different and I'm not sure how does the algorithm there apply here.
My current solution goes like this:
Pre-process the list such that we have a hashmap which points a location (left), to a list of locations (right)
Create a hashmap to keep track of "visited locations". Create a list to store "found paths".
Store X (starting-location) to the "visited locations" hashmap.
Search for X in the first hashmap, (Location A will give us (B, C, Q) in O(1) time).
For-each found location (B, C, Q), check if it is the final destination (Z). If so store it in the "found paths" list. Else if it doesn't already exist in "visited locations" hashmap, Recurl to step 3 now with that location as "X". (actual code below)
With this current solution, it takes forever to map all (not shortest) possible routes from "BKI" to "SIN" for this provided data.
I was wondering if there's a more effective (time-wise) way of doing it. Does anyone know of a better algorithm to find all the paths from an arbitrary position A to an arbitrary position Z ?
Actual Code for current solution:
import java.util.*;
import java.io.*;
public class Test {
private static HashMap<String, List<String>> left_map_rights;
public static void main(String args[]) throws Exception {
left_map_rights = new HashMap<>();
BufferedReader r = new BufferedReader(new FileReader("routes.text"));
String line;
HashMap<String, Void> lines = new HashMap<>();
while ((line = r.readLine()) != null) {
if (lines.containsKey(line)) { // ensure no duplicate lines
continue;
}
lines.put(line, null);
int space_location = line.indexOf(' ');
String left = line.substring(0, space_location);
String right = line.substring(space_location + 1);
if(left.equals(right)){ // rejects entries whereby left = right
continue;
}
List<String> rights = left_map_rights.get(left);
if (rights == null) {
rights = new ArrayList<String>();
left_map_rights.put(left, rights);
}
rights.add(right);
}
r.close();
System.out.println("start");
List<List<String>> routes = GetAllRoutes("BKI", "SIN");
System.out.println("end");
for (List<String> route : routes) {
System.out.println(route);
}
}
public static List<List<String>> GetAllRoutes(String start, String end) {
List<List<String>> routes = new ArrayList<>();
List<String> rights = left_map_rights.get(start);
if (rights != null) {
for (String right : rights) {
List<String> route = new ArrayList<>();
route.add(start);
route.add(right);
Chain(routes, route, right, end);
}
}
return routes;
}
public static void Chain(List<List<String>> routes, List<String> route, String right_most_currently, String end) {
if (right_most_currently.equals(end)) {
routes.add(route);
return;
}
List<String> rights = left_map_rights.get(right_most_currently);
if (rights != null) {
for (String right : rights) {
if (!route.contains(right)) {
List<String> new_route = new ArrayList<String>(route);
new_route.add(right);
Chain(routes, new_route, right, end);
}
}
}
}
}
As I understand your question, Dijkstras algorithm cannot be applied as is, since shortest path problem per definition finds a single path in a set of all possible paths. Your task is to find all paths per-se.
Many optimizations on Dijkstras algorithm involve cutting off search trees with higher costs. You won't be able to cut off those parts in your search, as you need all findings.
And I assume you mean all paths excluding circles.
Algorithm:
Pump network into a 2dim array 26x26 of boolean/integer. fromTo[i,j].
Set a 1/true for an existing link.
Starting from the first node trace all following nodes (search links for 1/true).
Keep visited nodes in a some structure (array/list). Since maximal
depth seems to be 26, this should be possible via recursion.
And as #soulcheck has written below, you may think about cutting of paths you have aleady visted. You may keep a list of paths towards the destination in each element of the array. Adjust the breaking condition accordingly.
Break when
visiting the end node (store the result)
when visiting a node that has been visted before (circle)
visiting a node for which you have already found all paths to the destination and merge your current path with all the existing ones from that node.
Performance wise I'd vote against using hashmaps and lists and prefer static structures.
Hmm, while re-reading the question, I realized that the name of the nodes cannot be limited to A-Z. You are writing something about 20k lines, with 26 letters, a fully connected A-Z network would require far less links. Maybe you skip recursion and static structures :)
Ok, with valid names from AAA to ZZZ an array would become far too large. So you better create a dynamic structure for the network as well. Counter question: regarding performance, what is the best data structure for a less popuplate array as my algorithm would require? I' vote for an 2 dim ArrayList. Anyone?
What you're proposing is a scheme for DFS, only with backtracking.It's correct, unless you want to permit cyclic paths (you didn't specify if you do).
There are two gotchas, though.
You have to keep an eye on nodes you already visited on current path (to eliminate cycles)
You have to know how to select next node when backtracking, so that you don't descend on the same subtree in the graph when you already visited it on the current path.
The pseudocode is more or less as follows:
getPaths(A, current_path) :
if (A is destination node): return [current_path]
for B = next-not-visited-neighbor(A) :
if (not B already on current path)
result = result + getPaths(B, current_path + B)
return result
list_of_paths = getPaths(A, [A])
which is almost what you said.
Be careful though, as finding all paths in complete graph is pretty time and memory consuming.
edit
For clarification, the algorithm has Ω(n!) time complexity in worst case, as it has to list all paths from one vertex to another in complete graph of size n, and there are at least (n-2)! paths of form <A, permutations of all nodes except A and Z, Z>. No way to make it better if only listing the result would take as much.
Your data is essentially an adjacency list which allows you to construct a tree rooted at the node corresponding to A. In order to obtain all the paths between A & Z, you can run any tree traversal algorithm.
Of course, when you're building the tree you have to ensure that you don't introduce cycles.
I would proceed recursively where I would build a list of all possible paths between all pairs of nodes.
I would start by building, for all pairs (X, Y), the list L_2(X, Y) which is the list of paths of length 2 that go from X to Y; that's trivial to build since that's the input list you are given.
Then I would build the lists L_3(X, Y), recursively, using the known lists L_2(X, Z) and L_2(Z, Y), looping over Z. For example, for (C, Q), you have to try all Z in L_2(C, Z) and L_2(Z, Q) and in this case Z can only be R and you get L_3(C, Q) = {C -> R -> Q}. For other pairs, you might have an empty L_3(X, Y), or there could be many paths of length 3 from X to Y.
However you have to be careful here when building the paths here since some of them must be rejected because they have cycles. If a path has twice the same node, it is rejected.
Then you build L_4(X, Y) for all pairs by combining all paths L_2(X, Z) and L_3(Z, Y) while looping over all possible values for Z. You still remove paths with cycles.
And so on... until you get to L_17576(X, Y).
One worry with this method is that you might run out of memory to store those lists. Note however that after having computed the L_4's, you can get rid of the L_3's, etc. Of course you don't want to delete L_3(A, Z) since those paths are valid paths from A to Z.
Implementation detail: you could put L_3(X, Y) in a 17576 x 17576 array, where the element at (X, Y) is is some structure that stores all paths between (X, Y). However if most elements are empty (no paths), you could use instead a HashMap<Pair, Set<Path>>, where Pair is just some object that stores (X, Y). It's not clear to me if most elements of L_3(X, Y) are empty, and if so, if it is also the case for L_4334(X, Y).
Thanks to #Lie Ryan for pointing out this identical question on mathoverflow. My solution is basically the one by MRA; Huang claims it's not valid, but by removing the paths with duplicate nodes, I think my solution is fine.
I guess my solution needs less computations than the brute force approach, however it requires more memory. So much so that I'm not even sure it is possible on a computer with a reasonable amount of memory.

Categories