I have a directed network model made up of a set of nodes connected by links, which grows with each model iteration. To find the "average shortest path" in the final model iteration, I have implemented Dijkstra's algorithm to calculate the shortest path from every node to every other node. More specifically, the algorithm calculates the shortest path from each of the network's 3,000 nodes to all other nodes (if a path exists), roughly 9,000,000 path lengths, and then finds the average path length. When I try this, I run out of memory. I can get an average path length for up to about 500 nodes, where roughly 250,000 path lengths are calculated in under 12 h. My question is: is there any way to improve the code to make it more efficient? Or is it not feasible to calculate that many paths?
Code below... the algorithm itself is adapted from Vogella http://www.vogella.com/tutorials/JavaAlgorithmsDijkstra/article.html
Nodes in the network represent trees, and edges (links) represent nets.
Dijkstra Algorithm
package network;
imports...
public class DijkstraAlgorithm {
private Context<Object> context;
private Geography<Object> geography;
private int id;
List<Tree> vertices = Tree.getVertices();
List<Nets> edges = Nets.getEdges();
private Set<Tree> settledNodes;
private Set<Tree> unSettledNodes;
private Map<Tree, Tree> predecessors;
private Map<Tree, Integer> distance;
public DijkstraAlgorithm(Graph graph) {
// copy the node and link lists from the graph that was passed in
this.vertices = graph.getVertices();
this.edges = graph.getEdges();
}
setters and getters....
public void execute(Tree source){
settledNodes = new HashSet<Tree>();
unSettledNodes = new HashSet<Tree>();
distance = new HashMap<Tree, Integer>();
predecessors = new HashMap<Tree, Tree>();
distance.put(source, 0);
unSettledNodes.add(source);
while (unSettledNodes.size()>0){
Tree node = getMinimum(unSettledNodes);
settledNodes.add(node);
unSettledNodes.remove(node);
findMinimalDistances(node);
}
}
private void findMinimalDistances(Tree node){
List<Tree>adjacentNodes = getNeighbors(node);
for (Tree target: adjacentNodes){
if (getShortestDistance(target)>getShortestDistance(node)+getDistance(node,target)){
distance.put(target, getShortestDistance(node) + getDistance(node, target));
predecessors.put(target, node);
unSettledNodes.add(target);
}
}
}
private int getDistance(Tree node, Tree target){
for (Nets edge: edges){
if (edge.getStartTrees().equals(node) && edge.getEndTrees().equals(target)){
return edge.getId();
}
}
throw new RuntimeException("Should not happen");
}
private List<Tree> getNeighbors(Tree node){
List<Tree> neighbors = new ArrayList<Tree>();
for (Nets edge: edges) {
if(edge.getStartTrees().equals(node) && !isSettled(edge.getEndTrees())){
neighbors.add(edge.getEndTrees());
}
}
return neighbors;
}
private Tree getMinimum(Set<Tree>vertexes){
Tree minimum = null;
for (Tree vertex: vertexes) {
if (minimum == null){
minimum = vertex;
} else {
if (getShortestDistance(vertex)< getShortestDistance(minimum)){
minimum = vertex;
}
}
}
return minimum;
}
private boolean isSettled(Tree vertex){
return settledNodes.contains(vertex);
}
private int getShortestDistance(Tree destination) {
Integer d = distance.get(destination);
if (d == null) {
return Integer.MAX_VALUE;
} else {
return d;
}
}
public LinkedList<Tree> getPath(Tree target){
LinkedList<Tree>path = new LinkedList<Tree>();
Tree step = target;
if(predecessors.get(step)== null){
return null;
}
path.add(step);
while (predecessors.get(step)!=null){
step = predecessors.get(step);
path.add(step);
}
Collections.reverse(path);
return path;
}
}
Graph
package network;
imports...
public class Graph {
private Context<Object> context;
private Geography<Object> geography;
private int id;
List<Tree> vertices = new ArrayList<>();
List<Nets> edges = new ArrayList<>();
List <Integer> intermediateNodes = new ArrayList<>();
public Graph(Context context, Geography geography, int id, List vertices, List edges) {
this.context = context;
this.geography = geography;
this.id = id;
this.vertices = vertices;
this.edges = edges;
}
setters... getters...
//updates graph
@ScheduledMethod(start = 1, interval = 1, priority = 1)
public void countNodesAndVertices() {
this.setVertices(Tree.getVertices());
this.setEdges(Nets.getEdges());
}
//run Dijkstra at the 400th iteration of network development
@ScheduledMethod(start = 400, priority = 1)
public void Dijkstra(){
Graph graph2 = new Graph (context, geography, id, vertices, edges);
graph2.setEdges(this.getEdges());
graph2.setVertices(this.getVertices());
for(Tree t: graph2.getVertices()){
}
DijkstraAlgorithm dijkstra = new DijkstraAlgorithm(graph2);
//create list of pathlengths (links, not nodes)
List <Double> pathlengths = new ArrayList<>();
//go through all nodes as starting nodes
for (int i = 0; i<vertices.size();i++){
//find the shortest path to all nodes as end nodes
for (int j = 0; j<vertices.size();j++){
if(i != j){
Tree startTree = vertices.get(i);
Tree endTree = vertices.get(j);
dijkstra.execute(vertices.get(i));
//create a list that contains the path of nodes
LinkedList<Tree> path = dijkstra.getPath(vertices.get(j));
//if the path is not null and greater than 0
if (path != null && path.size()>0){
//calculate the pathlength (-1, which is the size of the path length of links)
double listsize = path.size()-1;
//add it to the list
pathlengths.add(listsize);
}
}
}
}
calculateAvgShortestPath(pathlengths);
}
//calculate the average
public void calculateAvgShortestPath(List<Double>pathlengths){
Double sum = 0.0;
for (Double cc: pathlengths){
sum+= cc;
}
Double avgPathLength = sum/pathlengths.size();
System.out.println("The average path length is: " + avgPathLength);
}
}
One quick improvement is to move the line:
dijkstra.execute(vertices.get(i));
up 6 lines (so it is in the i loop, but not the j loop).
This should reduce the runtime by a factor equal to the number of nodes (i.e. roughly 3,000 times faster here).
It should still give identical results because Dijkstra's algorithm calculates the shortest path from the start node to ALL destination nodes, so there is no need to rerun it for each pair of start/end.
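A minimal sketch of that restructuring, under some assumptions: the network is held as adjacency lists indexed by node number (the class `AveragePathLength` and the `adj` structure are illustrative and not part of the original model), and path length is counted in links, as in the question. Because only link counts are needed, the single-source step here is a breadth-first search rather than Dijkstra's algorithm. The sketch also keeps a running sum and count instead of a list of every path length.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class AveragePathLength {

    /** Link counts from source to every node; -1 marks unreachable nodes. */
    static int[] hopsFrom(int source, List<List<Integer>> adj) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[source] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.poll();
            for (int v : adj.get(u)) {
                if (dist[v] == -1) {          // first time v is reached
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    /** Average over all ordered pairs (i, j), i != j, for which a path exists. */
    static double averagePathLength(List<List<Integer>> adj) {
        long sum = 0;
        long count = 0;
        for (int i = 0; i < adj.size(); i++) {
            int[] dist = hopsFrom(i, adj);    // one search per source, not per pair
            for (int j = 0; j < adj.size(); j++) {
                if (i != j && dist[j] > 0) {
                    sum += dist[j];
                    count++;
                }
            }
        }
        return count == 0 ? 0.0 : (double) sum / count;
    }

    public static void main(String[] args) {
        // Directed chain 0 -> 1 -> 2: path lengths 1, 2 and 1 over 3 connected pairs.
        List<List<Integer>> adj = List.of(List.of(1), List.of(2), List.<Integer>of());
        System.out.println(averagePathLength(adj));
    }
}
```

With this layout the memory in use at any time is one distance array per source, so the 9,000,000 path lengths never have to be held at once.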
There are several optimizations you could make. For instance, using a Fibonacci heap (or even a standard Java priority queue) would definitely speed things up. However, the memory issue will likely persist for a dataset that large regardless. The only real way to deal with a dataset that big is to use a distributed implementation. I believe there is a shortest-path implementation in the Spark GraphX library that you could use.
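As a rough illustration of the priority-queue suggestion, here is a sketch using `java.util.PriorityQueue`. It assumes non-negative integer weights and adjacency lists of `{target, weight}` pairs; the class name `HeapDijkstra` and this data layout are hypothetical. Since `PriorityQueue` has no decrease-key operation, the sketch inserts a fresh entry whenever a distance improves and skips stale entries when they are polled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class HeapDijkstra {

    /**
     * Shortest distances from source. adj.get(u) holds {target, weight} pairs;
     * weights are assumed non-negative. Integer.MAX_VALUE marks unreachable nodes.
     */
    static int[] shortestDistances(int source, List<List<int[]>> adj) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        // Queue entries are {node, distance at the time of insertion}.
        PriorityQueue<int[]> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        queue.add(new int[] {source, 0});
        while (!queue.isEmpty()) {
            int[] entry = queue.poll();
            int u = entry[0];
            if (entry[1] > dist[u]) {
                continue;                     // stale entry, a shorter one was already handled
            }
            for (int[] edge : adj.get(u)) {
                int v = edge[0];
                int candidate = dist[u] + edge[1];
                if (candidate < dist[v]) {
                    dist[v] = candidate;
                    queue.add(new int[] {v, candidate});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // Edges: 0 -> 1 (4), 0 -> 2 (1), 2 -> 1 (2), 1 -> 3 (5)
        List<List<int[]>> adj = List.of(
            List.of(new int[] {1, 4}, new int[] {2, 1}),
            List.of(new int[] {3, 5}),
            List.of(new int[] {1, 2}),
            List.<int[]>of());
        System.out.println(Arrays.toString(shortestDistances(0, adj))); // prints [0, 3, 1, 8]
    }
}
```

Compared with scanning every unsettled node and every edge on each step, as the code in the question does, this looks only at the outgoing edges of the node being settled.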
I am using this exact code and have modified it a little. So far I have added start and end node indexes to the calculateShortestDistances() method, and a path ArrayList for collecting the path node indexes. Also: I am new to Java...
How do I collect the indexes of the nodes on the path in the path ArrayList?
I am stuck to the point that I am not even sure this code can do what I want. I only have intuition on my side and little time.
What I tried:
Adding the nextNode value to the list, then removing it if it was not a shorter distance.
Adding the neighbourIndex to the list, then removing it if it was not a shorter distance.
I made a Path.java with an ArrayList (a class with a public variable named path), but that went nowhere.
Main.java:
public class Main {
public static void main(String[] args) {
Edge[] edges = {
new Edge(0, 2, 1), new Edge(0, 3, 4), new Edge(0, 4, 2),
new Edge(0, 1, 3), new Edge(1, 3, 2), new Edge(1, 4, 3),
new Edge(1, 5, 1), new Edge(2, 4, 1), new Edge(3, 5, 4),
new Edge(4, 5, 2), new Edge(4, 6, 7), new Edge(4, 7, 2),
new Edge(5, 6, 4), new Edge(6, 7, 5)
};
Graph g = new Graph(edges);
g.calculateShortestDistances(4,6);
g.printResult(); // let's try it !
System.out.println(g.path);
}
}
Graph.java:
This is the Graph.java file. Here I added the sAt and eAt variables, so I can tell it which path I am after. I also created a public path ArrayList, where I intend to collect the path.
import java.util.ArrayList;
// now we must create graph object and implement dijkstra algorithm
public class Graph {
private Node[] nodes;
private int noOfNodes;
private Edge[] edges;
private int noOfEdges;
private int sAt;
private int eAt;
public ArrayList<Integer> path = new ArrayList<>();
public Graph(Edge[] edges) {
this.edges = edges;
// create all nodes ready to be updated with the edges
this.noOfNodes = calculateNoOfNodes(edges);
this.nodes = new Node[this.noOfNodes];
for (int n = 0; n < this.noOfNodes; n++) {
this.nodes[n] = new Node();
}
// add all the edges to the nodes, each edge added to two nodes (to and from)
this.noOfEdges = edges.length;
for (int edgeToAdd = 0; edgeToAdd < this.noOfEdges; edgeToAdd++) {
this.nodes[edges[edgeToAdd].getFromNodeIndex()].getEdges().add(edges[edgeToAdd]);
this.nodes[edges[edgeToAdd].getToNodeIndex()].getEdges().add(edges[edgeToAdd]);
}
}
private int calculateNoOfNodes(Edge[] edges) {
int noOfNodes = 0;
for (Edge e : edges) {
if (e.getToNodeIndex() > noOfNodes)
noOfNodes = e.getToNodeIndex();
if (e.getFromNodeIndex() > noOfNodes)
noOfNodes = e.getFromNodeIndex();
}
noOfNodes++;
return noOfNodes;
}
public void calculateShortestDistances(int startAt, int endAt) {
// startAt is the source
this.sAt = startAt;
this.eAt = endAt;
this.nodes[startAt].setDistanceFromSource(0);
int nextNode = startAt;
// visit every node
for (int i = 0; i < this.nodes.length; i++) {
// loop around the edges of current node
ArrayList<Edge> currentNodeEdges = this.nodes[nextNode].getEdges();
for (int joinedEdge = 0; joinedEdge < currentNodeEdges.size(); joinedEdge++) {
int neighbourIndex = currentNodeEdges.get(joinedEdge).getNeighbourIndex(nextNode);
// only if not visited
if (!this.nodes[neighbourIndex].isVisited()) {
int tentative = this.nodes[nextNode].getDistanceFromSource() + currentNodeEdges.get(joinedEdge).getLength();
if (tentative < nodes[neighbourIndex].getDistanceFromSource()) {
nodes[neighbourIndex].setDistanceFromSource(tentative);
}
}
}
// all neighbours checked so node visited
nodes[nextNode].setVisited(true);
// next node must be with shortest distance
nextNode = getNodeShortestDistanced();
}
}
// now we're going to implement this method in next part !
private int getNodeShortestDistanced() {
int storedNodeIndex = 0;
int storedDist = Integer.MAX_VALUE;
for (int i = 0; i < this.nodes.length; i++) {
int currentDist = this.nodes[i].getDistanceFromSource();
if (!this.nodes[i].isVisited() && currentDist < storedDist) {
storedDist = currentDist;
storedNodeIndex = i;
}
}
return storedNodeIndex;
}
// display result
public void printResult() {
String output = "Number of nodes = " + this.noOfNodes;
output += "\nNumber of edges = " + this.noOfEdges;
output += "\nDistance from "+sAt+" to "+eAt+":" + nodes[eAt].getDistanceFromSource();
System.out.println(output);
}
public Node[] getNodes() {
return nodes;
}
public int getNoOfNodes() {
return noOfNodes;
}
public Edge[] getEdges() {
return edges;
}
public int getNoOfEdges() {
return noOfEdges;
}
}
Additionally, here are the Edge.java and Node.java classes.
Node.java:
import java.util.ArrayList;
public class Node {
private int distanceFromSource = Integer.MAX_VALUE;
private boolean visited;
private ArrayList<Edge> edges = new ArrayList<Edge>(); // now we must create edges
public int getDistanceFromSource() {
return distanceFromSource;
}
public void setDistanceFromSource(int distanceFromSource) {
this.distanceFromSource = distanceFromSource;
}
public boolean isVisited() {
return visited;
}
public void setVisited(boolean visited) {
this.visited = visited;
}
public ArrayList<Edge> getEdges() {
return edges;
}
public void setEdges(ArrayList<Edge> edges) {
this.edges = edges;
}
}
Edge.java
public class Edge {
private int fromNodeIndex;
private int toNodeIndex;
private int length;
public Edge(int fromNodeIndex, int toNodeIndex, int length) {
this.fromNodeIndex = fromNodeIndex;
this.toNodeIndex = toNodeIndex;
this.length = length;
}
public int getFromNodeIndex() {
return fromNodeIndex;
}
public int getToNodeIndex() {
return toNodeIndex;
}
public int getLength() {
return length;
}
// determines the neighbouring node of a supplied node, based on the two nodes connected by this edge
public int getNeighbourIndex(int nodeIndex) {
if (this.fromNodeIndex == nodeIndex) {
return this.toNodeIndex;
} else {
return this.fromNodeIndex;
}
}
}
I know it looks like homework. Trust me, it isn't. On the other hand, I don't have much time to finish it, which is why I am working on it on a Sunday. I am also aware of how Dijkstra's algorithm works: I understand the concept and I can do it on paper. But collecting the path is beyond me.
Thanks to the comments from Christian H. Kuhn and second, I managed to come up with the code.
I modified it as follows (I only put in the relevant parts).
Node.java
Here I added setPredecessor(int predecessor) and getPredecessor() methods to set and get the value of the private variable predecessor (so I follow the original code's style too).
[...]
private int predecessor;
[...]
public int getPredecessor(){
return predecessor;
}
public void setPredecessor(int predecessor){
this.predecessor = predecessor;
}
[...]
Graph.java
Here I created the calculatePath() and getPath() methods. calculatePath() does what the commenters told me to do. getPath() returns the ArrayList for others to use.
[...]
private int sAt;
private int eAt;
private ArrayList<Integer> path = new ArrayList<Integer>();
[...]
public void calculateShortestDistances(int startAt, int endAt) {
[...]
if (tentative < nodes[neighbourIndex].getDistanceFromSource()) {
nodes[neighbourIndex].setDistanceFromSource(tentative);
nodes[neighbourIndex].setPredecessor(nextNode);
}
[...]
public void calculatePath(){
int nodeNow = eAt;
while(nodeNow != sAt){
path.add(nodes[nodeNow].getPredecessor());
nodeNow = nodes[nodeNow].getPredecessor();
}
}
public ArrayList<Integer> getPath(){
return path;
}
[...]
Main.java: here I can now do this:
[...]
Graph g = new Graph(edges);
g.calculateShortestDistances(5,8);
g.calculatePath();
String results = "";
ArrayList<Integer> path = g.getPath();
System.out.println(path);
[...]
I know it shows the path backwards, but that is not a problem, as I can always reverse it. The point is: I now have not only the distance from node to node, but also the path through the nodes. Thank you for the help.
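For readers who want the path in start-to-end order with both endpoints included, here is a small standalone sketch of the same predecessor walk. It assumes the predecessors are available as an `int[]` indexed by node, with `-1` meaning "no predecessor"; that sentinel and the class name `PathFromPredecessors` are assumptions for illustration and differ from the Node class above, where the field defaults to 0.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PathFromPredecessors {

    /**
     * Walks the predecessor links from end back to start and returns the
     * path in start-to-end order. Returns an empty list if end cannot be
     * traced back to start.
     */
    static List<Integer> buildPath(int[] predecessor, int start, int end) {
        List<Integer> path = new ArrayList<>();
        int node = end;
        while (node != start) {
            path.add(node);
            node = predecessor[node];
            if (node == -1) {
                return new ArrayList<>();    // chain broke before reaching start
            }
        }
        path.add(start);
        Collections.reverse(path);           // collected end-to-start, so flip it
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical result of a run from node 0: 1 and 2 reached from 0,
        // 3 from 1, 4 from 2, and 5 never reached.
        int[] predecessor = {-1, 0, 0, 1, 2, -1};
        System.out.println(buildPath(predecessor, 0, 4)); // prints [0, 2, 4]
        System.out.println(buildPath(predecessor, 0, 5)); // prints []
    }
}
```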
I'm working on Dijkstra's algorithm, and I need to find all possible shortest paths. Dijkstra's algorithm returns only one shortest path; if another path has the same cost, I would like to print it too. I'm out of ideas, please help me.
Thank you.
Here's my code:
public class Dijkstra {
private static final Graph.Edge[] GRAPH = {
new Graph.Edge("a", "b", 7),
new Graph.Edge("a", "c", 9),
new Graph.Edge("a", "f", 14),
new Graph.Edge("b", "c", 10),
new Graph.Edge("b", "d", 13),
new Graph.Edge("c", "d", 11),
new Graph.Edge("c", "f", 2),
new Graph.Edge("d", "e", 6),
new Graph.Edge("e", "f", 9),
};
private static final String START = "a";
private static final String END = "e";
public static void main(String[] args) {
Graph g = new Graph(GRAPH);
g.dijkstra(START);
g.printPath(END);
//g.printAllPaths();
}
}
import java.io.*;
import java.util.*;
class Graph {
private final Map<String, Vertex>
graph; // mapping of vertex names to Vertex objects, built from a set of Edges
/** One edge of the graph (only used by Graph constructor) */
public static class Edge {
public final String v1, v2;
public final int dist;
public Edge(String v1, String v2, int dist) {
this.v1 = v1;
this.v2 = v2;
this.dist = dist;
}
}
/** One vertex of the graph, complete with mappings to neighbouring vertices */
public static class Vertex implements Comparable<Vertex> {
public final String name;
public int dist = Integer.MAX_VALUE; // MAX_VALUE assumed to be infinity
public Vertex previous = null;
public final Map<Vertex, Integer> neighbours = new HashMap<>();
public Vertex(String name) {
this.name = name;
}
private void printPath() {
if (this == this.previous) {
System.out.printf("%s", this.name);
} else if (this.previous == null) {
System.out.printf("%s(unreached)", this.name);
} else {
this.previous.printPath();
System.out.printf(" -> %s(%d)", this.name, this.dist);
}
}
public int compareTo(Vertex other) {
return Integer.compare(dist, other.dist);
}
}
/** Builds a graph from a set of edges */
public Graph(Edge[] edges) {
graph = new HashMap<>(edges.length);
//one pass to find all vertices
for (Edge e : edges) {
if (!graph.containsKey(e.v1)) graph.put(e.v1, new Vertex(e.v1));
if (!graph.containsKey(e.v2)) graph.put(e.v2, new Vertex(e.v2));
}
//another pass to set neighbouring vertices
for (Edge e : edges) {
graph.get(e.v1).neighbours.put(graph.get(e.v2), e.dist);
//graph.get(e.v2).neighbours.put(graph.get(e.v1), e.dist); // also do this for an undirected graph
}
}
/** Runs dijkstra using a specified source vertex */
public void dijkstra(String startName) {
if (!graph.containsKey(startName)) {
System.err.printf("Graph doesn't contain start vertex \"%s\"\n", startName);
return;
}
final Vertex source = graph.get(startName);
NavigableSet<Vertex> q = new TreeSet<>();
// set-up vertices
for (Vertex v : graph.values()) {
v.previous = v == source ? source : null;
v.dist = v == source ? 0 : Integer.MAX_VALUE;
q.add(v);
}
dijkstra(q);
}
/** Implementation of Dijkstra's algorithm using a TreeSet as the priority queue. */
private void dijkstra(final NavigableSet<Vertex> q) {
Vertex u, v;
while (!q.isEmpty()) {
u = q.pollFirst(); // vertex with shortest distance (first iteration will return source)
if (u.dist == Integer.MAX_VALUE)
break; // we can ignore u (and any other remaining vertices) since they are unreachable
//look at distances to each neighbour
for (Map.Entry<Vertex, Integer> a : u.neighbours.entrySet()) {
v = a.getKey(); //the neighbour in this iteration
final int alternateDist = u.dist + a.getValue();
if (alternateDist < v.dist) { // shorter path to neighbour found
q.remove(v);
v.dist = alternateDist;
v.previous = u;
q.add(v);
} else if (alternateDist == v.dist) {
// Here I Would do something
}
}
}
}
/** Prints a path from the source to the specified vertex */
public void printPath(String endName) {
if (!graph.containsKey(endName)) {
System.err.printf("Graph doesn't contain end vertex \"%s\"\n", endName);
return;
}
graph.get(endName).printPath();
System.out.println();
}
/** Prints the path from the source to every vertex (output order is not guaranteed) */
public void printAllPaths() {
for (Vertex v : graph.values()) {
v.printPath();
System.out.println();
}
}
public void printAllPaths2() {
graph.get("e").printPath();
System.out.println();
}
}
Have a look at so-called k-shortest path algorithms. These solve the problem of enumerating the first, second, ..., kth shortest path in a graph. There are several algorithms in the literature; see for example this paper, or Yen's algorithm.
Note that most of these algorithms do not require you to specify k upfront, i.e. you can use them to enumerate shortest paths in increasing order of length and stop when the length has strictly increased.
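If only the paths that tie for the minimum cost are wanted, a narrower alternative to a general k-shortest-path algorithm is to record every predecessor that achieves the minimum distance (the `alternateDist == v.dist` branch in the question) and then enumerate the paths by backtracking. The sketch below illustrates that idea and is not Yen's algorithm. It assumes strictly positive integer weights and adjacency lists of `{target, weight}` pairs over integer node indices; the class name `AllShortestPaths` and this layout are hypothetical. Note that the number of tied paths can grow exponentially with graph size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.PriorityQueue;

public class AllShortestPaths {

    /** Returns every minimum-cost path from source to target as a list of node indices. */
    static List<List<Integer>> allShortestPaths(int source, int target, List<List<int[]>> adj) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        List<List<Integer>> preds = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            preds.add(new ArrayList<>());
        }
        PriorityQueue<int[]> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        queue.add(new int[] {source, 0});
        while (!queue.isEmpty()) {
            int[] entry = queue.poll();
            int u = entry[0];
            if (entry[1] > dist[u]) {
                continue;                     // stale queue entry
            }
            for (int[] edge : adj.get(u)) {
                int v = edge[0];
                int alt = dist[u] + edge[1];
                if (alt < dist[v]) {          // strictly better: forget older predecessors
                    dist[v] = alt;
                    preds.get(v).clear();
                    preds.get(v).add(u);
                    queue.add(new int[] {v, alt});
                } else if (alt == dist[v]) {  // tie: remember this predecessor as well
                    preds.get(v).add(u);
                }
            }
        }
        List<List<Integer>> paths = new ArrayList<>();
        if (dist[target] != Integer.MAX_VALUE) {
            LinkedList<Integer> current = new LinkedList<>();
            current.add(target);
            collect(source, target, preds, current, paths);
        }
        return paths;
    }

    /** Backtracks through the predecessor lists, emitting one path per branch. */
    private static void collect(int source, int node, List<List<Integer>> preds,
                                LinkedList<Integer> current, List<List<Integer>> paths) {
        if (node == source) {
            paths.add(new ArrayList<>(current));
            return;
        }
        for (int p : preds.get(node)) {
            current.addFirst(p);
            collect(source, p, preds, current, paths);
            current.removeFirst();
        }
    }

    public static void main(String[] args) {
        // Two tied routes from 0 to 3: 0 -> 1 -> 3 and 0 -> 2 -> 3, each of cost 2.
        List<List<int[]>> adj = List.of(
            List.of(new int[] {1, 1}, new int[] {2, 1}),
            List.of(new int[] {3, 1}),
            List.of(new int[] {3, 1}),
            List.<int[]>of());
        System.out.println(allShortestPaths(0, 3, adj)); // two paths; their order is not guaranteed
    }
}
```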
I have a recursive DFS visit method that sometimes throws a StackOverflowError. Since the graph is large (around 20,000 vertices), there are many recursive calls, so I tried running with -Xss10M and everything works.
I'd just like to understand why, when I add a System.out.println at the beginning of the method, the method doesn't throw any StackOverflowError even without -Xss10M. How is that possible?
This is the DFS visit method:
private int dfsVisit(Vertex<T> v, int time){
// System.out.println("Hello");
Vertex<T> n;
time++;
v.d = time;
v.color = Vertex.Color.GRAY;
for (Map.Entry<Vertex<T>, Float> a : v.neighbours.entrySet()){
n = a.getKey();
if(n.color == Vertex.Color.WHITE){
n.previous = v;
time = dfsVisit(n, time);
}
}
v.color = Vertex.Color.BLACK;
time++;
v.f = time;
return time;
}
This is the complete code:
import java.io.*;
import java.util.*;
class Graph<T> {
private final Map<T, Vertex<T>> graph;
public static class Edge<T>{
public final T v1, v2;
public final float dist;
public Edge(T v1, T v2, float dist) {
this.v1 = v1;
this.v2 = v2;
this.dist = dist;
}
}
public static class Vertex<T> implements Comparable<Vertex>{ // move the instance variables into the constructor
public enum Color {WHITE, GRAY, BLACK, UNKNOWN};
public final T name;
public float dist;
public Vertex<T> previous;
public final Map<Vertex<T>, Float> neighbours;
public Color color;
public int d, f;
public Vertex(T name) {
this.name = name;
dist = Float.MAX_VALUE;
previous = null;
neighbours = new HashMap<Vertex<T>, Float>(); // adjacency list
color = Color.UNKNOWN;
d = 0;
f = 0;
}
private void printPath() {
if (this == this.previous) {
System.out.print(this.name);
} else if (this.previous == null) {
System.out.print(this.name + " unreached");
} else {
this.previous.printPath();
System.out.print(" -> " + this.name + "(" + this.dist + ")");
}
}
public int compareTo(Vertex other){
if(this.dist == other.dist)
return 0;
else if(this.dist > other.dist)
return 1;
else
return -1;
}
}
// Builds a graph from an array of edges
public Graph(ArrayList<Graph.Edge> edges) {
graph = new HashMap<>(edges.size());
// add vertices
for (Edge<T> e : edges) {
if (!graph.containsKey(e.v1)) graph.put(e.v1, new Vertex<>(e.v1));
if (!graph.containsKey(e.v2)) graph.put(e.v2, new Vertex<>(e.v2));
}
// create adjacency list
for (Edge<T> e : edges) {
graph.get(e.v1).neighbours.put(graph.get(e.v2), e.dist);
graph.get(e.v2).neighbours.put(graph.get(e.v1), e.dist);
}
}
public void dijkstra(T startName) {
if (!graph.containsKey(startName)) {
System.err.println("Graph doesn't contain start vertex " + startName);
return;
}
final Vertex<T> source = graph.get(startName);
NavigableSet<Vertex<T>> q = new TreeSet<>(); // priority queue
// set-up vertices
for (Vertex<T> v : graph.values()) {
v.previous = v == source ? source : null;
v.dist = v == source ? 0 : Float.MAX_VALUE;
q.add(v);
}
dijkstra(q);
}
private void dijkstra(final NavigableSet<Vertex<T>> q) {
Vertex<T> u, v;
while (!q.isEmpty()) {
u = q.pollFirst();
if (u.dist == Float.MAX_VALUE) break; //???????????
for (Map.Entry<Vertex<T>, Float> a : u.neighbours.entrySet()) {
v = a.getKey();
final float alternateDist = u.dist + a.getValue();
if (alternateDist < v.dist) {
q.remove(v);
v.dist = alternateDist;
v.previous = u;
q.add(v);
}
}
}
}
public void printPath(T endName) {
if (!graph.containsKey(endName)) {
System.err.println("Graph doesn't contain end vertex " + "\"" + endName + "\"" );
return;
}
graph.get(endName).printPath();
System.out.println();
}
public void printAllPaths() {
for (Vertex<T> v : graph.values()) {
v.printPath();
System.out.println();
}
}
public Vertex<T> getVertex(T key){
if(graph.containsKey(key))
return graph.get(key);
return null;
}
public void printAdjacencyList(){
System.out.println("Adjacency list:");
for(Vertex<T> v : graph.values()){
System.out.print(v.name + ":\t");
for (Map.Entry<Vertex<T>, Float> a : v.neighbours.entrySet()){
System.out.print(a.getKey().name + "(" + a.getValue() + ") | ");
}
System.out.println();
}
}
/*
P.S. I know that if only used to calculate the connected components of the graph, dfs visit
could be written differently but I preferred to write it in a more general way, so that it
can be reused if necessary.
*/
private int dfsVisit(Vertex<T> v, int time){
// System.out.println("Hello");
Vertex<T> n;
time++;
v.d = time;
v.color = Vertex.Color.GRAY;
for (Map.Entry<Vertex<T>, Float> a : v.neighbours.entrySet()){
n = a.getKey();
if(n.color == Vertex.Color.WHITE){
n.previous = v;
time = dfsVisit(n, time);
}
}
v.color = Vertex.Color.BLACK;
time++;
v.f = time;
return time;
}
/*
Print the size of the connected components of the graph
*/
public void connectedComponents(){
for(Vertex<T> v : graph.values()){
v.color = Vertex.Color.WHITE;
v.previous = null;
}
for(Vertex<T> v : graph.values()){
if(v.color == Vertex.Color.WHITE)
System.out.println(dfsVisit(v, 0)/2);
}
}
}
Here's the test class:
import java.io.*;
import java.util.*;
public class Dijkstra {
private static ArrayList<Graph.Edge> a = new ArrayList<Graph.Edge>();
private static final String START = "torino";
private static final String END = "catania";
public static void main(String[] args) {
String fileName = "italian_dist_graph.txt";
try{
Scanner inputStream = new Scanner(new File(fileName));
String record;
while(inputStream.hasNextLine()){
record = inputStream.nextLine();
String[] array = record.split(",");
String from = array[0];
String to = array[1];
float dist = Float.parseFloat(array[2]);
a.add(new Graph.Edge(from, to, dist));
}
inputStream.close();
} catch(FileNotFoundException e){
System.out.println("Unable to find the file "+fileName);
}
Graph<String> g = new Graph<String>(a);
g.dijkstra(START);
g.printPath(END);
//System.out.printf("%f\n", g.getVertex(END).dist/1000.0f);
g.connectedComponents();
}
}
N.B. If you comment out g.dijkstra(START) and g.printPath(END), everything seems to work.
Here's the link to the data set:
https://drive.google.com/open?id=0B7XZY8cd0L_fZVl1aERlRmhQN0k
Some general recommendations:
Your code mixes attributes of vertices that relate to a single run of DFS with attributes that belong to the vertices themselves. This is bad style. It is quite likely to break any more complex algorithm, can produce unexpected behavior, and requires clearing the state after each run to keep the code stable. Instead, keep state that relates to a single run of an algorithm visible only to that function, e.g. store the state in a Map, or use the decorator pattern to create a data structure that provides the additional attributes and has method-local scope. As an example: running your code twice on the same graph (same object) with the same input without clearing all state will lead to a wrong result (1).
In addition, creating an iterative version of DFS isn't exactly hard, so you should give it a try, especially since your graph appears to be pretty large.
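A minimal sketch of both recommendations together, assuming an undirected graph given as a map from each vertex to its neighbour list (the class name `IterativeDfs` and this representation are illustrative, not the asker's `Graph<T>`). The traversal uses an explicit stack of neighbour iterators instead of recursion, and the visited set lives only inside the method, so repeated calls do not interfere with each other.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IterativeDfs {

    /** Sizes of the connected components, found with a non-recursive DFS. */
    static <T> List<Integer> componentSizes(Map<T, List<T>> adj) {
        Set<T> visited = new HashSet<>();     // per-run state, local to this call
        List<Integer> sizes = new ArrayList<>();
        for (T start : adj.keySet()) {
            if (!visited.add(start)) {
                continue;                     // already part of an earlier component
            }
            int size = 1;
            // Each stack frame is the iterator over one vertex's neighbours,
            // which plays the role of a suspended recursive call.
            Deque<Iterator<T>> stack = new ArrayDeque<>();
            stack.push(adj.get(start).iterator());
            while (!stack.isEmpty()) {
                Iterator<T> it = stack.peek();
                if (!it.hasNext()) {
                    stack.pop();              // this vertex is finished
                    continue;
                }
                T next = it.next();
                if (visited.add(next)) {      // first visit: descend into it
                    size++;
                    stack.push(adj.getOrDefault(next, List.<T>of()).iterator());
                }
            }
            sizes.add(size);
        }
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = Map.of(
            "a", List.of("b"),
            "b", List.of("a", "c"),
            "c", List.of("b"),
            "d", List.<String>of());
        System.out.println(componentSizes(adj)); // sizes 3 and 1, in unspecified order
    }
}
```

The explicit stack lives on the heap, so its depth is limited by available memory rather than by the thread stack size that -Xss controls. Discovery and finish times could be recorded at the push and pop points if they are needed.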
As for why your code works (or doesn't) the way it does:
This is hard to tell, since it depends upon quite a lot of factors. You didn't provide full code, so I can't rerun any tests, or verify that everything behaves the way it should. The most likely answers:
Vertex uses the default hash code provided by Object. This leads to an effectively random ordering of the entries in the neighbours map, so the order in which specific paths are traversed differs from run to run. You are therefore traversing the graph along random paths that, especially given the size of your graph, quite likely differ for each run. The reason isn't the System.out.println itself, but the fact that your code generates a different structure (in terms of ordering, not mathematically) each time it runs, plus the coincidence that the runs in which the traversal did not reach the recursion depth needed for a StackOverflowError happened to be the ones compiled with System.out.println.
The Java compiler or the JIT modifies the behavior of the code in an unexpected way. Modern compilers tend to produce quite odd code in their attempts to optimize everything they can get hold of.
I am implementing DBSCAN in Java. I followed the algorithm given on Wikipedia. I think I have it right, but for some reason only one cluster is formed.
The Java code looks like this:
ArrayList<ArrayList<Node>> dbcluster = new ArrayList<>();
ArrayList<Node> points; // this contains my data assume
int min =10, esp =50;
int clustcount =0;
for(int i=0;i<points.size();i++){
Node tempdb = points.get(i);
if(tempdb.visited==false){
tempdb.visited=true;
ArrayList<Node> myNeighbors = getNeigbhors(tempdb,points, esp);
if(myNeighbors.size() < min){
tempdb.noise = true;
}else{
//ArrayList<Node> tempclust = new ArrayList<>();
dbcluster.add(new ArrayList<Node>());
expandCluster(tempdb,points,myNeighbors,dbcluster,esp,min,clustcount);
clustcount++;
}
}
}
public static ArrayList<Node> getNeigbhors(Node p ,ArrayList<Node> data,int esp){
ArrayList<Node> tempReturn = new ArrayList<>();
for(int i=0;i<data.size();i++){
Node temptemp = data.get(i);
if(p.x != temptemp.x && p.y !=temptemp.y){
double distance =Math.sqrt(((p.x - temptemp.x)*(p.x - temptemp.x))+((p.y - temptemp.y)*(p.y - temptemp.y)));
if(distance <=esp){
tempReturn.add(temptemp);
}
}
}
return tempReturn;
}
public static void expandCluster(Node p, ArrayList<Node> data, ArrayList<Node> N, ArrayList<ArrayList<Node>> allCluster,int esp, int min,int clustcount){
//ArrayList<Node> tempSmallClust = new ArrayList<>();
//tempSmallClust.add(p);
allCluster.get(clustcount).add(p);
for(int i=0;i<N.size();i++){
Node tempP = N.get(i);
if(tempP.visited == false){
tempP.visited=true;
ArrayList<Node> tempNewNeighbors = new ArrayList<>();
tempNewNeighbors = getNeigbhors(tempP, data, esp);
if(tempNewNeighbors.size() >= min){
ArrayList<Node> tempN=new ArrayList<>();
tempN=mergeNeighbors(N, tempNewNeighbors);
N = new ArrayList<>();
N=tempN;
}
}
if(!checkInCluster(tempP,allCluster)) {
allCluster.get(clustcount).add(tempP);
}
}
//return tempSmallClust;
}
public static boolean checkInCluster(Node p, ArrayList<ArrayList<Node>> allCluster){
for(int i=0;i<allCluster.size();i++){
ArrayList<Node> tempList = allCluster.get(i);
if(tempList.contains(p)){
return true;
}
}
return false;
}
public static ArrayList<Node> mergeNeighbors(ArrayList<Node> N,ArrayList<Node> NewN){
ArrayList<Node> tmpR = N;
for(int i=0;i<NewN.size();i++){
if(!N.contains(NewN.get(i))){
tmpR.add(NewN.get(i));
}
}
return tmpR;
}
The Node class is simple, with int x, int y, and the booleans noise and visited.
The data is provided here.
Your data is uniformly distributed.
There are no clusters in this data set.
IMHO the result of putting everything into one cluster (if eps is too high and minpts too low) is correct, as is putting everything into noise (if eps and minpts are set appropriately).
So maybe it's not an implementation error; rather, your data is not appropriate.
I'm back with another similar question. I am currently working on a Java program that will check if a graph is 2-colorable, i.e. if it contains no odd cycles (cycles with an odd number of vertices). The entire algorithm is supposed to run in O(V+E) time (V being the number of vertices and E the number of edges in the graph). My current algorithm does a depth-first search, recording all vertices on the path it takes, then looks for a back edge and records which two vertices that edge connects. Next it traces a path from one end of the back edge until it hits the vertex on the other end of the edge, thus retracing the cycle that the back edge completes.
I was under the impression that this kind of traversing could be done in O(V+E) time for all cycles that exist in my graph, but I must be missing something, because my algorithm is running for a ridiculously long time for very large graphs (10k nodes, no idea how many edges).
Is my algorithm completely wrong? And if so, can anyone point me in the right direction for a better way to record these cycles or possibly tell if they have odd numbers of vertices? Thanks for any and all help you guys can give. Code is below if you need it.
Addition: Sorry I forgot, if the graph is not 2-colorable, I need to provide an odd cycle that proves that it is not.
package algorithms311;
import java.util.*;
import java.io.*;
public class CS311 {
public static LinkedList[] DFSIter(Vertex[] v) {
LinkedList[] VOandBE = new LinkedList[2];
VOandBE[0] = new LinkedList();
VOandBE[1] = new LinkedList();
Stack stack = new Stack();
stack.push(v[0]);
v[0].setColor("gray");
while(!stack.empty()) {
Vertex u = (Vertex) stack.peek();
LinkedList adjList = u.getAdjList();
VOandBE[0].add(u.getId());
boolean allVisited = true;
for(int i = 0; i < adjList.size(); i++) {
if(v[(Integer)adjList.get(i)].getColor().equals("white")) {
allVisited = false;
break;
}
else if(v[(Integer)adjList.get(i)].getColor().equals("gray") && u.getPrev() != (Integer)adjList.get(i)) {
int[] edge = new int[2]; //pair of vertices
edge[0] = u.getId(); //from u
edge[1] = (Integer)adjList.get(i); //to v
VOandBE[1].add(edge);
}
}
if(allVisited) {
u.setColor("black");
stack.pop();
}
else {
for(int i = 0; i < adjList.size(); i++) {
if(v[(Integer)adjList.get(i)].getColor().equals("white")) {
stack.push(v[(Integer)adjList.get(i)]);
v[(Integer)adjList.get(i)].setColor("gray");
v[(Integer)adjList.get(i)].setPrev(u.getId());
break;
}
}
}
}
return VOandBE;
}
public static void checkForTwoColor(String g) { //input is a graph formatted as assigned
String graph = g;
try {
// --Read First Line of Input File
// --Find Number of Vertices
FileReader file1 = new FileReader("W:\\Documents\\NetBeansProjects\\algorithms311\\src\\algorithms311\\" + graph);
BufferedReader bReaderNumEdges = new BufferedReader(file1);
String numVertS = bReaderNumEdges.readLine();
int numVert = Integer.parseInt(numVertS);
System.out.println(numVert + " vertices");
// --Make Vertices
Vertex vertex[] = new Vertex[numVert];
for(int k = 0; k <= numVert - 1; k++) {
vertex[k] = new Vertex(k);
}
// --Adj Lists
FileReader file2 = new FileReader("W:\\Documents\\NetBeansProjects\\algorithms311\\src\\algorithms311\\" + graph);
BufferedReader bReaderEdges = new BufferedReader(file2);
bReaderEdges.readLine(); //skip first line, that's how many vertices there are
String edge;
while((edge = bReaderEdges.readLine()) != null) {
StringTokenizer ST = new StringTokenizer(edge);
int vArr[] = new int[2];
for(int j = 0; ST.hasMoreTokens(); j++) {
vArr[j] = Integer.parseInt(ST.nextToken());
}
vertex[vArr[0]-1].addAdj(vArr[1]-1);
vertex[vArr[1]-1].addAdj(vArr[0]-1);
}
LinkedList[] l = new LinkedList[2];
l = DFSIter(vertex);//DFS(vertex);
System.out.println(l[0]);
for(int i = 0; i < l[1].size(); i++) {
int[] j = (int[])l[1].get(i);
System.out.print(" [" + j[0] + ", " + j[1] + "] ");
}
LinkedList oddCycle = new LinkedList();
boolean is2Colorable = true;
//System.out.println("iterate through list of back edges");
for(int i = 0; i < l[1].size(); i++) { //iterate through the list of back edges
//System.out.println(i);
int[] q = (int[])(l[1].get(i)); // q = pair of vertices that make up a back edge
int u = q[0]; // edge (u,v)
int v = q[1];
LinkedList cycle = new LinkedList();
if(l[0].indexOf(u) < l[0].indexOf(v)) { //check if u is before v
for(int z = l[0].indexOf(u); z <= l[0].indexOf(v); z++) { //if it is, look for u first; from u to v
cycle.add(l[0].get(z));
}
}
else if(l[0].indexOf(v) < l[0].indexOf(u)) {
for(int z = l[0].indexOf(v); z <= l[0].indexOf(u); z++) { //if it is, look for u first; from u to v
cycle.add(l[0].get(z));
}
}
if((cycle.size() & 1) != 0) { //if it has an odd cycle, print out the cyclic nodes or write them to a file
is2Colorable = false;
oddCycle = cycle;
break;
}
}
if(!is2Colorable) {
System.out.println("Graph is not 2-colorable, odd cycle exists");
if(oddCycle.size() <= 50) {
System.out.println(oddCycle);
}
else {
try {
BufferedWriter outFile = new BufferedWriter(new FileWriter("W:\\Documents\\NetBeansProjects\\algorithms311\\src\\algorithms311\\" + graph + "OddCycle.txt"));
String cyc = oddCycle.toString();
outFile.write(cyc);
outFile.close();
}
catch (IOException e) {
System.out.println("Could not write file");
}
}
}
}
catch (IOException e) {
System.out.println("Could not open file");
}
System.out.println("Done!");
}
public static void main(String[] args) {
//checkForTwoColor("smallgraph1");
//checkForTwoColor("smallgraph2");
//checkForTwoColor("smallgraph3");
//checkForTwoColor("smallgraph4");
checkForTwoColor("smallgraph5");
//checkForTwoColor("largegraph1");
}
}
Vertex class
package algorithms311;
import java.util.*;
public class Vertex implements Comparable {
public int id;
public LinkedList adjVert = new LinkedList();
public String color = "white";
public int dTime;
public int fTime;
public int prev;
public boolean visited = false;
public Vertex(int idnum) {
id = idnum;
}
public int getId() {
return id;
}
public int compareTo(Object obj) {
Vertex vert = (Vertex) obj;
return id-vert.getId();
}
@Override public String toString(){
return "Vertex # " + id;
}
public void setColor(String newColor) {
color = newColor;
}
public String getColor() {
return color;
}
public void setDTime(int d) {
dTime = d;
}
public void setFTime(int f) {
fTime = f;
}
public int getDTime() {
return dTime;
}
public int getFTime() {
return fTime;
}
public void setPrev(int v) {
prev = v;
}
public int getPrev() {
return prev;
}
public LinkedList getAdjList() {
return adjVert;
}
public void addAdj(int a) { //adds a vertex id to this vertex's adj list
adjVert.add(a);
}
public void visited() {
visited = true;
}
public boolean wasVisited() {
return visited;
}
}
I was under the impression that this kind of traversing could be done in O(V+E) time for all cycles that exist in my graph
A graph may contain many more cycles than O(V+E); the number of cycles can grow exponentially with the number of vertices. If you iterate over all of them, the run time will be long.
Back to your original idea: you could just implement a straightforward algorithm that colors the graph in two colors (mark an arbitrary node black, all its neighbors white, all their neighbors black, and so on; that is a breadth-first search). This does run in O(V+E) time. If you succeed, the graph is 2-colorable. If you fail, it is not.
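To make the coloring idea concrete, here is a minimal sketch of the breadth-first coloring. It assumes the graph is given as adjacency lists indexed by vertex id (similar to the adjacency lists in the question), and the class and method names (TwoColoring, twoColor) are made up for illustration. Colors are stored as 0 and 1 instead of "black" and "white".

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class TwoColoring {
    // Returns a color (0 or 1) for every vertex, or null if the graph
    // is not 2-colorable. adj.get(u) lists the neighbors of vertex u
    // in an undirected graph.
    static int[] twoColor(List<List<Integer>> adj) {
        int n = adj.size();
        int[] color = new int[n];
        Arrays.fill(color, -1);                    // -1 means "not visited yet"
        Queue<Integer> queue = new ArrayDeque<>();
        for (int start = 0; start < n; start++) {  // handle every connected component
            if (color[start] != -1) continue;
            color[start] = 0;
            queue.add(start);
            while (!queue.isEmpty()) {
                int u = queue.remove();
                for (int w : adj.get(u)) {
                    if (color[w] == -1) {
                        color[w] = 1 - color[u];   // neighbor gets the opposite color
                        queue.add(w);
                    } else if (color[w] == color[u]) {
                        return null;               // edge with equal colors at both ends
                    }
                }
            }
        }
        return color;
    }
}
```

Each vertex is enqueued at most once and each adjacency list is scanned once, which is where the O(V+E) bound comes from. An ArrayDeque and an int array are used here so that no step needs a linear search through a list.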
Edit: If you need a cycle that proves the graph is not 2-colorable, just record for each node the vertex from which you traversed into it (its parent). When you happen to traverse from a black vertex A to a black vertex B (thus needing to recolor black B as white, which proves the graph is not 2-colorable), you get the cycle by following the parents back:
X -> Y -> Z -> U -> V -> P -> Q -> A
                    \-> D -> E -> B
Then A-B-E-D-V-P-Q (the two paths up to their common ancestor V, joined by the edge A-B) is the cycle you need.
Note that in this version you don't have to check all cycles; you just output the first cycle in which a non-tree edge has both of its vertices colored the same.
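The parent-tracking idea can be sketched as follows. This is an illustration under assumptions, not the asker's code: the graph is given as adjacency lists indexed by vertex id, and the names OddCycleFinder, findOddCycle and buildCycle are hypothetical. It relies on a property of breadth-first search: two same-colored endpoints of an edge are at the same depth in the BFS tree, so both can walk up to the common ancestor in lockstep.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

class OddCycleFinder {
    // Returns the vertices of one odd cycle in order, or an empty list
    // if the undirected graph is 2-colorable.
    static List<Integer> findOddCycle(List<List<Integer>> adj) {
        int n = adj.size();
        int[] color = new int[n];
        int[] parent = new int[n];
        Arrays.fill(color, -1);    // -1 means "not visited yet"
        Arrays.fill(parent, -1);   // -1 means "root of its BFS tree"
        Queue<Integer> queue = new ArrayDeque<>();
        for (int start = 0; start < n; start++) {
            if (color[start] != -1) continue;
            color[start] = 0;
            queue.add(start);
            while (!queue.isEmpty()) {
                int a = queue.remove();
                for (int b : adj.get(a)) {
                    if (color[b] == -1) {
                        color[b] = 1 - color[a];
                        parent[b] = a;             // remember where we came from
                        queue.add(b);
                    } else if (color[b] == color[a]) {
                        return buildCycle(a, b, parent);
                    }
                }
            }
        }
        return Collections.emptyList();
    }

    // Walks up from both ends of the conflicting edge until the paths meet.
    // Both ends are at the same BFS depth, so one step each per iteration suffices.
    private static List<Integer> buildCycle(int a, int b, int[] parent) {
        List<Integer> fromA = new ArrayList<>();
        List<Integer> fromB = new ArrayList<>();
        while (a != b) {
            fromA.add(a);
            fromB.add(b);
            a = parent[a];
            b = parent[b];
        }
        fromA.add(a);                // the common ancestor
        Collections.reverse(fromB);  // ancestor's side first, conflicting vertex last
        fromA.addAll(fromB);
        return fromA;                // the edge from the last vertex to the first closes the cycle
    }
}
```

The returned list contains k vertices from each side plus the common ancestor, so its length is 2k+1, which is always odd. Only the first conflicting edge is examined, so the total work stays O(V+E).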
You are describing a bipartite graph. A bipartite graph is 2-colorable and contains no odd-length cycles. You can use BFS to determine whether or not a graph is bipartite. Hope this helps.