Which Tree data structure in Java allows querying for different levels of children? I have looked at TreeNode and JTree, but they don't seem to support multi-level querying.
Given a Tree, for a specific node, I want to get the descendants up to a certain level n. Is there an existing implementation that I can use or should I write my own?
Thanks!
It's not that hard to write a breadth-first traversal and visit all the children up to a specified level. Here is a sketch in Java. Assume you have a new class:
public class NodeWithLevel {
    final Node node;
    final int level;
    NodeWithLevel(Node node, int level) {
        this.node = node;
        this.level = level;
    }
}
This class is only a wrapper used for this algorithm.
Then the "get all nodes up to level N" method would be:
// needs java.util.ArrayDeque and java.util.Queue
Queue<NodeWithLevel> queue = new ArrayDeque<>();
queue.add(new NodeWithLevel(tree.root, 0));
while (!queue.isEmpty()) {
    NodeWithLevel current = queue.remove();
    // do whatever with current
    if (current.level < N) {
        // only descend while we are still above level N
        for (Node child : current.node.children) {
            queue.add(new NodeWithLevel(child, current.level + 1));
        }
    }
}
DefaultMutableTreeNode supports several traversals; using one of them to reach your goal is left (no pun intended, it's by the API :) to the user.
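As a minimal sketch of that idea, the breadth-first enumeration of DefaultMutableTreeNode can be combined with getLevel() to stop once nodes get too deep. The class name SwingLevelQuery and the method descendantsUpTo are made up for this illustration; only breadthFirstEnumeration() and getLevel() come from the Swing API.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import javax.swing.tree.DefaultMutableTreeNode;

// Hypothetical helper, not part of Swing.
final class SwingLevelQuery {

    /** Returns the descendants of start that are at most maxDepth levels below it. */
    static List<DefaultMutableTreeNode> descendantsUpTo(DefaultMutableTreeNode start, int maxDepth) {
        List<DefaultMutableTreeNode> result = new ArrayList<>();
        int startLevel = start.getLevel();
        Enumeration<?> nodes = start.breadthFirstEnumeration();
        while (nodes.hasMoreElements()) {
            DefaultMutableTreeNode node = (DefaultMutableTreeNode) nodes.nextElement();
            int depth = node.getLevel() - startLevel;
            if (depth > maxDepth) {
                break; // breadth-first order: every remaining node is at least this deep
            }
            if (depth > 0) { // depth 0 is the start node itself
                result.add(node);
            }
        }
        return result;
    }
}
```

Calling SwingLevelQuery.descendantsUpTo(node, 2) would then return the children and grandchildren of node in breadth-first order.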
If you're not afraid of a complex API, the DOM might be what you need. You can query it through XPath, apply events to its nodes, etc...
The only thing that springs to mind is Swing's DefaultTreeModel, but that would still require a bit of coding on your side for the logic to get children up to a certain level.
It shouldn't be too hard to roll your own implementation.
I'm learning about search algorithms BFS and DFS. I plan to implement both but before I do that, I need to implement my graph structure. Here's my idea:
A graph of connecting cities: Each city is represented by a Node.
Our graph will simply be an ArrayList of Nodes added as they're created, and each Node will have a list of its neighbors and a parent which will let us know where we came from (for path retrieval). I haven't coded anything up yet; I wanted to get some feedback on my ideas before spending the time writing up something that won't work. Here's some pseudocode-ish code. One potential problem I can see is how we're going to deal with Nodes that we can get to from multiple places (multiple parents). If anyone has any suggestions on dealing with that, feel free to share.
public class Node {
    String name;
    Node parent;
    ArrayList<Node> neighbors;
    public void addNeighbor(Node n);
    public void setParent(Node n);
    public ArrayList<Node> getNeighbors();
    ...
}
public static void main(String[] args) {
    ArrayList<Node> graph = new ArrayList<Node>();
    // build node
    Node node = new Node(name);
    // add neighbors
    node.addNeighbor(neighbor1);
    node.addNeighbor(neighbor2);
    // set parent
    node.setParent(parent1);
    // add to graph
    graph.add(node);
    path = dfs(graph, startNode, goalNode);
    System.out.print(path);
}
Edit: I know I could look online and find implementations of this pretty easily, but I'd prefer to come up with my own solutions.
Your implementation looks good. It's the classic implementation of a graph structure (a node with a list of neighbors). Some points:
You can use backtracking to deal with multiple paths that reach the same node. If the dfs method has a recursive implementation, you need to skip the recursive call if the Node already has a parent. But if the new path is better than the old one, discard the old parent and set the new one.
Your implementation is a directed graph. In other words, you can build a graph that has a path from A to B but no path from B to A. I don't know if this is OK for you.
I recommend you encapsulate the building of the graph inside a wrapper that builds both directions automatically with a single method call. That way, you always build bidirectional paths.
You can use a Set to store the neighbors. That way, there are no duplicates. Of course, you need to implement the "equals" method (and "hashCode", if you use a HashSet) in the Node class.
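The points above can be put together in a small sketch. It makes a few assumptions that are not in the question: cities are identified by their name (a String, which already has equals and hashCode), the class is called CityGraph, and the parent of each city is kept in a per-search map rather than in the node, so a city reachable from several places simply keeps the first parent that reached it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical wrapper around the adjacency sets.
final class CityGraph {
    // City name -> names of neighbouring cities; a Set silently ignores duplicates.
    private final Map<String, Set<String>> neighbors = new LinkedHashMap<>();

    /** Adds the edge in both directions with a single call. */
    void connect(String a, String b) {
        neighbors.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        neighbors.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    /** Breadth-first search; returns the path from start to goal, or an empty list. */
    List<String> path(String start, String goal) {
        // cameFrom doubles as the visited set: a city gets a parent only once.
        Map<String, String> cameFrom = new HashMap<>();
        cameFrom.put(start, null);
        Queue<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(goal)) {
                // Walk the parent links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String c = goal; c != null; c = cameFrom.get(c)) {
                    path.add(c);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : neighbors.getOrDefault(current, Collections.emptySet())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

Keeping the parent links outside the nodes also means two searches on the same graph cannot interfere with each other.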
The problem is I don't understand how to create a tree. I have gone through many code examples on trees, but I don't even know how to work with/handle a node, and hence I don't understand how the node class works (the one present in all the program examples). When I try to use methods such as appendChild (as mentioned in the Java docs), I get an error, and I am asked to create such an appendChild method inside that node class within the main program. I couldn't understand why that happened.
I am given integer pairs ((u,v) meaning there is an edge between u and v) of nodes, and I also need to know if any Element-to-node conversion is required for using u and v (of type integer) as nodes.
Please bear with me since my basics are weak. Little explanation on how the entire thing works/functions would be very helpful.
Thank you.
EDIT 1: I went through the following links (I hardly found anything on just unordered trees): http://www.cs.cmu.edu/~adamchik/15-121/lectures/Trees/code/BST.java http://www.newthinktank.com/2013/03/binary-tree-in-java/ . I tried to modify this code to meet my own purpose but failed.
I only got a blurry idea, and that is not enough for an implementation. I am trying to make a simple unordered tree, for which I am given u v pairs like:
(4,5) (5,7) (5,6). I just need to join (4<--5), (5<--7) and (5<--6). So how do I write a node class that only joins one node to the previous node? Besides, to do only this, do I need to bother with leftchild and rightchild? If not, how will I be able to traverse the tree and do operations such as height and diameter calculation later?
Thank you for your patience.
Well, it's not entirely clear whether you want an explanation of tree creation in general, of some tree implementation you have found, or whether you already have some basic code that you cannot get working. You might want to clarify that :).
As for tree creation in general:
Most explanations and implementations you will find might be "overly" elegant :). So try to imagine a simple linked list first. In that list you have nodes (the elements of the list). A node contains some valuable data and a reference to another node object. A very simple tree differs only in that a node has more than one reference to other nodes. For example, it has a Set of references, its "children".
Here is some VERY crude code, only as an example of addChild() (or appendChild in your case):
import java.util.HashSet;
import java.util.Set;

public class Node {
    private String valueableData;
    private Set<Node> children;

    public Node() {
        this.children = new HashSet<Node>();
    }

    public Node(String valueableData) {
        this.valueableData = valueableData;
        this.children = new HashSet<Node>();
    }

    // Attaches the given node as a child of this node.
    public void addChild(Node node) {
        children.add(node);
    }
}
Now this implementation would be quite horrible (I even deleted the setters/getters); also, it would be wise to keep a reference to the root node in some cases, and so on. But you might get the basic idea.
You might want to use a loop or recursion to go over the (u,v) integer pairs. You create the root node first and then addChild() all the other nodes recursively, or create every node first and then addChild() them according to your rules.
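A minimal sketch of the "create every node, then attach it" variant, assuming that in a pair (u,v) the first number is the parent and the second the child, as in the question's (4<--5) example. The names PairTree, nodeFor, addEdge and height are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical builder for an unordered tree given as (parent, child) pairs.
final class PairTree {
    static final class Node {
        final int value;
        final List<Node> children = new ArrayList<>();
        Node(int value) { this.value = value; }
    }

    // value -> node, so each integer maps to exactly one Node object
    private final Map<Integer, Node> nodes = new HashMap<>();

    /** Looks up the node for a value, creating it on first use. */
    private Node nodeFor(int value) {
        return nodes.computeIfAbsent(value, Node::new);
    }

    /** Registers an edge; assumes the first argument is the parent. */
    void addEdge(int parent, int child) {
        nodeFor(parent).children.add(nodeFor(child));
    }

    Node get(int value) {
        return nodes.get(value);
    }

    /** Height counted in edges: a leaf has height 0. */
    static int height(Node node) {
        int tallest = -1;
        for (Node child : node.children) {
            tallest = Math.max(tallest, height(child));
        }
        return tallest + 1;
    }
}
```

No leftchild/rightchild fields are needed: recursing over the children list, as height does, is enough to traverse the tree.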
I have a tree structure that consists of dozens of types of nodes (each type of node inherits from a NodeBase class).
I would like to perform searches on the tree to return a reference to a specific node. For example, suppose there is some Company tree, which contains Department nodes amongst other types of nodes. Department nodes consist of Employee nodes. It is assumed that an employee must be part of a department, and can be in exactly one department.
Currently, it is designed so that each node has a list of child nodes of type NodeBase. A tree can become quite large, with hundreds of thousands of nodes at times. Insertion/deletion operations are seldom used, while search operations should not take "too long" for these big trees.
Suppose I want to get a reference to an employee node whose employee ID field equals some string that I provide. I don't know which department the employee is in, so I'd have to perform a search through all of the nodes hoping to find a match. Not all nodes have an employee ID field; departments, for example, do not have them.
I am not sure what is the best way to implement the search functionality, given this tree structure design.
There are probably better ways to design how the data is stored in the first place (eg: using a database?) but currently I am stuck with a tree.
Data structures are the way you organize your data, and the way you organize data depends on how you actually use those pieces of information.
A tree is the right data structure to answer questions like "get all descendants of node X", but doesn't help you to solve the problem of "find me the object with the property X set to Y" (at least not your tree: you could certainly use a tree internally to keep a sorted index as I explain later).
So I think the best way to solve this is using two separate data structures to organize the data: a tree made of NodeBase objects to reflect the hierarchical relationship among NodeBase's, and a sorted index to make the searches with a decent performance. This will introduce a synchronization problem, though, because you'll have to keep the two data structures in sync when nodes are added/removed. If it doesn't happen too frequently, or simply the search performance is critical, then this may be the right way.
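A rough sketch of the two-structure idea, under several assumptions: the NodeBase, Department and Employee classes below are simplified stand-ins for the real ones, the employee ID is a String as in the question, and nodes are added or removed one leaf at a time through a hypothetical Company wrapper. A HashMap is enough for exact-match lookups; a TreeMap would be the sorted alternative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for the real node classes.
class NodeBase {
    private final List<NodeBase> children = new ArrayList<>();
    List<NodeBase> getChildren() { return children; }
}

class Department extends NodeBase { }

class Employee extends NodeBase {
    private final String id;
    Employee(String id) { this.id = id; }
    String getId() { return id; }
}

// Hypothetical wrapper that owns both the tree and the index.
final class Company {
    private final NodeBase root = new NodeBase();
    private final Map<String, Employee> employeesById = new HashMap<>();

    NodeBase getRoot() { return root; }

    /** Every insertion goes through here so the tree and the index stay in sync. */
    void add(NodeBase parent, NodeBase child) {
        parent.getChildren().add(child);
        if (child instanceof Employee) {
            Employee e = (Employee) child;
            employeesById.put(e.getId(), e);
        }
    }

    /** Removes a leaf node from the tree and, if it is an employee, from the index. */
    void remove(NodeBase parent, NodeBase child) {
        if (parent.getChildren().remove(child) && child instanceof Employee) {
            employeesById.remove(((Employee) child).getId());
        }
    }

    /** Constant-time lookup instead of a full tree traversal. */
    Employee findEmployee(String id) {
        return employeesById.get(id);
    }
}
```

The synchronization cost mentioned above shows up in add and remove: code that modifies a children list directly, bypassing the wrapper, would leave the index stale.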
Assuming that your tree is a DAG (directed acyclic graph), use DFS or BFS, for example. Here's a simple BFS:
// needs java.util.LinkedList and java.util.Queue
public NodeBase findEmployee(NodeBase root, Integer employeeId) {
    Queue<NodeBase> q = new LinkedList<NodeBase>();
    q.add(root);
    while (!q.isEmpty()) {
        NodeBase node = q.poll();
        if (node instanceof Employee) {
            if (((Employee) node).getId().equals(employeeId))
                return node;
        }
        for (NodeBase child : node.getChildren())
            q.add(child);
    }
    return null; // no employee with this ID in the tree
}
EDIT: Visitor pattern
Or as Brabster suggested, you can use a visitor pattern. A NodeBase should implement an accept(IVisitor visitor) method:
public class NodeBase {
    // your code
    public void accept(IVisitor visitor) {
        visitor.visit(this);
        for (NodeBase node : getChildren()) {
            node.accept(visitor);
        }
    }
}
IVisitor is just an interface:
public interface IVisitor {
    public void visit(NodeBase node);
}
And you need a proper implementation that will do the search:
public class SearchVisitor implements IVisitor {
    private Integer searchId;

    public SearchVisitor(Integer searchId) {
        this.searchId = searchId;
    }

    @Override
    public void visit(NodeBase node) {
        if (node instanceof Employee) {
            if (((Employee) node).getId().equals(searchId)) {
                System.out.println("Found the node " + node.toString() + "!");
            }
        }
    }
}
And now, you just simply call it:
NodeBase root= getRoot();
root.accept(new SearchVisitor(getSearchId()));
It looks like there are two parts to this question -- decomposition of class hierarchies and the implementation of the search algorithm.
In the Java world there are two possible solutions to the problem of decomposition:
Object oriented decomposition, which has a local nature, and
Type checking decomposition using instanceof and type casting.
Functional languages (including Scala) offer pattern matching, which is really a better approach to implement the type checking decomposition.
Due to the fact that there is a need to work with a data structure (tree) where elements (nodes) can be of varying types, the nature of the decomposition is definitely not local. Thus, the second approach is really the only option.
The search itself can be implemented using, for example, a binary search tree. Such a tree would need to be constructed out of your data, and the decision of where to place a certain node would depend on the actual search criterion. Basically, this means you'd need as many trees as there are different search criteria, which is in essence a way to build indexes. Database engines use more sophisticated structures than a plain binary search tree, typically B-trees and their variants, but the idea is very similar.
BTW the binary search tree would have a homogeneous nature. For example, if the search pertains to Employee by Department, then the search tree would consist only of nodes associated with Employee instances. This removes the decomposition problem.
I'm creating an API that encapsulates JPA objects with additional properties and helpers. I do not want the users to access the database, because I have to provide certain querying functionality for the consumers of the API.
I have the following:
Node1(w/ attributes) -- > Edge1(w/ attr.) -- > Node2(w/ attr.)
and
Node1(w/ attributes) -- > |
Node2(w/ attributes) -- > | -- > HyperEdge1(w/ attr.)
Node3(w/ attributes) -- > |
Basically a Node can be of a certain type, which would dictate the kind of attributes available. So I need to be able to query these "paths" depending on different types and attributes.
For example: Start from a Node, and find a path typeA > typeB & attr1 > typeC.
So I need to do something simple, and be able to write the query as a string, or maybe a builder pattern style.
What I have so far, is a visitor pattern set up to traverse the Nodes/Edges/HyperEdges, and this allows for a sort of querying, but it's not very simple, since you have to create a new visitor for new types of queries.
This is my implementation so far:
ConditionImpl hasMass = ConditionFactory.createHasMass( 2.5 );
ConditionImpl noAttributes = ConditionFactory.createNoAttributes();
List<ConditionImpl> conditions = new ArrayList<ConditionImpl>();
conditions.add( hasMass );
conditions.add( noAttributes );
ConditionVisitor conditionVisitor = new ConditionVisitor( conditions );
node.accept( conditionVisitor );
List<Set<Node>> validPaths = conditionVisitor.getValidPaths();
The code above runs a query that checks whether the starting node has a mass of 2.5 and a linked node (child) has no attributes. The visitor calls condition.check( Node ), which returns a boolean.
Where do I start with creating a querying language for a graph that is simpler?
Note: I do not have the option of using an existing graph library, and I will have hundreds of thousands of nodes, plus the edges.
Personally, I like the idea of the visitor pattern; however, it might turn out to be too expensive to visit all nodes.
Query Interface: If users / other developers are using it, I would use a builder style interface, with readable method names:
Visitor v = QueryBuilder
    .selectNodes(ConditionFactory.hasMass(2.5))
    .withChildren(ConditionFactory.noAttributes())
    .buildVisitor();
node.accept(v);
List<Set<Node>> validPaths = v.getValidPaths();
As pointed out above, this is more or less just syntactic sugar for what you already have (but sugar makes all the difference). I would separate the code for "moving on the graph" (like "check whether visited node fulfills condition" or "check whether connected nodes fulfill condition") from the code that actually checks (or is) a condition. Also, use composites on conditions to build and/or:
// Select nodes with mass 2.5, follow edges with both conditions fulfilled and check that the children on these edges have no attributes.
Visitor v = QueryBuilder
    .selectNodes(ConditionFactory.hasMass(2.5))
    .withEdges(ConditionFactory.and(ConditionFactory.freestyle("att1 > 12"), ConditionFactory.freestyle("att2 > 23")))
    .withChildren(ConditionFactory.noAttributes())
    .buildVisitor();
(I used "freestyle" for lack of creativity right now, but its intention should be clear.) Note that in general this might be split into two different interfaces so that strange queries cannot be built.
public interface QueryBuilder {
    QuerySelector selectNodes(Condition c);
    QuerySelector allNodes();
}
public interface QuerySelector {
    QuerySelector withEdges(Condition c);
    QuerySelector withChildren(Condition c);
    QuerySelector withHyperChildren(Condition c);
    // ...
    QuerySelector and(QuerySelector... selectors);
    QuerySelector or(QuerySelector... selectors);
    Visitor buildVisitor();
}
Using this kind of syntactic sugar makes the queries readable from the source code without forcing you to implement your own data query language. The QuerySelector implementations would then be responsible for "moving" around the visited nodes, whereas the Condition implementations would check whether the conditions match.
The clear downside of this approach is that you need to foresee most of the queries in the interfaces and implement them up front.
Scalability with number of nodes: You might need to add some kind of index to speed up finding "interesting" nodes. One idea that comes to mind is to add (for each index) a layer to the graph in which each node models one of the different attribute settings for the "indexed variable". The normal edges could then connect these index nodes with the nodes in the original graph. The hyper edges on the index could then build a network which is smaller to search on. Of course, there is still the boring way of storing the index in a map-like structure with an attributeValue -> node mapping, which is probably much more performant than the idea above anyway.
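To illustrate the condition side of that split, here is a small sketch of composable conditions. Everything in it is an assumption made for the example: AttrNode is a stand-in node that only carries numeric attributes, and the Conditions factory with hasMass, noAttributes, greaterThan, and, or is a hypothetical counterpart to the ConditionFactory used above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a node: just a bag of numeric attributes. */
final class AttrNode {
    final Map<String, Double> attributes = new HashMap<>();

    AttrNode with(String name, double value) {
        attributes.put(name, value);
        return this;
    }
}

/** A condition only decides whether one node matches; it knows nothing about traversal. */
interface Condition {
    boolean check(AttrNode node);
}

final class Conditions {
    private Conditions() { }

    static Condition hasMass(double mass) {
        return node -> Double.valueOf(mass).equals(node.attributes.get("mass"));
    }

    static Condition noAttributes() {
        return node -> node.attributes.isEmpty();
    }

    static Condition greaterThan(String name, double bound) {
        return node -> {
            Double value = node.attributes.get(name);
            return value != null && value > bound;
        };
    }

    /** Composite: matches when every part matches. */
    static Condition and(Condition... parts) {
        return node -> Arrays.stream(parts).allMatch(p -> p.check(node));
    }

    /** Composite: matches when at least one part matches. */
    static Condition or(Condition... parts) {
        return node -> Arrays.stream(parts).anyMatch(p -> p.check(node));
    }
}
```

Because and and or return a Condition themselves, they can be nested freely, and the selector code never needs to know whether it holds a simple or a composite condition.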
If you have some kind of Index make sure that the index can as well receive a visitor such that it does not have to visit all nodes in the graph.
It sounds like you have all the pieces except some syntactic sugar.
How about an immutable style where you create the whole list above like
Visitor v = Visitor.empty
    .hasMass(2.5)
    .edge()
    .node()
    .hasNoAttributes();
You can create any kind of linear query pattern using this style; and if you add some extra state you could even do branching queries, e.g. by calling setName("A") and later .node("A") to return to that point of the query.
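A minimal sketch of the immutable style, simplified to one predicate per node along a path (edges and named branch points are left out). PathQuery, then and matches are hypothetical names chosen for this example, not part of any existing API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** Immutable linear query: one predicate per step along a path. */
final class PathQuery<N> {
    private final List<Predicate<N>> steps;

    private PathQuery(List<Predicate<N>> steps) {
        this.steps = steps;
    }

    static <N> PathQuery<N> empty() {
        return new PathQuery<>(Collections.emptyList());
    }

    /** Returns a new query with one more step; this query is left untouched. */
    PathQuery<N> then(Predicate<N> step) {
        List<Predicate<N>> extended = new ArrayList<>(steps);
        extended.add(step);
        return new PathQuery<>(Collections.unmodifiableList(extended));
    }

    int length() {
        return steps.size();
    }

    /** True if the path has exactly one node per step and each node satisfies its step. */
    boolean matches(List<N> path) {
        if (path.size() != steps.size()) {
            return false;
        }
        for (int i = 0; i < steps.size(); i++) {
            if (!steps.get(i).test(path.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Since every call to then returns a fresh object, a common prefix can be stored once and extended in different directions without the variants affecting each other.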
In a red-black tree, when you rotate, you need to know the parent of a particular node.
However, the node only has references to its right and left children.
I was thinking of giving a node an instance variable "parent", but just for this reason I don't think it is worth doing, and it would also be too complicated to update the parent reference on every rotation.
public class Node {
    private Node left;
    private Node right;
    private boolean isRed;
    private Node parent; // I don't think this is a good idea
}
So, my solution is to write a findParent() method that uses search to find the parent. I am wondering if there is any other way to find a node's parent?
My solution:
sample tree:
    50
   /  \
  25   75
If you want to find parent of Node 25, you pass something like:
Node parent = findParent(Node25.value);
and it returns node50.
protected Node findParent(int val) {
    if (root == null) {
        return null;
    }
    Node current = root;
    Node parent = current;
    while (true) {
        if (current == null) { // not found; checked first so current.val is never read on null
            return null;
        }
        if (current.val == val) { // found
            return parent;
        }
        parent = current;
        if (current.val > val) { // go left
            current = current.left;
        } else { // go right
            current = current.right;
        }
    }
}
The use of a parent pointer is optional. If you forgo the parent pointer then you will have to write insert/delete operations using recursion (the recursive method calls preserve the parent information on the stack) or write an iterative version which maintains its own stack of parents as it moves down the tree.
A very good description of red-black trees can be found here
http://adtinfo.org/
That includes descriptions of a number of rbtree implementations including with and without parent pointers.
If you do want to save on space (and that is fair enough) a really excellent description of an rbtree implementation can be found here
http://www.eternallyconfuzzled.com/tuts/datastructures/jsw_tut_rbtree.aspx
The method you have described for searching for a node's parent would be very inefficient if used by the insert/delete implementations. Use a pointer or use recursion.
I was thinking to give a node instance variable "parent" but just for this reason I don't think it is worth doing so
Having your nodes have a parent reference requires one extra pointer/reference per node. Compare this with needing to traverse the tree whenever you need to know the parent for a given node.
This is then a trade-off between
The cost of maintaining an extra reference, and keeping it up to date whenever you modify a node.
The computational cost and complexity of having to traverse the tree to find a parent of a given node
I think that the choice between these two options is somewhat subjective but personally I would choose to simply keep track of the parent references.
As a point of reference for you, java.util.TreeMap is implemented as a red-black tree with Entry nodes that contain left, right, and parent references.
As you traverse the tree to get to your pivot node you can cache the previous parent or if you need more than one level of "undo" you could cache each traversed node on to a stack.
This cache would be a variable local to your rotation algorithm so it wouldn't require any more space in the tree or expensive additional traversals.
It's definitely better to store the parent than to look it up. Updating parent reference is not that complex.
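To give a feel for how much bookkeeping is involved, here is a sketch of a left rotation on nodes that carry a parent reference. The class and field names are assumptions for the example, and colours are omitted because a rotation does not touch them. Three parent links change: the subtree that moves, the node that moves up, and the node that moves down.

```java
// Hypothetical tree with parent pointers; colours omitted.
final class ParentPointerTree {
    static final class Node {
        final int val;
        Node left, right, parent;
        Node(int val) { this.val = val; }
    }

    Node root;

    /** Left-rotates around x; x.right must not be null. */
    void rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;          // y's left subtree becomes x's right subtree
        if (y.left != null) {
            y.left.parent = x;
        }
        y.parent = x.parent;       // y takes x's place under x's old parent
        if (x.parent == null) {
            root = y;
        } else if (x == x.parent.left) {
            x.parent.left = y;
        } else {
            x.parent.right = y;
        }
        y.left = x;                // x becomes y's left child
        x.parent = y;
    }
}
```

The right rotation is the mirror image, with left and right swapped throughout.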
Another solution, besides parent pointers and querying the parent all over again, is to maintain an ancestor stack.
Suppose someone wishes to insert 23 into the following tree:
[Figure: Red Black Tree]
Generally the algorithm to insert is:
Find node where 23 would be if it is in the tree
If 23 is already there, return failure
If 23 is not already there, put it there.
Run your re-balancing/coloring routine as needed.
Now, to use the stack approach, you allocate a stack big enough to hold one node per level of your tree (I think 2 * Ceiling(Log2(count)) + 2 entries should have you covered). You could even keep one stack allocated for insertions and deletions and just clear it whenever you start an operation.
So: look at the root. Push it onto the stack. 23 is greater than the value in the root, so go right. Now push the current node (value 21) onto the stack. If 23 is in the tree, it must be to the right of the current node. But the node to the right of the current node is a null-sentinel. Thus, that null-sentinel should be replaced with a node holding your value. The parent is the item on the top of the stack (most recently pushed), the grandparent is next in line, and so on. Since you seem to be learning: Java supplies stack implementations for you (java.util.ArrayDeque through the Deque interface, or the older java.util.Stack class), so you won't need to develop your own stack to do this. Just use theirs.
As to whether this is better than the parent pointer approach, that seems debatable to me -- I would lean to the parent pointer approach for simplicity and elimination of the need to maintain an ancillary data structure or use recursion extensively. That said, either approach is better than querying the parent of the current node as you apply your re-balancing/coloring routine.
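For completeness, the ancestor-stack descent described above might look roughly like this. It is only a sketch: the class name is made up, plain null references are used instead of a null-sentinel node, and the red-black re-balancing step itself is left as a comment.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical binary search tree without parent pointers.
final class AncestorStackTree {
    static final class Node {
        final int val;
        Node left, right;
        Node(int val) { this.val = val; }
    }

    Node root;

    /**
     * Inserts val as in a plain binary search tree and returns the ancestors of
     * the new node, parent on top. Returns null if val is already present.
     */
    Deque<Node> insert(int val) {
        Deque<Node> ancestors = new ArrayDeque<>();
        Node current = root;
        while (current != null) {
            if (val == current.val) {
                return null; // already in the tree
            }
            ancestors.push(current);
            current = val < current.val ? current.left : current.right;
        }
        Node added = new Node(val);
        if (ancestors.isEmpty()) {
            root = added;
        } else if (val < ancestors.peek().val) {
            ancestors.peek().left = added;
        } else {
            ancestors.peek().right = added;
        }
        // A red-black tree would run its re-balancing/colouring routine here,
        // popping the stack to reach the parent, the grandparent, and so on.
        return ancestors;
    }
}
```

The stack lives only for the duration of one insertion, so the nodes themselves stay as small as in the version without parent references.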