ParseTreeVisitor or ParseTreeListener - java

I am trying to write an AST interpreter / REPL. ANTLRv4 provides two very similar interfaces (ParseTreeVisitor and ParseTreeListener) to walk the parse tree. I cannot seem to find any major differences between them, and the documentation is rather sparse. Is one interface preferable to the other?

The interfaces are used for different purposes. The primary differences are as follows:
ParseTreeListener
Provides separate enter/exit methods for before/after the children of a parse tree node are examined.
All methods return void. Any values collected for "return" by the listener must be held in fields or elsewhere.
Control of which tree nodes are examined is external (via ParseTreeWalker or a derived class).
ParseTreeVisitor
Provides one method which is responsible for all analysis/behavior for each parse tree node.
Each method returns generic type parameter T, which may be Void if the visitor methods do not return a value.
Control of which tree nodes are examined is internal (via visitChildren and/or calls to visit for specific children).

Related

Can we directly point to a node from Eclipse AST instead of visiting all the nodes

I am trying to parse a java file using Eclipse JDT's AST. ASTVisitor provides a nice API to traverse all the nodes and work with the node which we want. Now what I want is, can we go to a target node, let say of type MethodDeclaration or all the nodes of that type, instead of traversing all the nodes? Because this reduces time if I have to get all the nodes of a particular type in a whole package. Thanks in advance.
Finding all nodes of a given type inherently is traversing. ASTVisitor is suitable for this exact task.
If you are concerned about unnecessary traversal below the node you are interested in, just return false from the corresponding visit() method, and the visitor will not descend into children of the current node.
I'd be surprised, though, if traversing actually were a performance bottleneck. Creating the AST in the first place is more expensive than that.
If you only want to address few nodes (identified, e.g., by a name pattern), then performing a search (which relies on an index) could perhaps be faster, but this probably pays off only if a significant number of files can be skipped entirely.
Finally, as you mention MethodDeclaration: perhaps you don't even need AST but the Java Model (which is much more light weight) is sufficient for your task?

Add a Custom Tree Class in Java Language Hierarchy

I have designed a CustomTree class and programmed its operations.
Nodes are added such that a subtree will become full, before an element can be added to its sibling subtree.
Since Tree is a collection of nodes, I realized that, my CustomTree should implement a Collection Interface.
Is this correct or should my CustomTree extend a more relevant class like TreeSet?
I want to know where my class should go into if it should match Java's language heirarchy.
The question is which properties do you want your class to have? Collection properties (just a general "bag"), Set properties (no two elements are the same) and/or List properties (sequence of elements is relevant)?
Once you have answered these questions for yourself, you can select the proper base class.

What is the role of I*Binding in Eclipse JDT?

My current understanding is that JDT provides us two different interface and class hierachies for representing and manipulating Java code:
Java Model: provides a way of representing a java project. Fast to
create but does not contain as many information as AST class
hierachy, for example there are no information available about the
exact position of each element in the source file (in AST that's
available)
AST: more detailed representation of the source code plus provides
means for manipulating it.
Is that correct?
Now, there is also a hierarchy of interfaces named I*Binding (starting at IBinding), for example IMethodBinding. So for example, we have 3 different types for dealing with methods:
IMethod (from Java Model)
MethodInvocation (from AST, could get it from IMethod)
IMethodBinding
From doc IMethodBinding seems very like MethodInvocation from AST but I don't see a clear distinction and when should I use them. Could someone please clarify this?
Raw AST nodes do not contain references between them e.g. from variable use back to its declaration, or from method invocation back to method declaration. MethodInvocation object may be inspected for method name, but you can't immidiately learn what method of which class is being invoked actually. scoping analysis is required to do so.
This analysis is called binding resolution. IBinding objects are attached to AST nodes and you can use them to find e.g. a MethodDeclaration AST node for a given MethodInvocation AST node using CompilationUnit.findDeclaringNode(methodInvocationNode.resolveMethodBidning().getKey())
Or you can use CompilationUnit.findDeclaringNode(method.getKey()) to find which AST node contains declaration corresponding to given IMethod object.
MethodInvocation.resolveBinding().getKey() ==
MethodDeclaration.resolveBinding().getKey() ==
IMethod.getKey()

What should a Tree class contain?

My class is currently doing Binary Trees as part of our data structures unit. I understand that the Node class must have 3 parameters (value, left node, right node) in its constructor. As part of a requirement, we must have a Tree class. What is the purpose of a tree class? Is it for managing the entire set of Nodes? Does it simply contain the functions necessary to insert, remove and search for specific Nodes?
Thank you in advance.
My Node class:
class Node {
protected int data;
protected leftNode;
protected rightNode;
Node (int data, Node leftNode, Node rightNode){
this.data = data;
this.leftNode = leftNode;
this.rightNode = rightNode;
}
}
Yes, it is supposed to give the functional interface of a Tree by encapsulating all the behavior and algorithms related to the internal structure.
Why this is good?
Because you will define something that just provide some functionality and that works in a stand-alone way so that everyone should be able to use your tree class without caring about nodes, algorithms and whatever.
Ideally the class should be parametric so that you'll have Tree<T> and you'll be able to have generic methods for example
T getRoot()
Basically you'll have to project it so that it will allow you to
insert values
delete values
search values
visit the whole tree
whatever
As I see it, a Tree class should hold a reference to the root node of the tree, and to methods and attributes that operate on the tree as a whole, as opposed to methods that belong to each node, like getLeft(), getValue().
For example, I'd define a size attribute in the Tree class, and the methods that add or remove nodes to the tree (which also happen to be in this class) would be responsible for keeping the size up to date.
The purpose of any data structure is to give you a way to hold onto collections of related values and manipulate them in meaningful ways.
A tree is a special kind of data structure. It's a good choice for data that is hierarchical: those with natural parent-child relationships. A binary tree has, at most, two children for every parent node.
One other feature of trees that deserves special mention is the fact that it's self-similar: every Node in a tree is itself the root of a sub-tree. Recursion exploits this fact.
Yes, those are good methods to start with. You might want to have a look at the java.util.Map interface for others that might be useful.
The assumption is, as you suspect, a little bit flawed. The Node type that you have defined is a perfectly fine instance of a binary tree.
Jack mentions that you want to have a Tree type around the whole set of nodes in order to provide operations like getRoot, inserts, deletes and so on. This is of course not a bad idea, but it is by no means a requirement. Many of these operations could go on Node itself, and you don't necessarily have to have all of these operations; for example, there are arguments both for and against having a getRoot operation. An argument against it is that if you have an algorithm that at one point only needs a subtree, then the Tree object that holds on to the root Node prevents garbage collection of Nodes that your algorithm no longer needs.
But more generally, the real question that needs to be asked here is this: which of the things you're dealing are interface, and which are implementation? Because it's one thing if you want to expose binary trees as an interface to callers, and another if you want to provide some sort of finite map as the interface, and the binary tree is just an implementation detail. In the former case, the client would "know" that a tree is either null or a Node with a value and branches, and would be allowed to, for example, recurse on the structure of the tree; in the latter case, all that the client would "know" is that your class supports some form of put, get and delete operations, but it's not allowed to rely on the fact that this is stored as a search tree. In many variants of the latter case you indeed would use a Tree type as a front end to hide the nodes from the clients. (For example, Java's built-in java.util.TreeMap class does this.)
So the shortest answer is, really, it depends. The slightly longer answer is it depends on what the contract is between the binary tree implementation and its users; what things are the callers allowed to assume about the trees, what details of the binary tree are things that the author expects to be able to change in the future, etc.
"Is it for managing the entire set of Nodes?"
Yes.
"Does it simply contain the functions necessary to insert, remove and search for specific Nodes?"
Basically, yes. For that reason, it should contain at least a reference to the root node.

Existing implementations of Trees in Java?

I'm looking for any implementation of a pure Tree data structure for Java (that aren't graphical ones in java.awt), preferably generic.
With a generic tree I'd like to add elements that are not supposed to be sorted and do something like this:
TreeNode anotherNode = new TreeNode();
node.add(anotherNode);
…and then I'd like to traverse the nodes (so that I can save and preserve the structure in a file when I load the tree from the same file again).
Anyone knows what implementations exist or have any other idea to achieve this?
You can use the DefaultMutableTreeNode defined in the javax.swing.tree package. It contains methods getUserObject() and setUserObject(Object) allowing you to attach data to each tree node. It allows an arbitrary number of child nodes for each parent and provides methods for iterating over the tree in a breadth-first or depth-first fashion (breadthFirstEnumeration() / depthFirstEnumeration()).
Also note that despite being resident in the javax.swing.tree package this class does not contain any UI code; It is merely the underlying model of JTree.
Scala has a nice Tree data structure. It's a "General Balanced Tree." It's not exactly Java, but it's close, and can serve as a good model.
It is hard to believe, given how much is in the base Java libraries, but there is no good generic Tree structure.
For a start, the TreeSet and TreeMap in the runtime are red-black-tree implementations.
Assuming you don't want to store arbitrary Java objects on the nodes, you could use the W3C DOM. It even comes with its own serialization format (I forget what it's called :-).

Categories