How to generate AST in ANTLR4? - java

I'm working on a project in which I have to generate an abstract syntax tree (AST) for a given program. The program can be in any mainstream programming language. What is the standard way of generating an AST in ANTLR4? I know only the basics of ANTLR4, and I'm able to generate a parse tree for a given program.

ANTLR 4 automatically generates parse trees instead of relying on manually-structured ASTs. This decision was made after observing years of development with prior approaches encountering extreme maintainability challenges, especially when multiple tree parsers were involved.
If you need an abstract representation of your source code, you should create an object model that accurately represents the constructs in your language, rather than rely on weakly typed and generally unstructured AST nodes. You then walk the parse trees instead of ASTs to create your object model.
I would not advise going with ANTLR 3 for any new project.
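To give a rough idea of what "walk the parse tree to create your object model" can look like, here is a minimal sketch. It deliberately avoids the ANTLR runtime so it stays self-contained: `ParseNode` is a hypothetical stand-in for the context objects ANTLR 4 generates, and `Expr`, `Num` and `Add` are made-up model classes for a toy expression language. With real ANTLR you would typically put the same logic in a subclass of the generated visitor.

```java
import java.util.Arrays;
import java.util.List;

class ObjectModelSketch {

    // Hypothetical stand-in for an ANTLR parse-tree node: a rule (or token) name,
    // the matched text for leaves, and the child nodes.
    static final class ParseNode {
        final String rule;
        final String text;
        final List<ParseNode> children;

        ParseNode(String rule, String text, ParseNode... children) {
            this.rule = rule;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Strongly typed object model for the toy language (assumed, not from ANTLR).
    interface Expr { int eval(); }

    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
        public int eval() { return value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class Add implements Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        public int eval() { return left.eval() + right.eval(); }
        @Override public String toString() { return "(" + left + " + " + right + ")"; }
    }

    // Walks the parse tree and builds the model, dropping syntax-only nodes
    // such as parentheses and operator tokens along the way.
    static Expr build(ParseNode node) {
        switch (node.rule) {
            case "number":
                return new Num(Integer.parseInt(node.text));
            case "parens":   // children: "(", expr, ")"
                return build(node.children.get(1));
            case "add":      // children: expr, "+", expr
                return new Add(build(node.children.get(0)), build(node.children.get(2)));
            default:
                throw new IllegalArgumentException("Unexpected rule: " + node.rule);
        }
    }

    // Hand-built parse tree for the input "1 + (2 + 3)".
    static ParseNode sampleTree() {
        ParseNode inner = new ParseNode("add", null,
                new ParseNode("number", "2"),
                new ParseNode("token", "+"),
                new ParseNode("number", "3"));
        ParseNode parens = new ParseNode("parens", null,
                new ParseNode("token", "("), inner, new ParseNode("token", ")"));
        return new ParseNode("add", null,
                new ParseNode("number", "1"), new ParseNode("token", "+"), parens);
    }

    public static void main(String[] args) {
        Expr model = build(sampleTree());
        System.out.println(model + " = " + model.eval());
    }
}
```

The point of the sketch is that the model classes carry only what later stages need, so the parentheses node disappears while its meaning survives in the nesting of `Add`.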

Related

What are textual UML class diagrams? [duplicate]

UML class diagrams are a standard graphical notation to describe classes and their relationships.
Is there a standard textual notation (DSL) to describe the same? Don't say XMI or EMF;-)
I think you could do that with CORBA IDL and use interfaces for classes, but this is somehow too much on the CORBA side. You could use Java interfaces, but this is too Java.
Background of my question is writing generators. I think it is easier to write a generator based on the syntax tree of a DSL than to parse a graphical notation. A graphical notation first has to be translated into a syntax tree (that would be the same you'd get from the corresponding DSL). I think translating a graphical notation into the syntax tree is harder than to translate a DSL (where you can use ANTLR).
You've got the answer already, but I'd like to clarify. There is a standard notation, it's called HUTN, and nobody uses it.
Check this complete list of textual notations to describe UML models. Btw, the reasons to create one of these tools (in particular TextUML) can be found here.
It is no coincidence that UML separates abstract and concrete syntax.
Tying up code generation to a user-facing notation is a bad idea. Tools (code generators) and people (modelers) have totally distinct needs, so no single syntax can serve both audiences well. Not to mention you lose the ability of applying the same code generator to models created using different notations.
TextUML is a concrete syntax tailored to modelers. XMI is a much better notation for tools, and the UML2 object model makes it very easy to handle.
Rafael
http://abstratt.com/blog
No standard notation to my knowledge but a good summary of options here.
hth.

Constructing AST in ANTLR version4

I am developing a compiler and have already implemented the lexer, parser and semantic analyzer (using a listener and a visitor) with ANTLR4. For code generation I am planning to generate LLVM IR using StringTemplate (ST).
To do so I am thinking of first constructing an AST and then generating the code.
My question is: do I need to construct an AST, or can I use the parse tree?
If I need an AST, I am not able to find any examples of manually constructing one using visitors or listeners. Even a small grammar example would be very helpful.
Thank you.
No, there is no fundamental need to construct an AST. In the simplest case, you can walk the parse-tree and output the IR, directly or using ST.
Where transformations are required for output as IR, the two basic approaches are to (1) analyze and annotate the parse-tree describing the necessary changes; or (2) walk the parse-tree, constructing a separate AST, and then walk and transform the AST.
For the annotation strategy, extend ParseTreeProperty to create property classes specific to each context node type. See the comment in that class for how to use it.
The AST strategy is not discouraged -- it was the primary strategy used in Antlr3 -- but is essentially unsupported in Antlr4. As for why Antlr4 favors the annotation strategy, see the last few paragraphs of this answer.
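A minimal sketch of the annotation strategy, assuming a toy expression tree: each node is annotated with the operand that holds its value, and operator nodes emit one IR-like line. To keep it runnable without the ANTLR runtime, `Node` is a hypothetical stand-in for a parse-tree context and `NodeProperty` is a stand-in that only mirrors the get/put shape of ParseTreeProperty by keying values on node identity. The emitted lines are illustrative text, not checked LLVM IR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class AnnotationSketch {

    // Hypothetical stand-in for an ANTLR parse-tree node.
    static final class Node {
        final String text;          // operator symbol or integer literal
        final List<Node> children;  // empty for literals
        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Stand-in for ParseTreeProperty<V>: values keyed by node identity,
    // so the tree itself is never modified.
    static final class NodeProperty<V> {
        private final Map<Node, V> annotations = new IdentityHashMap<>();
        V get(Node node) { return annotations.get(node); }
        void put(Node node, V value) { annotations.put(node, value); }
    }

    private final NodeProperty<String> operand = new NodeProperty<>();
    private final List<String> lines = new ArrayList<>();
    private int nextTemp = 1;

    // Post-order walk: annotate every node with the operand holding its value
    // and emit one instruction per operator node.
    void walk(Node node) {
        if (node.children.isEmpty()) {
            operand.put(node, node.text);   // a literal is its own operand
            return;
        }
        for (Node child : node.children) {
            walk(child);
        }
        String temp = "%t" + nextTemp++;
        String opcode = node.text.equals("+") ? "add" : "mul";
        lines.add(temp + " = " + opcode + " i32 "
                + operand.get(node.children.get(0)) + ", "
                + operand.get(node.children.get(1)));
        operand.put(node, temp);
    }

    static List<String> emit(Node root) {
        AnnotationSketch sketch = new AnnotationSketch();
        sketch.walk(root);
        return sketch.lines;
    }

    // Hand-built tree for "1 + 2 * 3".
    static Node sampleTree() {
        return new Node("+", new Node("1"),
                new Node("*", new Node("2"), new Node("3")));
    }

    public static void main(String[] args) {
        for (String line : emit(sampleTree())) {
            System.out.println(line);
        }
    }
}
```

In a real ANTLR 4 project the walk would usually live in the exit methods of a listener, with the annotations stored in a ParseTreeProperty keyed by the context objects.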

implementation of shift reduce parser in java

I need to implement a shift-reduce parser for college, and I need to know how I can implement it in Java.
Are there any existing implementations, or a sample one?
Are there any implementations already?
Unless the task is to actually practice writing it yourself, I'd recommend using a parser generator such as JavaCUP or ANTLR. (I used JavaCUP in one of my compiler courses, but perhaps you have a different scope in your course.)
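If the task is to write it yourself, a minimal sketch may help as a starting point. It assumes a toy grammar (E -> E + T | T, T -> T * n | n, where n is an integer) and whitespace-separated tokens, and it hard-codes the shift/reduce decisions with one token of lookahead instead of using generated action/goto tables, so treat it as an illustration of the stack mechanics rather than a full LR parser. All names here are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ShiftReduceSketch {

    // Parses whitespace-separated tokens and returns the trace of parser actions.
    // Throws IllegalArgumentException if the input is not in the toy grammar.
    static List<String> parse(String input) {
        String[] tokens = input.trim().split("\\s+");
        List<String> stack = new ArrayList<>();
        List<String> trace = new ArrayList<>();
        int pos = 0;

        while (true) {
            String lookahead = pos < tokens.length ? tokens[pos] : "$";
            if (reduce(stack, lookahead, trace)) {
                continue;                       // keep reducing while a handle is on top
            }
            if (lookahead.equals("$")) {
                break;                          // nothing left to shift
            }
            String symbol = lookahead.matches("\\d+") ? "n" : lookahead;
            if (!symbol.equals("n") && !symbol.equals("+") && !symbol.equals("*")) {
                throw new IllegalArgumentException("Unexpected token: " + lookahead);
            }
            stack.add(symbol);                  // shift
            trace.add("shift " + lookahead);
            pos++;
        }
        if (stack.size() != 1 || !stack.get(0).equals("E")) {
            throw new IllegalArgumentException("Syntax error, stack: " + stack);
        }
        trace.add("accept");
        return trace;
    }

    // Performs at most one reduction on the top of the stack; returns true if it did.
    static boolean reduce(List<String> stack, String lookahead, List<String> trace) {
        int size = stack.size();
        if (size == 0) {
            return false;
        }
        String top = stack.get(size - 1);
        if (top.equals("n")) {
            if (size >= 3 && stack.get(size - 2).equals("*") && stack.get(size - 3).equals("T")) {
                replaceTop(stack, 3, "T");
                trace.add("reduce T -> T * n");
            } else {
                replaceTop(stack, 1, "T");
                trace.add("reduce T -> n");
            }
            return true;
        }
        // A T followed by '*' must wait: '*' binds tighter than '+', so shift instead.
        if (top.equals("T") && !lookahead.equals("*")) {
            if (size >= 3 && stack.get(size - 2).equals("+") && stack.get(size - 3).equals("E")) {
                replaceTop(stack, 3, "E");
                trace.add("reduce E -> E + T");
            } else {
                replaceTop(stack, 1, "E");
                trace.add("reduce E -> T");
            }
            return true;
        }
        return false;
    }

    // Pops `count` symbols and pushes the left-hand side of the production.
    static void replaceTop(List<String> stack, int count, String symbol) {
        for (int i = 0; i < count; i++) {
            stack.remove(stack.size() - 1);
        }
        stack.add(symbol);
    }

    public static void main(String[] args) {
        for (String action : parse("1 + 2 * 3")) {
            System.out.println(action);
        }
    }
}
```

For a larger grammar, hand-written decisions like these get unmanageable quickly, which is where the generated tables of a tool such as JavaCUP come in.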

Java 1.5: mathematical formula parser

Hello, I often develop JTableModels in which some cells must contain the result of applying a simple mathematical formula. These formulas can have:
Operators (+,-,*,/)
Number constants
Other cell references (which contains numbers)
Parameters (numbers with a reference name like "INTEREST_RATE")
I usually solve this by writing a small calculator class that parses the formula, whose syntax I define. The calculator class uses a stack for the calculations, and the syntax always uses Polish notation.
But Polish notation is unnatural for me and for my users. So my question is...
Is there a library that runs on 1.5 JVMs, handles my requirements, and uses normal notation (with brackets; I don't know the name of this notation style) for formulas?
P.S. You can assume the formulas are always syntactically correct, and I can preprocess the non-constant numbers to provide their values.
Have you thought about the benefits of JSR-223? In a few words, this spec allows Java developers to integrate dynamic languages and their parsers with great ease. Using such a parser, your need to define a parser turns into the need to define an internal DSL, which comes down to simply creating a good API and letting your users choose whichever JavaScript/Groovy/Scala/WTF syntax they happen to prefer.
Try JEP.
You can define new variables in the parser, so an expression can contain reference names like "INTEREST_RATE". But you have to define them beforehand.
As for cell references, you will have to extract the numbers and edit the expression accordingly, or there might be some options I'm not yet aware of.
If you can't use Java 6 and its scripting support then have a look at the Apache Bean Scripting Framework (BSF). From that page:
... BSF 3.x will run on Java 1.4+, allowing access to JSR-223 scripting for Java 1.4 and Java 1.5.
I released an expression evaluator based on Dijkstra's Shunting Yard algorithm, under the terms of the Apache License 2.0:
http://projects.congrace.de/exp4j/index.html
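For readers curious how a shunting-yard evaluator handles normal (infix) notation, here is a minimal sketch. It is not exp4j's API or code; `ShuntingYardSketch` and `evaluate` are made-up names. It assumes only the four binary operators, parentheses, unsigned decimal constants and named parameters supplied in a map, and it does not handle unary minus. It sticks to plain collections, but the diamond operator would need replacing with explicit type arguments on a 1.5 JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShuntingYardSketch {

    // One token: a number, a name such as INTEREST_RATE, or an operator/parenthesis.
    private static final Pattern TOKEN =
            Pattern.compile("\\s*(\\d+(?:\\.\\d+)?|[A-Za-z_][A-Za-z_0-9]*|[-+*/()])");

    // Evaluates an infix formula, resolving names from `variables`.
    static double evaluate(String formula, Map<String, Double> variables) {
        String text = formula.trim();
        Deque<Double> values = new ArrayDeque<>();
        Deque<Character> operators = new ArrayDeque<>();
        Matcher matcher = TOKEN.matcher(text);
        int end = 0;
        while (matcher.find()) {
            if (matcher.start() != end) {
                break;                               // skipped an unrecognised character
            }
            end = matcher.end();
            String token = matcher.group(1);
            char first = token.charAt(0);
            if (Character.isDigit(first)) {
                values.push(Double.parseDouble(token));
            } else if (Character.isLetter(first) || first == '_') {
                Double value = variables.get(token);
                if (value == null) {
                    throw new IllegalArgumentException("Unknown name: " + token);
                }
                values.push(value);
            } else if (first == '(') {
                operators.push(first);
            } else if (first == ')') {
                while (!operators.isEmpty() && operators.peek() != '(') {
                    apply(values, operators.pop());
                }
                if (operators.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')'");
                }
                operators.pop();                     // discard the matching '('
            } else {
                // Left-associative: apply pending operators of equal or higher precedence.
                while (!operators.isEmpty() && operators.peek() != '('
                        && precedence(operators.peek()) >= precedence(first)) {
                    apply(values, operators.pop());
                }
                operators.push(first);
            }
        }
        if (end != text.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + end);
        }
        while (!operators.isEmpty()) {
            apply(values, operators.pop());
        }
        if (values.size() != 1) {
            throw new IllegalArgumentException("Malformed formula");
        }
        return values.pop();
    }

    // Pops two operands, applies the operator, and pushes the result.
    private static void apply(Deque<Double> values, char operator) {
        if (values.size() < 2) {
            throw new IllegalArgumentException("Malformed formula");
        }
        double right = values.pop();
        double left = values.pop();
        switch (operator) {
            case '+': values.push(left + right); break;
            case '-': values.push(left - right); break;
            case '*': values.push(left * right); break;
            case '/': values.push(left / right); break;
            default: throw new IllegalArgumentException("Unbalanced '('");
        }
    }

    private static int precedence(char operator) {
        return (operator == '*' || operator == '/') ? 2 : 1;
    }

    public static void main(String[] args) {
        Map<String, Double> parameters = new HashMap<>();
        parameters.put("PRINCIPAL", 200.0);
        parameters.put("INTEREST_RATE", 0.5);
        System.out.println(evaluate("PRINCIPAL * (1 + INTEREST_RATE)", parameters));
    }
}
```

Cell references could be handled the same way as parameters, by putting each referenced cell's value into the map under its reference name before evaluating.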
There's a commercial tool called formula4j which may be useful to some.
It has no direct help for cell references. You would have to handle those yourself, and translate the cell references into values.

How to serialise an antlr3 AST

I have just started using antlr3 and am trying to serialize the AST output of a .g grammar.
Thanks,
Lezan
As Vladimir pointed out, you can use a custom AST node class that has serialization capabilities built in. You could also use a tree adaptor to create the types of nodes you need.
If you only need serialization, and not de-serialization, you could probably just do:
ast.toStringTree()
The above will give you a LISP-like tree structure. An easy way to do serialization would be to use that in combination with a custom AST node class with an overridden toString(). Since toStringTree() uses each node's toString() method, it'll essentially serialize whatever you put in toString(). Make its output sufficient and useful and you should be set.
CommonTree nodes produced by Parser are not Serializable.
I'd suggest serializing the Tokens and using a secondary grammar to parse the (deserialized) stream of Tokens later. In the book (The Definitive ANTLR Reference), in the "A Quick Tour for the Impatient" chapter, Terence Parr gives exactly this scenario, though without serialization. Serialization is trivial for tokens, as they are just text.
My understanding is also that you can replace the Tree class with your own:
options {
ASTLabelType = MyOwnTreeClass;
}
But I haven't tried it.
