I have a custom function call written like this: Functionconcat(Functionadd(1,2),1).
Below is the grammar I am using.
I have since changed the grammar; it now works and checks the return type of a parameter when that parameter is itself a function call:
grammar FunctionGrammer;
INT
: [0-9]+
;
ID
: [a-zA-Z_] [a-zA-Z_0-9]*
;
expr : add;
string : 'Functionadd(' INT ',' INT ')' ;
add : 'Functionadd(' INT ',' INT ')' ;
concat : 'Functionconcat(' (string|INT )','INT ')';
Below is the test method:
public static void main(String[] args) {
try {
CodePointCharStream input = CharStreams.fromString("Functionconcat(Functionadd(m,2),1)");
FunctionGrammerLexer lexer = new FunctionGrammerLexer(input);
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
FunctionGrammerParser parser = new FunctionGrammerParser(tokenStream);
parser.setErrorHandler(new BailErrorStrategy());
ParseTree tree = parser.concat();
FunctionCompilerImpl visitor = new FunctionCompilerImpl();
visitor.visit(tree);
} catch (ParseCancellationException e) {
e.printStackTrace();
}
}
It throws an exception, but how can I report a meaningful error instead, such as a parameter/parameter data type mismatch exception or a function name mismatch exception?
Visitor implementation:
public class FunctionCompilerImpl extends FunctionGrammerBaseVisitor<String>{
@Override
public String visitConcat(FunctionGrammerParser.ConcatContext ctx) {
super.visitConcat(ctx);
System.out.println("==================="+ctx);
return null;
}
}
But the visitor is not printing System.out.println("==================="+ctx);.
Related
I'm trying to design a simple query language as follows:
grammar FilterExpression;
// Lexer rules
AND : 'AND' ;
OR : 'OR' ;
NOT : 'NOT';
GT : '>' ;
GE : '>=' ;
LT : '<' ;
LE : '<=' ;
EQ : '=' ;
DECIMAL : '-'?[0-9]+('.'[0-9]+)? ;
KEY : ~[ \t\r\n\\"~=<>:(),]+ ;
QUOTED_WORD: ["] ('\\"' | ~["])* ["] ;
NEWLINE : '\r'? '\n';
WS : [ \t\r\n]+ -> skip ;
StringFilter : KEY ':' QUOTED_WORD;
NumericalFilter : KEY (GT | GE | LT | LE | EQ) DECIMAL;
condition : StringFilter # stringCondition
| NumericalFilter # numericalCondition
| StringFilter op=(AND|OR) StringFilter # combinedStringCondition
| NumericalFilter op=(AND|OR) NumericalFilter # combinedNumericalCondition
| condition AND condition # combinedCondition
| '(' condition ')' # parens
;
I added a few tests to verify that the grammar works as expected. To my surprise, some inputs that are clearly wrong were accepted.
For instance, when I type
(brand:"apple" AND t>3) 1>3
where the 1>3 is deliberately added as an error, ANTLR still happily generates a tree, which looks like:
Is it because my grammar has some problem I didn't notice?
I also tried the IntelliJ plugin (because I thought grun might not be behaving as expected), but it gives
This is the test code I'm using. Note that I also tried BailErrorStrategy, but it doesn't seem to help.
public class ParserTest {
private class BailLexer extends FilterExpressionLexer {
public BailLexer(CharStream input) {
super(input);
}
public void recover(LexerNoViableAltException e) {
throw new RuntimeException(e);
}
}
private FilterExpressionParser createParser(String filterString) {
//FilterExpressionLexer lexer = new FilterExpressionLexer(CharStreams.fromString(filterString));
FilterExpressionLexer lexer = new BailLexer(CharStreams.fromString(filterString));
CommonTokenStream tokens = new CommonTokenStream(lexer);
FilterExpressionParser parser = new FilterExpressionParser(tokens);
parser.setErrorHandler(new BailErrorStrategy());
parser.addErrorListener(new ANTLRErrorListener() {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
System.out.print("here1");
}
@Override
public void reportAmbiguity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs) {
System.out.print("here2");
}
@Override
public void reportAttemptingFullContext(Parser recognizer, DFA dfa, int startIndex, int stopIndex, BitSet conflictingAlts, ATNConfigSet configs) {
System.out.print("here3");
}
@Override
public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction, ATNConfigSet configs) {
System.out.print("here4");
}
});
return parser;
}
@Test
public void test() {
FilterExpressionParser parser = createParser("(brand:\"apple\" AND t>3) 1>3");
parser.condition();
}
}
It looks like I finally found the answer.
The reason is that I didn't put an EOF in the grammar. In ANTLR it is perfectly fine for a rule to match only a prefix of the input, which is why the rest of the test string
(brand:"apple" AND t>3) 1>3, i.e. 1>3, is allowed.
See the discussion here: https://github.com/antlr/antlr4/issues/351
After I changed the grammar to add an EOF at the end of the rule (condition EOF), everything works.
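The difference between matching a prefix and matching the whole input can be illustrated without ANTLR. The sketch below is only an analogy using java.util.regex: the pattern is a hypothetical stand-in for the condition rule, lookingAt() behaves like a start rule without EOF, and matches() behaves like condition EOF.

```java
import java.util.regex.Pattern;

class PrefixVersusFullMatch {
    // Hypothetical stand-in for the `condition` rule: one parenthesized group.
    private static final Pattern CONDITION = Pattern.compile("\\([^()]*\\)");

    /** Like a start rule without EOF: succeeds if a prefix of the input matches. */
    static boolean acceptsPrefix(String input) {
        return CONDITION.matcher(input).lookingAt();
    }

    /** Like `condition EOF`: succeeds only if the whole input is consumed. */
    static boolean acceptsWholeInput(String input) {
        return CONDITION.matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "(brand:\"apple\" AND t>3) 1>3";
        System.out.println(acceptsPrefix(input));     // true: trailing " 1>3" is ignored
        System.out.println(acceptsWholeInput(input)); // false: trailing " 1>3" is rejected
    }
}
```

The same input is accepted or rejected depending only on whether the end of the input is part of what must match, which is the effect of adding EOF to the start rule.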
I am a newbie to Antlr. I want to know how to navigate the parse tree and enter each visit method, and I want the implementation below to be done using Antlr4. I have the functions listed below.
Below is the GitHub link to the project: https://github.com/VIKRAMAS/AntlrNestedFunctionParser/tree/master
1. FUNCTION.add(Integer a,Integer b)
2. FUNCTION.concat(String a,String b)
3. FUNCTION.mul(Integer a,Integer b)
I am storing the function metadata like this:
Map<String,String> map=new HashMap<>();
map.put("FUNCTION.add","Integer:Integer,Integer");
map.put("FUNCTION.concat","String:String,String");
map.put("FUNCTION.mul","Integer:Integer,Integer");
Here, Integer:Integer,Integer means that Integer is the return type and that the function accepts the input parameters Integer,Integer.
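As a small illustration of this metadata format, the sketch below splits one map value into a return type and a parameter list. The class name FunctionSignature and its fields are hypothetical helpers introduced for this example; they are not part of the project.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FunctionSignature {
    final String returnType;
    final List<String> parameterTypes;

    private FunctionSignature(String returnType, List<String> parameterTypes) {
        this.returnType = returnType;
        this.parameterTypes = parameterTypes;
    }

    /** Parses a value of the form "Return:Param1,Param2" as stored in the map. */
    static FunctionSignature parse(String metadata) {
        String[] parts = metadata.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected 'Return:Params' but got: " + metadata);
        }
        // An empty parameter section means the function takes no arguments.
        List<String> params = parts[1].isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(parts[1].split(","));
        return new FunctionSignature(parts[0], params);
    }

    public static void main(String[] args) {
        FunctionSignature concat = parse("String:String,String");
        System.out.println(concat.returnType);     // String
        System.out.println(concat.parameterTypes); // [String, String]
    }
}
```

Parsing the string once into a small value object avoids repeating split(":") and split(",") at every place where a signature is checked.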
If the input is something like this
FUNCTION.concat(Function.substring(String,Integer,Integer),String)
or
FUNCTION.concat(Function.substring("test",1,1),String)
Using a visitor implementation, I want to check whether the input is valid against the function metadata stored in the map.
Below is the lexer and parser that I'm using:
Lexer MyFunctionsLexer.g4:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
Parser MyFunctionsParser.g4:
parser grammar MyFunctionsParser;
options {
tokenVocab=MyFunctionsLexer;
}
function : FUNCTION '.' NAME '('(function | argument (',' argument)*)')';
argument: (NAME | function);
WS : [ \t\r\n]+ -> skip;
I am using Antlr4.
Below is the implementation I'm using as per the suggested answer.
Visitor Implementation:
public class FunctionValidateVisitorImpl extends MyFunctionsParserBaseVisitor<String> {
Map<String, String> map = new HashMap<String, String>();
public FunctionValidateVisitorImpl()
{
map.put("FUNCTION.add", "Integer:Integer,Integer");
map.put("FUNCTION.concat", "String:String,String");
map.put("FUNCTION.mul", "Integer:Integer,Integer");
map.put("FUNCTION.substring", "String:String,Integer,Integer");
}
@Override
public String visitFunctions(@NotNull MyFunctionsParser.FunctionsContext ctx) {
System.out.println("entered the visitFunctions::");
for (int i = 0; i < ctx.getChildCount(); ++i)
{
ParseTree c = ctx.getChild(i);
if ("<EOF>".equals(c.getText())) // compare string contents, not references
continue;
String top_level_result = visit(ctx.getChild(i));
System.out.println(top_level_result);
if (top_level_result == null)
{
System.out.println("Failed semantic analysis: "+ ctx.getChild(i).getText());
}
}
return null;
}
@Override
public String visitFunction( MyFunctionsParser.FunctionContext ctx) {
// Get function name and expected type information.
String name = ctx.getChild(2).getText();
String type=map.get("FUNCTION." + name);
if (type == null)
{
return null; // not declared in function table.
}
String result_type = type.split(":")[0];
String args_types = type.split(":")[1];
String[] expected_arg_type = args_types.split(",");
int j = 4;
ParseTree a = ctx.getChild(j);
if (a instanceof MyFunctionsParser.FunctionContext)
{
String v = visit(a);
if (!result_type.equals(v)) // compare string contents, not references
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < ctx.getChildCount(); i += 2)
{
ParseTree parameter = ctx.getChild(i);
String v = visit(parameter);
if (!expected_arg_type[(i - j)/2].equals(v)) // compare string contents, not references
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
@Override
public String visitArgument(ArgumentContext ctx){
ParseTree c = ctx.getChild(0);
if (c instanceof TerminalNodeImpl)
{
// Unclear what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.getText();
}
else
return visit(c);
}
}
Test class:
public class FunctionValidate {
public static void main(String[] args) {
String input = "FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
ANTLRInputStream str = new ANTLRInputStream(input);
MyFunctionsLexer lexer = new MyFunctionsLexer(str);
CommonTokenStream tokens = new CommonTokenStream(lexer);
MyFunctionsParser parser = new MyFunctionsParser(tokens);
parser.removeErrorListeners(); // remove ConsoleErrorListener
parser.addErrorListener(new VerboseListener()); // add ours
FunctionsContext tree = parser.functions();
FunctionValidateVisitorImpl visitor = new FunctionValidateVisitorImpl();
visitor.visit(tree);
}
}
Lexer:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
Parser:
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
Verbose Listener:
public class VerboseListener extends BaseErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
List<String> stack = ((Parser)recognizer).getRuleInvocationStack();
Collections.reverse(stack);
throw new FunctionInvalidException("line "+line+":"+charPositionInLine+" at "+ offendingSymbol+": "+msg);
}
}
Output:
It is not entering the visitor implementation: the System.out.println("entered the visitFunctions::"); statement is never printed. I am not able to walk through the child nodes using the visit method.
You have a version skew between your generated parser and the runtime. Further, you have a version skew in your generated .java files, as though you downloaded and ran two Antlr tool versions (4.4 and 4.7.2), once without the -visitor option, then again with it. The source for MyFunctionsParser.java is in AntlrNestedFunctionParser\FunctionValidator\target\generated-sources\antlr4\com\functionvalidate\validate. At the top of the file, it says
// Generated from MyFunctionsParser.g4 by ANTLR 4.4
The source for MyFunctionsParserVisitor.java is
// Generated from com\functionvalidate\validate\MyFunctionsParser.g4 by ANTLR 4.7.2
The runtime is 4.7.2, which you state in pom.xml in AntlrNestedFunctionParser\FunctionValidator. MyFunctionsLexer.tokens is defined in at least two locations; which one you are picking up, who knows. I'm not familiar with the Antlr build rules associated with the pom.xml, but what was generated is a mess (which is why I wrote my own build rules and editor for Antlr for C#). Make sure you clean the target directory completely, generate fresh, up-to-date .java files, and use the matching Antlr runtime, 4.7.2.
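A quick way to spot this kind of skew is to compare the version in the "Generated from" header of each generated file. The sketch below only extracts the version from a header line passed in as a string; reading the first line of each file is left out, and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GeneratedHeaderVersion {
    // Matches headers such as "// Generated from MyFunctionsParser.g4 by ANTLR 4.7.2".
    private static final Pattern HEADER =
            Pattern.compile("^// Generated from .+ by ANTLR (\\d+(?:\\.\\d+)*)\\s*$");

    /** Returns the tool version named in the header line, if the line has that shape. */
    static Optional<String> toolVersion(String firstLine) {
        Matcher m = HEADER.matcher(firstLine);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String parser = "// Generated from MyFunctionsParser.g4 by ANTLR 4.4";
        String visitor =
                "// Generated from com\\functionvalidate\\validate\\MyFunctionsParser.g4 by ANTLR 4.7.2";
        System.out.println(toolVersion(parser));  // Optional[4.4]
        System.out.println(toolVersion(visitor)); // Optional[4.7.2]
        // Different versions indicate the files came from different tool runs.
        System.out.println(toolVersion(parser).equals(toolVersion(visitor))); // false
    }
}
```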
Below is a solution in C#. This should give you an idea of how to proceed. You should be able to easily translate the code to Java.
For ease, I implemented the code using my extension AntlrVSIX for Visual Studio 2019 with NET Core C#. It makes life easier using a full IDE that supports the building of split lexer/parser grammars, debugging, and a plug-in that is suited for editing Antlr grammars.
There are several things to note about your grammar. First, your parser grammar isn't accepted by Antlr 4.7.2. The production "WS : [ \t\r\n]+ -> skip;" is a lexer rule, so it can't go in a parser grammar. It has to go into the lexer grammar (or you define a combined grammar). Second, I personally wouldn't define lexer symbols like DOT and then use the RHS of the lexer symbol, e.g., '.', directly in the parser grammar. It's confusing, and I'm pretty sure there isn't an IDE or editor that would know how to go to the definition "DOT: '.';" in the lexer grammar if you positioned your cursor on the '.' in the parser grammar. I never understood why it's allowed in Antlr, but c'est la vie. I would instead use the lexer symbol you define. Third, I would consider augmenting the parser grammar in the usual way with EOF, e.g., "functions : function* EOF". But this is entirely up to you.
Now, on the problem statement, your example input contains an inconsistency. In the first case, "substring(String,Integer,Integer)", the input is in a meta-like description of substring(). In the second case, "substring(\"test\",1,1)", you are parsing code. The first case parses with your grammar, the second does not--there's no string literal lexer rule defined in your lexer grammar. It's unclear what you really want to parse.
Overall, I defined the visitor code over strings, i.e., each method returns a string representing the output type of the function or argument, e.g., "Integer" or "String" or null if there was an error (or you could throw an exception for static semantic errors). Then, using Visit() on each child in the parse tree node, check the resulting string if it is expected, and handle matches as you like.
One other thing to note. You can solve this problem via a visitor or listener class. The visitor class is useful for purely synthesized attributes. In this example solution, I return a string that represents the type of the function or arg up the associated parse tree, checking the value for each important child. The listener class is useful for L-attributed grammars--i.e., where you are passing attributes in a DFS-oriented manner, left to right at each node in the tree. For this example, you could use the listener class and only override the Exit() functions, but you would then need a Map/Dictionary to map a "context" into an attribute (string).
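To make the synthesized-attribute idea concrete in Java, here is a library-free sketch. The Node class is a hypothetical stand-in for a parse-tree node (it is not an ANTLR type), and a node without arguments is assumed to be a leaf that carries a type name such as "String". Each call computes the types of its children first and then checks them against the expected signature, returning null on any error.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TypeCheckSketch {
    /** Hypothetical tree node: a function call with arguments, or a leaf type name. */
    static final class Node {
        final String name;     // function name, or a type name for a leaf
        final List<Node> args; // empty for a leaf
        Node(String name, Node... args) {
            this.name = name;
            this.args = Arrays.asList(args);
        }
    }

    static final Map<String, String> SIGNATURES = new HashMap<>();
    static {
        SIGNATURES.put("FUNCTION.add", "Integer:Integer,Integer");
        SIGNATURES.put("FUNCTION.concat", "String:String,String");
        SIGNATURES.put("FUNCTION.mul", "Integer:Integer,Integer");
        SIGNATURES.put("FUNCTION.substring", "String:String,Integer,Integer");
    }

    /** Returns the node's type, or null for an undeclared function or a type mismatch. */
    static String typeOf(Node node) {
        if (node.args.isEmpty()) {
            return node.name; // assumption: a leaf already names its own type
        }
        String signature = SIGNATURES.get("FUNCTION." + node.name);
        if (signature == null) {
            return null; // not declared in the function table
        }
        String[] parts = signature.split(":");
        String[] expected = parts[1].split(",");
        if (expected.length != node.args.size()) {
            return null; // wrong number of arguments
        }
        for (int i = 0; i < expected.length; i++) {
            // The child's type is synthesized first, then checked by the parent.
            if (!expected[i].equals(typeOf(node.args.get(i)))) {
                return null; // type mismatch
            }
        }
        return parts[0];
    }

    public static void main(String[] args) {
        Node ok = new Node("concat",
                new Node("substring", new Node("String"), new Node("Integer"), new Node("Integer")),
                new Node("String"));
        Node bad = new Node("concat",
                new Node("add", new Node("Integer"), new Node("Integer")),
                new Node("String"));
        System.out.println(typeOf(ok));  // String
        System.out.println(typeOf(bad)); // null
    }
}
```

Note that the strings are compared with equals(); a direct translation of the C# != operator would compare references in Java.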
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
using Antlr4.Runtime;
namespace AntlrConsole2
{
public class Program
{
static void Main(string[] args)
{
var input = @"FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
var str = new AntlrInputStream(input);
var lexer = new MyFunctionsLexer(str);
var tokens = new CommonTokenStream(lexer);
var parser = new MyFunctionsParser(tokens);
var listener = new ErrorListener<IToken>();
parser.AddErrorListener(listener);
var tree = parser.functions();
if (listener.had_error)
{
System.Console.WriteLine("error in parse.");
}
else
{
System.Console.WriteLine("parse completed.");
}
var visitor = new Validate();
visitor.Visit(tree);
}
}
}
namespace AntlrConsole2
{
using System;
using Antlr4.Runtime.Misc;
using System.Collections.Generic;
class Validate : MyFunctionsParserBaseVisitor<string>
{
Dictionary<String, String> map = new Dictionary<String, String>();
public Validate()
{
map.Add("FUNCTION.add", "Integer:Integer,Integer");
map.Add("FUNCTION.concat", "String:String,String");
map.Add("FUNCTION.mul", "Integer:Integer,Integer");
map.Add("FUNCTION.substring", "String:String,Integer,Integer");
}
public override string VisitFunctions([NotNull] MyFunctionsParser.FunctionsContext context)
{
for (int i = 0; i < context.ChildCount; ++i)
{
var c = context.GetChild(i);
if (c.GetText() == "<EOF>")
continue;
var top_level_result = Visit(context.GetChild(i));
if (top_level_result == null)
{
System.Console.WriteLine("Failed semantic analysis: "
+ context.GetChild(i).GetText());
}
}
return null;
}
public override string VisitFunction(MyFunctionsParser.FunctionContext context)
{
// Get function name and expected type information.
var name = context.GetChild(2).GetText();
map.TryGetValue("FUNCTION." + name, out string type);
if (type == null)
{
return null; // not declared in function table.
}
string result_type = type.Split(":")[0];
string args_types = type.Split(":")[1];
string[] expected_arg_type = args_types.Split(",");
const int j = 4;
var a = context.GetChild(j);
if (a is MyFunctionsParser.FunctionContext)
{
var v = Visit(a);
if (v != result_type)
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < context.ChildCount; i += 2)
{
var parameter = context.GetChild(i);
var v = Visit(parameter);
if (v != expected_arg_type[(i - j)/2])
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
public override string VisitArgument([NotNull] MyFunctionsParser.ArgumentContext context)
{
var c = context.GetChild(0);
if (c is Antlr4.Runtime.Tree.TerminalNodeImpl)
{
// Unclear what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.GetText();
}
else
return Visit(c);
}
}
}
I am making a Computer Algebra System which will take an algebraic expression and simplify or differentiate it.
As you can see by the following code the user input is taken, but if it is a string which does not conform to my grammar rules the error,
line 1:6 mismatched input '<EOF>' expecting {'(', INT, VAR}, occurs and the program continues running.
How would I catch the error and stop the program from running? Thank you in advance for any help.
Controller class:
public static void main(String[] args) throws IOException {
String userInput = "x*x*x+";
getAST(userInput);
}
public static AST getAST(String userInput) {
ParseTree tree = null;
ExpressionLexer lexer = null;
ANTLRInputStream input = new ANTLRInputStream(userInput);
try {
lexer = new ExpressionLexer(input);
}catch(Exception e) {
System.out.println("Incorrect grammar");
}
System.out.println("Lexer created");
CommonTokenStream tokens = new CommonTokenStream(lexer);
System.out.println("Tokens created");
ExpressionParser parser = new ExpressionParser(tokens);
System.out.println("Tokens parsed");
tree = parser.expr();
System.out.println("Tree created");
System.out.println(tree.toStringTree(parser)); // print LISP-style tree
Trees.inspect(tree, parser);
ParseTreeWalker walker = new ParseTreeWalker();
ExpressionListener listener = new buildAST();
walker.walk(listener, tree);
listener.printAST();
listener.extractExpression();
return new AST();
}
}
My Grammar:
grammar Expression;
@header {
package exprs;
}
@members {
// This method makes the parser stop running if it encounters
// invalid input and throw a RuntimeException.
public void reportErrorsAsExceptions() {
//removeErrorListeners();
addErrorListener(new ExceptionThrowingErrorListener());
}
private static class ExceptionThrowingErrorListener extends BaseErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer,
Object offendingSymbol, int line, int charPositionInLine,
String msg, RecognitionException e) {
throw new RuntimeException(msg);
}
}
}
@rulecatch {
// ANTLR does not generate its normal rule try/catch
catch(RecognitionException e) {
throw e;
}
}
expr : left=expr op=('*'|'/'|'^') right=expr
| left=expr op=('+'|'-') right=expr
| '(' expr ')'
| atom
;
atom : INT|VAR;
INT : ('0'..'9')+ ;
VAR : ('a' .. 'z') | ('A' .. 'Z') | '_';
WS : [ \t\r\n]+ -> skip ;
A typical parse run with ANTLR4 consists of 2 stages:
1. A "quick'n dirty" run with SLL prediction mode that bails out on the first syntax error found.
2. A normal run using the LL prediction mode, which tries to recover from parser errors. This second step only needs to be executed if there was an error in the first step.
The first step is a somewhat loose parse run that doesn't resolve certain ambiguities and hence can report an error that doesn't really exist (once resolved in LL mode). But the first step is faster and so delivers a quicker result for syntactically correct input. This (JS) code shows the setup:
this.parser.removeErrorListeners();
this.parser.addErrorListener(this.errorListener);
this.parser.errorHandler = new BailErrorStrategy();
this.parser.interpreter.setPredictionMode(PredictionMode.SLL);
try {
this.tree = this.parser.grammarSpec();
} catch (e) {
if (e instanceof ParseCancellationException) {
this.tokenStream.seek(0);
this.parser.reset();
this.parser.errorHandler = new DefaultErrorStrategy();
this.parser.interpreter.setPredictionMode(PredictionMode.LL);
this.tree = this.parser.grammarSpec();
} else {
throw e;
}
}
To avoid any recovery attempt for syntax errors in the first step, you also have to set the BailErrorStrategy. This strategy simply throws a ParseCancellationException on a syntax error (similar to what you do in your code). You could add your own handling in the catch clause to ask the user for correct input and rerun the parse step.
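The control flow of this two-stage approach (a strict attempt that bails on the first error, then a tolerant rerun only if needed) can be shown on a toy adder without ANTLR. This is only a sketch of the pattern: the class and method names are hypothetical, and the SLL/LL prediction modes have no counterpart here.

```java
import java.util.ArrayList;
import java.util.List;

class TwoStageSketch {
    /** Stage 1: strict pass that throws on the first malformed term (the "bail" behaviour). */
    static int strictSum(String input) {
        int total = 0;
        for (String term : input.split("\\+", -1)) {
            total += Integer.parseInt(term.trim()); // NumberFormatException on bad input
        }
        return total;
    }

    /** Stage 2: tolerant pass that skips malformed terms and records a message for each. */
    static int tolerantSum(String input, List<String> errors) {
        int total = 0;
        for (String term : input.split("\\+", -1)) {
            try {
                total += Integer.parseInt(term.trim());
            } catch (NumberFormatException e) {
                errors.add("skipped term '" + term.trim() + "'");
            }
        }
        return total;
    }

    /** Runs the strict stage first and falls back to the tolerant stage on failure. */
    static int sum(String input, List<String> errors) {
        try {
            return strictSum(input);
        } catch (NumberFormatException e) {
            return tolerantSum(input, errors);
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        System.out.println(sum("4+3 +7", errors)); // 14
        System.out.println(sum("4++3", errors));   // 7
        System.out.println(errors);                // [skipped term '']
    }
}
```

Valid input never pays for error recovery, and the catch block is the natural place to report the problem to the user instead of rerunning.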
I created an adder.jj file following this tutorial (up to page 13, just before it starts with the calculator example) to build an adder. It works well for summing numbers separated by plus signs when the input is syntactically correct (e.g. "4+3 +7" returns 14, while "4++3" gives an error). The numbers and + signs come from a text file (this is explained below).
This is the code I use to generate the classes needed for this:
options
{
STATIC = false ;
}
PARSER_BEGIN(Adder)
class Adder
{
public static void main (String[] args)
throws ParseException, TokenMgrError, NumberFormatException
{
Adder parser = new Adder (System.in) ;
int val = parser.Start() ;
System.out.println(val) ;
}
}
PARSER_END(Adder)
SKIP : { " " }
SKIP : { "\n" | "\r" | "\r\n" }
TOKEN : { < PLUS :"+"> }
TOKEN : { < NUMBER : (["0"-"9"])+ > }
int Start() throws NumberFormatException :
{
int i ;
int value ;
}
{
value = Primary()
(
<PLUS>
i = Primary()
{ value += i ; }
)*
{ return value ; }
}
int Primary() throws NumberFormatException :
{
Token t ;
}
{
t=<NUMBER>
{ return Integer.parseInt( t.image ) ; }
}
The classes are generated with
javacc adder.jj
Then I compile the generated classes with
javac *.java
And finally
java Adder < ex1.txt
Gives the right output if the content of ex1.txt has the format I explained before.
How can I change this code to receive a String so I can actually use it in my project instead of the stream from the command line?
Try replacing
Adder parser = new Adder (System.in) ;
with
Reader reader = new StringReader( someString ) ;
Adder parser = new Adder( reader ) ;
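To show that a StringReader can stand in for a stream, here is a self-contained sketch. The sum method is a hypothetical, much simplified stand-in for the generated Adder (it does not reject input such as "4++3"); the point is only that code written against java.io.Reader accepts a String wrapped in a StringReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReaderInputSketch {
    /** Hypothetical stand-in for the generated parser: sums digits runs separated by '+'. */
    static int sum(Reader reader) throws IOException {
        int total = 0;
        int current = 0;
        int c;
        while ((c = reader.read()) != -1) {
            if (Character.isDigit(c)) {
                current = current * 10 + Character.digit(c, 10);
            } else if (c == '+') {
                total += current;
                current = 0;
            } else if (!Character.isWhitespace(c)) {
                throw new IllegalArgumentException("Unexpected character: " + (char) c);
            }
        }
        return total + current;
    }

    /** Wraps the String in a StringReader, as suggested above for the Adder constructor. */
    static int sumOf(String text) {
        try (Reader reader = new StringReader(text)) {
            return sum(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOf("4+3 +7")); // 14
    }
}
```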