How to implement the visitor pattern for nested function - java

I am new to ANTLR and want to implement the following using ANTLR4. I have the functions below.
1. FUNCTION.add(Integer a,Integer b)
2. FUNCTION.concat(String a,String b)
3. FUNCTION.mul(Integer a,Integer b)
And I am storing the functions metadata like this.
Map<String,String> map=new HashMap<>();
map.put("FUNCTION.add","Integer:Integer,Integer");
map.put("FUNCTION.concat","String:String,String");
map.put("FUNCTION.mul","Integer:Integer,Integer");
Here, Integer:Integer,Integer means that Integer is the return type and the function accepts the input parameters Integer,Integer.
If the input is something like this
FUNCTION.concat(Function.substring(String,Integer,Integer),String)
or
FUNCTION.concat(Function.substring("test",1,1),String)
Using a visitor implementation, I want to check whether the input is valid against the function metadata stored in the map.
Below is the lexer and parser that I'm using:
Lexer MyFunctionsLexer.g4:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
Parser MyFunctionsParser.g4:
parser grammar MyFunctionsParser;
options {
tokenVocab=MyFunctionsLexer;
}
function : FUNCTION '.' NAME '('(function | argument (',' argument)*)')';
argument: (NAME | function);
WS : [ \t\r\n]+ -> skip;
I am using Antlr4.
Below is the implementation I'm using as per the suggested answer.
Visitor Implementation:
public class FunctionValidateVisitorImpl extends MyFunctionsParserBaseVisitor<String> {
Map<String, String> map = new HashMap<String, String>();
public FunctionValidateVisitorImpl()
{
map.put("FUNCTION.add", "Integer:Integer,Integer");
map.put("FUNCTION.concat", "String:String,String");
map.put("FUNCTION.mul", "Integer:Integer,Integer");
map.put("FUNCTION.substring", "String:String,Integer,Integer");
}
@Override
public String visitFunctions(@NotNull MyFunctionsParser.FunctionsContext ctx) {
System.out.println("entered the visitFunctions::");
for (int i = 0; i < ctx.getChildCount(); ++i)
{
ParseTree c = ctx.getChild(i);
if ("<EOF>".equals(c.getText())) // compare string contents, not references
continue;
String top_level_result = visit(ctx.getChild(i));
System.out.println(top_level_result);
if (top_level_result == null)
{
System.out.println("Failed semantic analysis: "+ ctx.getChild(i).getText());
}
}
return null;
}
@Override
public String visitFunction( MyFunctionsParser.FunctionContext ctx) {
// Get function name and expected type information.
String name = ctx.getChild(2).getText();
String type=map.get("FUNCTION." + name);
if (type == null)
{
return null; // not declared in function table.
}
String result_type = type.split(":")[0];
String args_types = type.split(":")[1];
String[] expected_arg_type = args_types.split(",");
int j = 4;
ParseTree a = ctx.getChild(j);
if (a instanceof MyFunctionsParser.FunctionContext)
{
String v = visit(a);
if (!expected_arg_type[0].equals(v)) // a lone nested call is the first argument; use equals() for strings
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < ctx.getChildCount(); i += 2)
{
ParseTree parameter = ctx.getChild(i);
String v = visit(parameter);
if (!expected_arg_type[(i - j)/2].equals(v)) // use equals() for strings
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
@Override
public String visitArgument(ArgumentContext ctx){
ParseTree c = ctx.getChild(0);
if (c instanceof TerminalNodeImpl)
{
// Unclear what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.getText();
}
else
return visit(c);
}
}
Test class:
public class FunctionValidate {
public static void main(String[] args) {
String input = "FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
ANTLRInputStream str = new ANTLRInputStream(input);
MyFunctionsLexer lexer = new MyFunctionsLexer(str);
CommonTokenStream tokens = new CommonTokenStream(lexer);
MyFunctionsParser parser = new MyFunctionsParser(tokens);
parser.removeErrorListeners(); // remove ConsoleErrorListener
parser.addErrorListener(new VerboseListener()); // add ours
FunctionsContext tree = parser.functions();
FunctionValidateVisitorImpl visitor = new FunctionValidateVisitorImpl();
visitor.visit(tree);
}
}
Lexer:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
Parser:
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
Verbose Listener:
public class VerboseListener extends BaseErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
List<String> stack = ((Parser)recognizer).getRuleInvocationStack();
Collections.reverse(stack);
throw new FunctionInvalidException("line "+line+":"+charPositionInLine+" at "+ offendingSymbol+": "+msg);
}
}
Output:
It is not entering visitor implementation as it is not printing System.out.println("entered the visitFunctions::"); statement.

Below is a solution in C#. This should give you an idea of how to proceed. You should be able to easily translate the code to Java.
For ease, I implemented the code using my extension AntlrVSIX for Visual Studio 2019 with NET Core C#. It makes life easier using a full IDE that supports the building of split lexer/parser grammars, debugging, and a plug-in that is suited for editing Antlr grammars.
There are several things to note about your grammar. First, your parser grammar isn't accepted by Antlr 4.7.2. The production "WS : [ \t\r\n]+ -> skip;" is a lexer rule, so it can't go in a parser grammar. It has to go into the lexer grammar (or you define a combined grammar). Second, I personally wouldn't define lexer symbols like DOT and then use the literal from the right-hand side of the lexer rule, e.g., '.', directly in the parser grammar. It's confusing, and I'm pretty sure no IDE or editor would know how to go to the definition "DOT: '.';" in the lexer grammar if you positioned your cursor on the '.' in the parser grammar. I never understood why it's allowed in Antlr, but c'est la vie. I would instead use the lexer symbol you define. Third, I would consider augmenting the parser grammar in the usual way with EOF, e.g., "functions : function* EOF". But this is entirely up to you.
Now, on the problem statement, your example input contains an inconsistency. In the first case, "substring(String,Integer,Integer)", the input is in a meta-like description of substring(). In the second case, "substring(\"test\",1,1)", you are parsing code. The first case parses with your grammar, the second does not--there's no string literal lexer rule defined in your lexer grammar. It's unclear what you really want to parse.
Overall, I defined the visitor code over strings, i.e., each method returns a string representing the output type of the function or argument, e.g., "Integer" or "String", or null if there was an error (or you could throw an exception for static semantic errors). Then, using Visit() on each child in the parse tree node, check whether the resulting string is the expected one, and handle mismatches as you like.
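To make the idea concrete without needing the generated ANTLR classes, here is a minimal Java sketch of the same bottom-up check. The Node class is a hypothetical stand-in for a parse tree node (it is not part of ANTLR), and the argument-count check is an addition of mine that the C# code below does not perform. Each call to typeOf() returns the type string of a subtree, or null on a mismatch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TypeCheckSketch {
    // Hypothetical stand-in for a parse tree node: either a function call
    // with arguments, or a leaf that simply names a type such as "String".
    public static final class Node {
        public final String name;
        public final boolean isCall;
        public final List<Node> args = new ArrayList<>();

        private Node(String name, boolean isCall) {
            this.name = name;
            this.isCall = isCall;
        }

        public static Node leaf(String typeName) {
            return new Node(typeName, false);
        }

        public static Node call(String name, Node... children) {
            Node n = new Node(name, true);
            for (Node child : children) {
                n.args.add(child);
            }
            return n;
        }
    }

    // Same metadata format as in the question: "returnType:param1,param2".
    private static final Map<String, String> SIGNATURES = new HashMap<>();
    static {
        SIGNATURES.put("FUNCTION.add", "Integer:Integer,Integer");
        SIGNATURES.put("FUNCTION.concat", "String:String,String");
        SIGNATURES.put("FUNCTION.mul", "Integer:Integer,Integer");
        SIGNATURES.put("FUNCTION.substring", "String:String,Integer,Integer");
    }

    // Returns the type of the subtree, or null if it fails the check.
    public static String typeOf(Node node) {
        if (!node.isCall) {
            return node.name; // a leaf already is a type name
        }
        String signature = SIGNATURES.get("FUNCTION." + node.name);
        if (signature == null) {
            return null; // not declared in the function table
        }
        String[] parts = signature.split(":");
        String[] expected = parts[1].split(",");
        if (expected.length != node.args.size()) {
            return null; // wrong number of arguments
        }
        for (int i = 0; i < expected.length; i++) {
            // equals() compares string contents; a null child type never matches.
            if (!expected[i].equals(typeOf(node.args.get(i)))) {
                return null; // type mismatch
            }
        }
        return parts[0];
    }

    public static void main(String[] args) {
        Node ok = Node.call("concat",
                Node.call("substring", Node.leaf("String"), Node.leaf("Integer"), Node.leaf("Integer")),
                Node.leaf("String"));
        Node bad = Node.call("add", Node.leaf("String"), Node.leaf("Integer"));
        System.out.println(typeOf(ok));  // prints String
        System.out.println(typeOf(bad)); // prints null
    }
}
```

In a real ANTLR visitor the recursion would go through visit() on the child contexts instead of typeOf() on Node objects, but the flow of type strings up the tree is the same.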
One other thing to note. You can solve this problem via a visitor or listener class. The visitor class is useful for purely synthesized attributes. In this example solution, I return a string that represents the type of the function or arg up the associated parse tree, checking the value for each important child. The listener class is useful for L-attributed grammars--i.e., where you are passing attributes in a DFS-oriented manner, left to right at each node in the tree. For this example, you could use the listener class and only override the Exit() functions, but you would then need a Map/Dictionary to map a "context" into an attribute (string).
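The listener variant can be sketched in plain Java as well. This is only an illustration under assumptions: Node and walk() are hypothetical stand-ins for the parse tree and the tree walker, and the IdentityHashMap plays the role of the context-to-attribute map mentioned above (the ANTLR Java runtime ships ParseTreeProperty for this purpose). Only the exit step does any work, and it reads the child types from the map instead of from return values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ExitListenerSketch {
    // Hypothetical stand-in for a parse tree node (not an ANTLR class).
    public static final class Node {
        public final String name;
        public final boolean isCall;
        public final List<Node> args = new ArrayList<>();

        private Node(String name, boolean isCall) {
            this.name = name;
            this.isCall = isCall;
        }

        public static Node leaf(String typeName) {
            return new Node(typeName, false);
        }

        public static Node call(String name, Node... children) {
            Node n = new Node(name, true);
            for (Node child : children) {
                n.args.add(child);
            }
            return n;
        }
    }

    private final Map<String, String> signatures = new HashMap<>();
    // Maps each visited node to its computed type (null means "failed").
    private final Map<Node, String> types = new IdentityHashMap<>();

    public ExitListenerSketch() {
        signatures.put("FUNCTION.add", "Integer:Integer,Integer");
        signatures.put("FUNCTION.concat", "String:String,String");
        signatures.put("FUNCTION.mul", "Integer:Integer,Integer");
        signatures.put("FUNCTION.substring", "String:String,Integer,Integer");
    }

    // Depth-first, left-to-right walk; exit() runs after all children.
    public void walk(Node node) {
        for (Node arg : node.args) {
            walk(arg);
        }
        exit(node);
    }

    private void exit(Node node) {
        if (!node.isCall) {
            types.put(node, node.name);
            return;
        }
        String result = null;
        String signature = signatures.get("FUNCTION." + node.name);
        if (signature != null) {
            String[] parts = signature.split(":");
            String[] expected = parts[1].split(",");
            boolean ok = expected.length == node.args.size();
            for (int i = 0; ok && i < expected.length; i++) {
                // Child types were stored when the children were exited.
                ok = expected[i].equals(types.get(node.args.get(i)));
            }
            if (ok) {
                result = parts[0];
            }
        }
        types.put(node, result);
    }

    public String typeOf(Node node) {
        return types.get(node);
    }

    public static void main(String[] args) {
        Node tree = Node.call("mul",
                Node.call("add", Node.leaf("Integer"), Node.leaf("Integer")),
                Node.leaf("Integer"));
        ExitListenerSketch listener = new ExitListenerSketch();
        listener.walk(tree);
        System.out.println(listener.typeOf(tree)); // prints Integer
    }
}
```

The map is keyed by node identity rather than by equals(), since two different nodes in the tree can have identical text.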
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
using Antlr4.Runtime;
namespace AntlrConsole2
{
public class Program
{
static void Main(string[] args)
{
var input = @"FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
var str = new AntlrInputStream(input);
var lexer = new MyFunctionsLexer(str);
var tokens = new CommonTokenStream(lexer);
var parser = new MyFunctionsParser(tokens);
var listener = new ErrorListener<IToken>();
parser.AddErrorListener(listener);
var tree = parser.functions();
if (listener.had_error)
{
System.Console.WriteLine("error in parse.");
}
else
{
System.Console.WriteLine("parse completed.");
}
var visitor = new Validate();
visitor.Visit(tree);
}
}
}
namespace AntlrConsole2
{
using System;
using Antlr4.Runtime.Misc;
using System.Collections.Generic;
class Validate : MyFunctionsParserBaseVisitor<string>
{
Dictionary<String, String> map = new Dictionary<String, String>();
public Validate()
{
map.Add("FUNCTION.add", "Integer:Integer,Integer");
map.Add("FUNCTION.concat", "String:String,String");
map.Add("FUNCTION.mul", "Integer:Integer,Integer");
map.Add("FUNCTION.substring", "String:String,Integer,Integer");
}
public override string VisitFunctions([NotNull] MyFunctionsParser.FunctionsContext context)
{
for (int i = 0; i < context.ChildCount; ++i)
{
var c = context.GetChild(i);
if (c.GetText() == "<EOF>")
continue;
var top_level_result = Visit(context.GetChild(i));
if (top_level_result == null)
{
System.Console.WriteLine("Failed semantic analysis: "
+ context.GetChild(i).GetText());
}
}
return null;
}
public override string VisitFunction(MyFunctionsParser.FunctionContext context)
{
// Get function name and expected type information.
var name = context.GetChild(2).GetText();
map.TryGetValue("FUNCTION." + name, out string type);
if (type == null)
{
return null; // not declared in function table.
}
string result_type = type.Split(":")[0];
string args_types = type.Split(":")[1];
string[] expected_arg_type = args_types.Split(",");
const int j = 4;
var a = context.GetChild(j);
if (a is MyFunctionsParser.FunctionContext)
{
var v = Visit(a);
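if (v != expected_arg_type[0]) // a lone nested call is the first argument, so compare with the first expected parameter type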
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < context.ChildCount; i += 2)
{
var parameter = context.GetChild(i);
var v = Visit(parameter);
if (v != expected_arg_type[(i - j)/2])
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
public override string VisitArgument([NotNull] MyFunctionsParser.ArgumentContext context)
{
var c = context.GetChild(0);
if (c is Antlr4.Runtime.Tree.TerminalNodeImpl)
{
// Unclear what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.GetText();
}
else
return Visit(c);
}
}
}
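On the open question in VisitArgument, one possible reading is that a leaf argument is either a literal or a bare type name. The Java sketch below follows that reading; it is an assumption, not something the question settles. The typeOfArgument helper is hypothetical, and the quoted-string case only becomes reachable once the lexer grammar gets a string literal rule, which the grammar above does not have.

```java
public class ArgumentTypeSketch {
    // Maps the text of a leaf argument to a type name (assumed convention):
    //   digits only       -> "Integer"
    //   "double quoted"   -> "String"
    //   anything else     -> treated as a type name and returned unchanged
    public static String typeOfArgument(String text) {
        if (text.matches("[0-9]+")) {
            return "Integer";
        }
        if (text.length() >= 2 && text.startsWith("\"") && text.endsWith("\"")) {
            return "String";
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(typeOfArgument("1"));        // prints Integer
        System.out.println(typeOfArgument("\"test\"")); // prints String
        System.out.println(typeOfArgument("Integer"));  // prints Integer
    }
}
```

With a helper like this, the terminal branch would return typeOfArgument(c.getText()) instead of the raw text, so that both forms of input from the question go through the same comparison.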

Related

In antlr visitor pattern how to navigate from one method to another

I am new to ANTLR and want to know how to navigate from one visit method to another so that each method is entered. I want to implement the following using ANTLR4. I have the functions below.
The project is on GitHub: https://github.com/VIKRAMAS/AntlrNestedFunctionParser/tree/master
1. FUNCTION.add(Integer a,Integer b)
2. FUNCTION.concat(String a,String b)
3. FUNCTION.mul(Integer a,Integer b)
And I am storing the functions metadata like this.
Map<String,String> map=new HashMap<>();
map.put("FUNCTION.add","Integer:Integer,Integer");
map.put("FUNCTION.concat","String:String,String");
map.put("FUNCTION.mul","Integer:Integer,Integer");
Here, Integer:Integer,Integer means that Integer is the return type and the function accepts the input parameters Integer,Integer.
If the input is something like this
FUNCTION.concat(Function.substring(String,Integer,Integer),String)
or
FUNCTION.concat(Function.substring("test",1,1),String)
Using a visitor implementation, I want to check whether the input is valid against the function metadata stored in the map.
Below is the lexer and parser that I'm using:
Lexer MyFunctionsLexer.g4:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
Parser MyFunctionsParser.g4:
parser grammar MyFunctionsParser;
options {
tokenVocab=MyFunctionsLexer;
}
function : FUNCTION '.' NAME '('(function | argument (',' argument)*)')';
argument: (NAME | function);
WS : [ \t\r\n]+ -> skip;
I am using Antlr4.
Below is the implementation I'm using as per the suggested answer.
Visitor Implementation:
public class FunctionValidateVisitorImpl extends MyFunctionsParserBaseVisitor<String> {
Map<String, String> map = new HashMap<String, String>();
public FunctionValidateVisitorImpl()
{
map.put("FUNCTION.add", "Integer:Integer,Integer");
map.put("FUNCTION.concat", "String:String,String");
map.put("FUNCTION.mul", "Integer:Integer,Integer");
map.put("FUNCTION.substring", "String:String,Integer,Integer");
}
@Override
public String visitFunctions(@NotNull MyFunctionsParser.FunctionsContext ctx) {
System.out.println("entered the visitFunctions::");
for (int i = 0; i < ctx.getChildCount(); ++i)
{
ParseTree c = ctx.getChild(i);
if ("<EOF>".equals(c.getText())) // compare string contents, not references
continue;
String top_level_result = visit(ctx.getChild(i));
System.out.println(top_level_result);
if (top_level_result == null)
{
System.out.println("Failed semantic analysis: "+ ctx.getChild(i).getText());
}
}
return null;
}
@Override
public String visitFunction( MyFunctionsParser.FunctionContext ctx) {
// Get function name and expected type information.
String name = ctx.getChild(2).getText();
String type=map.get("FUNCTION." + name);
if (type == null)
{
return null; // not declared in function table.
}
String result_type = type.split(":")[0];
String args_types = type.split(":")[1];
String[] expected_arg_type = args_types.split(",");
int j = 4;
ParseTree a = ctx.getChild(j);
if (a instanceof MyFunctionsParser.FunctionContext)
{
String v = visit(a);
if (!expected_arg_type[0].equals(v)) // a lone nested call is the first argument; use equals() for strings
{
return null; // Handle type mismatch.
}
} else {
for (int i = j; i < ctx.getChildCount(); i += 2)
{
ParseTree parameter = ctx.getChild(i);
String v = visit(parameter);
if (!expected_arg_type[(i - j)/2].equals(v)) // use equals() for strings
{
return null; // Handle type mismatch.
}
}
}
return result_type;
}
@Override
public String visitArgument(ArgumentContext ctx){
ParseTree c = ctx.getChild(0);
if (c instanceof TerminalNodeImpl)
{
// Unclear what this is supposed to parse:
// Mutate "1" to "Integer"?
// Mutate "Integer" to "String"?
// Or what?
return c.getText();
}
else
return visit(c);
}
}
Test class:
public class FunctionValidate {
public static void main(String[] args) {
String input = "FUNCTION.concat(FUNCTION.substring(String,Integer,Integer),String)";
ANTLRInputStream str = new ANTLRInputStream(input);
MyFunctionsLexer lexer = new MyFunctionsLexer(str);
CommonTokenStream tokens = new CommonTokenStream(lexer);
MyFunctionsParser parser = new MyFunctionsParser(tokens);
parser.removeErrorListeners(); // remove ConsoleErrorListener
parser.addErrorListener(new VerboseListener()); // add ours
FunctionsContext tree = parser.functions();
FunctionValidateVisitorImpl visitor = new FunctionValidateVisitorImpl();
visitor.visit(tree);
}
}
Lexer:
lexer grammar MyFunctionsLexer;
FUNCTION: 'FUNCTION';
NAME: [A-Za-z0-9]+;
DOT: '.';
COMMA: ',';
L_BRACKET: '(';
R_BRACKET: ')';
WS : [ \t\r\n]+ -> skip;
Parser:
parser grammar MyFunctionsParser;
options { tokenVocab=MyFunctionsLexer; }
functions : function* EOF;
function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')';
argument: (NAME | function);
Verbose Listener:
public class VerboseListener extends BaseErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
List<String> stack = ((Parser)recognizer).getRuleInvocationStack();
Collections.reverse(stack);
throw new FunctionInvalidException("line "+line+":"+charPositionInLine+" at "+ offendingSymbol+": "+msg);
}
}
Output:
It is not entering visitor implementation as it is not printing System.out.println("entered the visitFunctions::"); statement. I am not able to walk through the child nodes by using visit method.
You have a version skew between your generated parser and the runtime. Further, you have a version skew in your generated .java files, as though you downloaded and ran two Antlr tool versions (4.4 and 4.7.2), once without the -visitor option, then again with it. The source for MyFunctionsParser.java is in AntlrNestedFunctionParser\FunctionValidator\target\generated-sources\antlr4\com\functionvalidate\validate. At the top of the file, it says
// Generated from MyFunctionsParser.g4 by ANTLR 4.4
The source for MyFunctionsParserVisitor.java is
// Generated from com\functionvalidate\validate\MyFunctionsParser.g4 by ANTLR 4.7.2
The runtime is 4.7.2, which you state in pom.xml in AntlrNestedFunctionParser\FunctionValidator. There's MyFunctionsLexer.tokens defined in at least two locations, which one you are picking up, who knows. I'm not familiar with the Antlr build rules associated with the pom.xml, but what was generated is a mess (which is why I wrote my own build rules and editor for Antlr for C#). Make sure you clean the target directory completely, generate clean fresh up-to-date .java files, and you are using the right Antlr runtime 4.7.2.
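A quick way to spot this kind of skew is to compare the "Generated from ... by ANTLR x.y" header comments across the generated sources. The Java sketch below is only an illustration: the class and method names are made up, and it works on the header lines as strings (reading the first line of each file under target/generated-sources is left to the caller). The header format is the one quoted above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GeneratedVersionCheck {
    // Matches the header comment the ANTLR tool writes at the top of generated files.
    private static final Pattern HEADER =
            Pattern.compile("^// Generated from .* by ANTLR (\\S+)\\s*$");

    // Extracts the tool version from a header line, if the line is such a header.
    public static Optional<String> versionOf(String firstLine) {
        Matcher m = HEADER.matcher(firstLine);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Collects the distinct tool versions seen; more than one indicates a skew.
    public static Set<String> distinctVersions(List<String> firstLines) {
        Set<String> versions = new TreeSet<>();
        for (String line : firstLines) {
            versionOf(line).ifPresent(versions::add);
        }
        return versions;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList(
                "// Generated from MyFunctionsParser.g4 by ANTLR 4.4",
                "// Generated from com\\functionvalidate\\validate\\MyFunctionsParser.g4 by ANTLR 4.7.2");
        Set<String> versions = distinctVersions(headers);
        System.out.println(versions); // prints [4.4, 4.7.2]
        if (versions.size() > 1) {
            System.out.println("Version skew: clean and regenerate.");
        }
    }
}
```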

ANTLR4 throw exception and exit if any error occurs using java

I have a custom function written like this: Functionconcat(Functionadd(1,2),1).
Below is the grammar I am using.
I have since changed the grammar; it now works and checks the return type of a parameter if the function exists:
grammar FunctionGrammer;
INT
: [0-9]+
;
ID
: [a-zA-Z_] [a-zA-Z_0-9]*
;
expr : add;
string : 'Functionadd(' INT ',' INT ')' ;
add : 'Functionadd(' INT ',' INT ')' ;
concat : 'Functionconcat(' (string|INT )','INT ')';
Below is the test method:
public static void main(String[] args) {
try {
CodePointCharStream input = CharStreams.fromString("Functionconcat(Functionadd(m,2),1)");
FunctionGrammerLexer lexer = new FunctionGrammerLexer(input);
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
FunctionGrammerParser parser = new FunctionGrammerParser(tokenStream);
parser.setErrorHandler(new BailErrorStrategy());
ParseTree tree = parser.concat();
FunctionCompilerImpl visitor = new FunctionCompilerImpl();
visitor.visit(tree);
} catch (ParseCancellationException e) {
e.printStackTrace();
}
}
It throws an exception, but how can I produce a meaningful exception, such as a parameter/parameter datatype mismatch exception or a function name mismatch exception?
visitor impl:
public class FunctionCompilerImpl extends FunctionGrammerBaseVisitor<String>{
@Override
public String visitConcat(FunctionGrammerParser.ConcatContext ctx) {
super.visitConcat(ctx);
System.out.println("==================="+ctx);
return null;
}
}
But the visitor is not printing System.out.println("==================="+ctx);.

How can I detect if a user enters a string which does not follow my ANTLR grammar rules?

I am making a Computer Algebra System which will take an algebraic expression and simplify or differentiate it.
As you can see in the following code, the user input is taken, but if it is a string that does not conform to my grammar rules, the error
line 1:6 mismatched input '<EOF>' expecting {'(', INT, VAR} occurs and the program continues running.
How would I catch the error and stop the program from running? Thank you in advance for any help.
Controller class:
public static void main(String[] args) throws IOException {
String userInput = "x*x*x+";
getAST(userInput);
}
public static AST getAST(String userInput) {
ParseTree tree = null;
ExpressionLexer lexer = null;
ANTLRInputStream input = new ANTLRInputStream(userInput);
try {
lexer = new ExpressionLexer(input);
}catch(Exception e) {
System.out.println("Incorrect grammar");
}
System.out.println("Lexer created");
CommonTokenStream tokens = new CommonTokenStream(lexer);
System.out.println("Tokens created");
ExpressionParser parser = new ExpressionParser(tokens);
System.out.println("Tokens parsed");
tree = parser.expr();
System.out.println("Tree created");
System.out.println(tree.toStringTree(parser)); // print LISP-style tree
Trees.inspect(tree, parser);
ParseTreeWalker walker = new ParseTreeWalker();
ExpressionListener listener = new buildAST();
walker.walk(listener, tree);
listener.printAST();
listener.extractExpression();
return new AST();
}
}
My Grammar:
grammar Expression;
@header {
package exprs;
}
@members {
// This method makes the parser stop running if it encounters
// invalid input and throw a RuntimeException.
public void reportErrorsAsExceptions() {
//removeErrorListeners();
addErrorListener(new ExceptionThrowingErrorListener());
}
private static class ExceptionThrowingErrorListener extends BaseErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer,
Object offendingSymbol, int line, int charPositionInLine,
String msg, RecognitionException e) {
throw new RuntimeException(msg);
}
}
}
@rulecatch {
// ANTLR does not generate its normal rule try/catch
catch(RecognitionException e) {
throw e;
}
}
expr : left=expr op=('*'|'/'|'^') right=expr
| left=expr op=('+'|'-') right=expr
| '(' expr ')'
| atom
;
atom : INT|VAR;
INT : ('0'..'9')+ ;
VAR : ('a' .. 'z') | ('A' .. 'Z') | '_';
WS : [ \t\r\n]+ -> skip ;
A typical parse run with ANTLR4 consists of 2 stages:
1. A "quick and dirty" run with SLL prediction mode that bails out on the first syntax error found.
2. A normal run using the LL prediction mode, which tries to recover from parser errors. This second step only needs to be executed if there was an error in the first step.
The first step is a loose parse run that doesn't resolve certain ambiguities and hence can report an error that doesn't really exist (when resolved in LL mode). But the first step is faster and so delivers a quicker result for syntactically correct input. This (JS) code shows the setup:
this.parser.removeErrorListeners();
this.parser.addErrorListener(this.errorListener);
this.parser.errorHandler = new BailErrorStrategy();
this.parser.interpreter.setPredictionMode(PredictionMode.SLL);
try {
this.tree = this.parser.grammarSpec();
} catch (e) {
if (e instanceof ParseCancellationException) {
this.tokenStream.seek(0);
this.parser.reset();
this.parser.errorHandler = new DefaultErrorStrategy();
this.parser.interpreter.setPredictionMode(PredictionMode.LL);
this.tree = this.parser.grammarSpec();
} else {
throw e;
}
}
To avoid any recovery attempt for syntax errors in the first step, you also have to set the BailErrorStrategy. This strategy simply throws a ParseCancellationException in case of a syntax error (similar to what you do in your code). You could add your own handling in the catch clause to ask the user for correct input and rerun the parse step.

Using ScriptEngine in java, How can I extract function list?

Using Jsoup, I extract the JavaScript part of an HTML file and store it as a Java String object.
I want to extract the function list, and the list of variables inside each JavaScript function, using javax.script.ScriptEngine.
The JavaScript part has several function sections.
For example:
function a() {
var a_1;
var a_2
...
}
function b() {
var b_1;
var b_2;
...
}
function c() {
var c_1;
var c_2;
...
}
My goal is shown below.
List funcList
a
b
c
List varListA
a_1
a_2
...
List varListB
b_1
b_2
...
List varListC
c_1
c_2
...
How can I extract the function list and the variable lists (or maybe the values)?
I think you can do this by using javascript introspection after having loaded the javascript in the Engine - e.g. for functions:
ScriptEngine engine;
// create the engine and have it load your javascript
Bindings bind = engine.getBindings(ScriptContext.ENGINE_SCOPE);
Set<String> allAttributes = bind.keySet();
Set<String> allFunctions = new HashSet<String>();
for ( String attr : allAttributes ) {
if ( "function".equals( engine.eval("typeof " + attr) ) ) {
allFunctions.add(attr);
}
}
System.out.println(allFunctions);
I haven't found a way to extract the variables inside functions (local variables) without delving in internal mechanics (and thus unsafe to use) of the javascript scripting engine.
It is pretty tricky. The ScriptEngine API does not seem well suited to inspecting code. So I have this rather ugly solution using the instanceof and cast operators.
Bindings bindings = engine.getBindings(ScriptContext.ENGINE_SCOPE);
for (Map.Entry<String, Object> scopeEntry : bindings.entrySet()) {
Object value = scopeEntry.getValue();
String name = scopeEntry.getKey();
if (value instanceof NativeFunction) {
log.info("Function -> " + name);
NativeFunction function = NativeFunction.class.cast(value);
DebuggableScript debuggableFunction = function.getDebuggableView();
for (int i = 0; i < debuggableFunction.getParamAndVarCount(); i++) {
log.info("First level arg: " + debuggableFunction.getParamOrVarName(i));
}
} else if (value instanceof Undefined
|| value instanceof String
|| value instanceof Number) {
log.info("Global arg -> " + name);
}
}
I had a similar issue. Maybe this will be helpful for others.
I use Groovy as the script language. My task was to retrieve all invokable functions from the script and then filter these functions by some criteria.
Unfortunately, this approach is useful only for Groovy...
Get script engine:
public ScriptEngine getEngine() throws Exception {
if (engine == null)
engine = new ScriptEngineManager().getEngineByName(scriptType);
if (engine == null)
throw new Exception("Could not find implementation of " + scriptType);
return engine;
}
Compile and evaluate script:
public void evaluateScript(String script) throws Exception {
Bindings bindings = getEngine().getBindings(ScriptContext.ENGINE_SCOPE);
bindings.putAll(binding);
try {
if (engine instanceof Compilable)
compiledScript = ((Compilable)getEngine()).compile(script);
getEngine().eval(script);
} catch (Throwable e) {
e.printStackTrace();
}
}
Get functions from the script. I did not find any way to get all invokable methods from the script other than reflection. Yes, I know that this approach depends on the ScriptEngine implementation, but it's the only one :)
public List getInvokableList() throws ScriptException {
List list = new ArrayList();
try {
Class compiledClass = compiledScript.getClass();
Field clasz = compiledClass.getDeclaredField("clasz");
clasz.setAccessible(true);
Class scrClass = (Class)clasz.get(compiledScript);
Method[] methods = scrClass.getDeclaredMethods();
clasz.setAccessible(false);
for (int i = 0, j = methods.length; i < j; i++) {
Annotation[] annotations = methods[i].getDeclaredAnnotations();
boolean ok = false;
for (int k = 0, m = annotations.length; k < m; k++) {
ok = annotations[k] instanceof CalculatedField;
if (ok) break;
}
if (ok)
list.add(methods[i].getName());
}
} catch (NoSuchFieldException e) {
e.printStackTrace();
} catch (IllegalAccessException e) {
}
return list;
}
In my task I don't need all functions, so I created a custom annotation and use it in the script:
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface CalculatedField {
}
Script example:
import com.vssk.CalculatedField;
def utilFunc(s) {
s
}
@CalculatedField
def func3() {
utilFunc('Testing func from groovy')
}
Method to invoke a script function by its name:
public Object executeFunc(String name) throws Exception {
return ((Invocable)getEngine()).invokeFunction(name);
}

Using ANTLR to identify global variable declarations in a JavaScript file

I've been using the ANTLR-supplied ECMAScript grammar with the objective of identifying JavaScript global variables. An AST is produced and I'm now wondering what the best way of filtering out the global variable declarations is.
I'm interested in looking for all of the outermost "variableDeclaration" tokens in my AST; the actual how-to-do-this is eluding me though. Here's my set up code so far:
String input = "var a, b; var c;";
CharStream cs = new ANTLRStringStream(input);
JavaScriptLexer lexer = new JavaScriptLexer(cs);
CommonTokenStream tokens = new CommonTokenStream();
tokens.setTokenSource(lexer);
JavaScriptParser parser = new JavaScriptParser(tokens);
program_return programReturn = parser.program();
Being new to ANTLR can anyone offer any pointers?
I guess you're using this grammar.
Although that grammar suggests a proper AST is created, this is not the case. It uses some inline operators to exclude certain tokens from the parse-tree, but it never creates any roots for the tree, resulting in a completely flat parse tree. From this, you can't get all global vars in a reasonable way.
You'll need to adjust the grammar slightly:
Add the following under the options { ... } at the top of the grammar file:
tokens
{
VARIABLE;
FUNCTION;
}
Now replace the following rules: functionDeclaration, functionExpression and variableDeclaration with these:
functionDeclaration
: 'function' LT* Identifier LT* formalParameterList LT* functionBody
-> ^(FUNCTION Identifier formalParameterList functionBody)
;
functionExpression
: 'function' LT* Identifier? LT* formalParameterList LT* functionBody
-> ^(FUNCTION Identifier? formalParameterList functionBody)
;
variableDeclaration
: Identifier LT* initialiser?
-> ^(VARIABLE Identifier initialiser?)
;
Now a more suitable tree is generated. If you now parse the source:
var a = 1; function foo() { var b = 2; } var c = 3;
the following tree is generated:
All you now have to do is iterate over the children of the root of your tree and when you stumble upon a VARIABLE token, you know it's a "global" since all other variables will be under FUNCTION nodes.
Here's how to do that:
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
public class Main {
public static void main(String[] args) throws Exception {
String source = "var a = 1; function foo() { var b = 2; } var c = 3;";
ANTLRStringStream in = new ANTLRStringStream(source);
JavaScriptLexer lexer = new JavaScriptLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
JavaScriptParser parser = new JavaScriptParser(tokens);
JavaScriptParser.program_return returnValue = parser.program();
CommonTree tree = (CommonTree)returnValue.getTree();
for(Object o : tree.getChildren()) {
CommonTree child = (CommonTree)o;
if(child.getType() == JavaScriptParser.VARIABLE) {
System.out.println("Found a global var: "+child.getChild(0));
}
}
}
}
which produces the following output:
Found a global var: a
Found a global var: c
