I'm trying to design a simple query language as follows:
grammar FilterExpression;
// Lexer rules
AND : 'AND' ;
OR : 'OR' ;
NOT : 'NOT';
GT : '>' ;
GE : '>=' ;
LT : '<' ;
LE : '<=' ;
EQ : '=' ;
DECIMAL : '-'?[0-9]+('.'[0-9]+)? ;
KEY : ~[ \t\r\n\\"~=<>:(),]+ ;
QUOTED_WORD: ["] ('\\"' | ~["])* ["] ;
NEWLINE : '\r'? '\n';
WS : [ \t\r\n]+ -> skip ;
StringFilter : KEY ':' QUOTED_WORD;
NumericalFilter : KEY (GT | GE | LT | LE | EQ) DECIMAL;
condition : StringFilter # stringCondition
| NumericalFilter # numericalCondition
| StringFilter op=(AND|OR) StringFilter # combinedStringCondition
| NumericalFilter op=(AND|OR) NumericalFilter # combinedNumericalCondition
| condition AND condition # combinedCondition
| '(' condition ')' # parens
;
I added a few tests and would like to verify that they work as expected. To my surprise, some cases that should clearly be wrong passed.
For instance when I type
(brand:"apple" AND t>3) 1>3
where 1>3 is deliberately put in as an error. However, it seems ANTLR still happily generates a tree for it.
Is it because my grammar has some problems I didn't realize?
I also tried the IntelliJ plugin (because I thought grun might not be behaving as expected), but it gives the same result.
Here is the test code I'm using. Note that I also tried BailErrorStrategy, but that doesn't seem to help:
public class ParserTest {
private class BailLexer extends FilterExpressionLexer {
public BailLexer(CharStream input) {
super(input);
}
@Override
public void recover(LexerNoViableAltException e) {
throw new RuntimeException(e);
}
}
private FilterExpressionParser createParser(String filterString) {
//FilterExpressionLexer lexer = new FilterExpressionLexer(CharStreams.fromString(filterString));
FilterExpressionLexer lexer = new BailLexer(CharStreams.fromString(filterString));
CommonTokenStream tokens = new CommonTokenStream(lexer);
FilterExpressionParser parser = new FilterExpressionParser(tokens);
parser.setErrorHandler(new BailErrorStrategy());
parser.addErrorListener(new ANTLRErrorListener() {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
System.out.print("here1");
}
@Override
public void reportAmbiguity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs) {
System.out.print("here2");
}
@Override
public void reportAttemptingFullContext(Parser recognizer, DFA dfa, int startIndex, int stopIndex, BitSet conflictingAlts, ATNConfigSet configs) {
System.out.print("here3");
}
@Override
public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction, ATNConfigSet configs) {
System.out.print("here4");
}
});
return parser;
}
@Test
public void test() {
FilterExpressionParser parser = createParser("(brand:\"apple\" AND t>3) 1>3");
parser.condition();
}
}
Looks like I finally found the answer.
The reason is that the grammar does not end with EOF, and in ANTLR it is perfectly fine to parse only a prefix of the input. That's why the rest of the test string
(brand:"apple" AND t>3) 1>3, i.e. 1>3, is allowed.
See discussion here: https://github.com/antlr/antlr4/issues/351
Then I changed the grammar slightly so that the start rule ends with EOF (condition EOF), and everything works.
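For reference, a minimal sketch of that change (the rule name filter is my own choice; any new start rule that wraps condition works the same way):

```antlr
// New start rule: matching EOF forces the parser to consume the whole
// input, so trailing garbage such as "1>3" is now a syntax error.
filter : condition EOF ;
```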
I'm working on a parser and require custom errors to be thrown for every keyword. My code is the following.
SKIP: { " " | "\t" | "\n" | "\r" }
TOKEN: { "DEF" | "MAIN" | <NAME: (["A"-"Z"])+> | <PARAM: (["a"-"z"])+> | <NUM: (["0"-"9"])+> }
void Start(): {} {(Def() Func())+ <EOF>}
void Def(): {} {"DEF" | { throw new ParseException("expected keyword DEF"); }}
void Func(): {} {"MAIN" | Name() Param() | { throw new ParseException("Expected MAIN or NAME PARAM"); }}
void Name(): {} {<NAME> | { throw new ParseException("invalid function name"); }}
void Param(): {} { <PARAM> | { throw new ParseException("invalid PARAM"); }}
The Start() function gives me the error Expansion within "(...)+" can be matched by empty string. I think the problem is in the Name() Param() part of Func(), but I do not know how to change this while still throwing custom error messages. Can anyone provide some pointers?
While I agree with the comment from user207421, you could maybe do the following:
void oneOrMoreThings() : {} {
    ( Thing() | { throw new ParseException( ... ); } )
    ( Thing() )*
}
Alternatively, make DEF optional, then check whether it was found and raise the exception if not:
void Start(): {Token tk = null;} { (tk = "DEF")? { if (tk == null) throw ... } "MAIN" etc.
I'm making a Java-to-Python translator with the help of the flex and bison tools. The bison rules cover a restricted subset of the Java grammar. Alongside the bison rules, I also build an abstract syntax tree (AST) as an intermediate representation; the AST nodes are created in the semantic actions.
My problem concerns the handling of lists of elements (i.e. recursion) in the bison rules.
Given the following input file, parsing completes without syntax errors, but when I traverse the AST in pre-order for testing, the traversal seems to stop at the first child node of each list and does not continue with the remaining children.
TEXT FILE IN INPUT:
import java.util.*;
class table {
int a;
int c;
}
class ball {
int a;
}
Here are the bison grammar rules involved:
Program
: ImportStatement ClassDeclarations { set_parse_tree($$ = program_new($1,$2,2));}
;
ImportStatement
: IMPORT LIBRARY SEMICOLON {$$ = import_new($2,1); printf("Type di import: %d \n", $$->type);}
| %empty {$$ = import_new(NULL,0); }
;
ClassDeclarations
: ClassDeclaration { $$ = list_new(CLASS_DECLARATIONS,$1,NULL,2); }
| ClassDeclarations ClassDeclaration { list_append( $$ = $1, list_new(CLASS_DECLARATIONS,$2,NULL,2)); }
;
ClassDeclaration
: CLASS NameID LBRACE FieldDeclarations RBRACE { $$ = classDec_new($2,$4,2); }
| PUBLIC CLASS NameID LBRACE FieldDeclarations RBRACE { $$ = classDec_new($3,$5,2);}
;
FieldDeclarations
: FieldDeclaration {$$ = list_new(FIELD_DECLARATIONS,$1,NULL,2); }
| FieldDeclarations FieldDeclaration { list_append( $$ = $1, list_new(FIELD_DECLARATIONS,$2,NULL,2)); }
;
FieldDeclaration
: VariableFieldDeclaration {$$ = fieldDec_new($1,NULL,NULL,3);}
| PUBLIC VariableFieldDeclaration {$$ = fieldDec_new($2,NULL,NULL,3);}
| MethodFieldDeclaration {$$ = fieldDec_new(NULL,$1,NULL,3);}
| ConstructorDeclaration {$$ = fieldDec_new(NULL,NULL,$1,3);}
;
VariableFieldDeclaration
: Type VariableDeclarations SEMICOLON {$$ = variableFieldDec_new($1,$2,2);}
;
VariableDeclarations
: VariableDeclaration {$$ = list_new(VARIABLE_DECLARATIONS,$1,NULL,2); }
| VariableDeclarations COMMA VariableDeclaration { list_append( $$ = $1, list_new(VARIABLE_DECLARATIONS,$3,NULL,2)); }
;
VariableDeclaration
: NameID {$$ = varDec_new($1,NULL,NULL,NULL,NULL,5);}
| NameID ASSIGNOP ExpressionStatement {$$ = varDec_new($1,$3,NULL,NULL,NULL,5);}
| NameID LSBRACKET RSBRACKET {$$ = varDec_new($1,NULL,NULL,NULL,NULL,5); }
| LSBRACKET RSBRACKET NameID {$$ = varDec_new($3,NULL,NULL,NULL,NULL,5); }
| NameID LSBRACKET RSBRACKET ASSIGNOP NEW Type LSBRACKET Dimension RSBRACKET {$$ = varDec_new($1,NULL,$6,$8,NULL,5); }
| LSBRACKET RSBRACKET NameID ASSIGNOP NEW Type LSBRACKET Dimension RSBRACKET {$$ = varDec_new($3,NULL,$6,$8,NULL,5); }
| NameID LSBRACKET RSBRACKET ASSIGNOP LBRACE VariableInitializers RBRACE {$$ = varDec_new($1,NULL,NULL,NULL,$6,5); }
| LSBRACKET RSBRACKET NameID ASSIGNOP LBRACE VariableInitializers RBRACE {$$ = varDec_new($3,NULL,NULL,NULL,$6,5); }
| NameID LSBRACKET RSBRACKET ASSIGNOP LBRACE RBRACE {$$ = varDec_new($1,NULL,NULL,NULL,NULL,5); }
| LSBRACKET RSBRACKET NameID ASSIGNOP LBRACE RBRACE {$$ = varDec_new($3,NULL,NULL,NULL,NULL,5); }
;
Type
: INT {$$ = typeId_new($1,1);}
| CHAR {$$ = typeId_new($1,1);}
| FLOAT {$$ = typeId_new($1,1);}
| DOUBLE {$$ = typeId_new($1,1);}
;
NameID
: ID {$$ = nameID_new($1, 1);}
;
The general structure of an AST node contains:
the type of each node,
a union containing the different structures of each possible node,
an integer (numLeaf) representing the maximum possible number of leaves of each parent node (passed from bison in the semantic actions as the last parameter of the functions),
an array of pointers (LeafVet) whose size is that number of leaves, where each slot points to a possible child (NULL if the child is not present).
These last two fields drive the traversal of the tree: I loop over the array to descend into the children of each node.
I think the problem mainly concerns the list structures (ClassDeclarations, FieldDeclarations, VariableDeclarations...).
The structure of each list is as follows; it is part of the union of possible structures of each node.
STRUCT LIST:
struct {
int type;
struct ast_node *head; //pointer to the head of the list
struct ast_node *tail; //pointer to the tail of the list
} list;
The functions that refer to the creation of list nodes are the following:
static ast_node *newast(int type)
{
ast_node *node = malloc(sizeof(ast_node));
node->type = type;
return node;
}
ast_list *list_new(int type, ast_node *head, ast_list *tail, int numLeaf)
{
ast_list *l = newast(AST_LIST); //allocates memory for the AST_LIST type node
l->list.type = type;
l->list.head = head;
l->list.tail = tail;
l->numLeaf = numLeaf;
l->LeafVet[0] = head;
l->LeafVet[1] = tail;
return l;
}
void list_append(ast_list *first, ast_list *second)
{
while (first && first->list.tail)
{
first = first->list.tail;
}
if (first)
{
first->list.tail = second;
}
first->numLeaf = 2;
}
I think the error could be in the list_append function, because when I traverse the tree in pre-order it manages to enter the first leaf node of each list but does not proceed to the remaining leaf nodes. Specifically, for the input file above, the traversal stops after reaching the NameID node of VariableDeclaration (to be precise, at the first variable of the first class) without giving any error. Immediately afterwards it should visit the second leaf node of FieldDeclarations, since there is a second variable declaration (VariableFieldDeclaration), but when I print the number of non-NULL leaves of each list I always get 1, so it seems the list appends are not working properly.
The error could also be in the traversal algorithm, shown below:
void print_ast(ast_node *node) //ast preorder
{
int leaf;
leaf = node->numLeaf;
printf("Num leaf: %d \n",leaf);
switch(node->type)
{
case AST_LIST:
break;
case AST_PROGRAM:
break;
case AST_IMPORT:
printf("Import: %s \n", node->import.namelib);
break;
case AST_CLASSDEC:
printf("name class: %s\n", node->classDec.nameClass->nameID.name);
break;
case AST_TYPEID:
break;
case AST_VARFIELDDEC:
break;
case AST_VARDEC:
break;
case AST_FIELDDEC:
break;
case AST_NAMEID:
printf("Il valore della variabile e': %s \n", node->nameID.name);
break;
default:
printf("Error in node selection!\n");
exit(1);
}
for (int i=0; i<leaf; i++)
{
if(node->LeafVet[i] == NULL ){
continue;
} else{
printf("%d \n", node->LeafVet[i]->type);
print_ast(node->LeafVet[i]);
}
}
}
I hope you can help me, thanks a lot.
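I cannot run the full project, but the described symptom is consistent with one hypothesis: list_append updates list.tail of the last cell but never its LeafVet[1], and since print_ast recurses over LeafVet (not over tail), appended cells are invisible to the walk even though the tail chain is linked correctly. A minimal, self-contained sketch of that hypothesis, with the struct and names simplified from the original:

```c
#include <stdlib.h>

/* Simplified list cell: only the fields the pre-order walk actually uses.
 * In the original AST each node also carries a union of node kinds. */
typedef struct cell {
    struct cell *head;        /* payload (first child) */
    struct cell *tail;        /* next cell in the list */
    int numLeaf;
    struct cell *LeafVet[2];  /* the traversal loops over this array */
} cell;

cell *cell_new(cell *head, cell *tail) {
    cell *l = calloc(1, sizeof *l);
    l->head = head;
    l->tail = tail;
    l->numLeaf = 2;
    l->LeafVet[0] = head;
    l->LeafVet[1] = tail;
    return l;
}

/* Append that keeps LeafVet[1] in sync with tail. Without the last
 * assignment, a walk that follows LeafVet stops at the first cell
 * even though the tail chain is correct. */
void cell_append(cell *first, cell *second) {
    while (first->tail)
        first = first->tail;
    first->tail = second;
    first->LeafVet[1] = second;  /* the step missing in the original */
}

/* Count cells reachable the same way print_ast recurses: via LeafVet. */
int cell_count(const cell *l) {
    int n = 0;
    for (; l; l = l->LeafVet[1])
        n++;
    return n;
}
```

With the extra assignment in place, a list built with two appends is fully reachable through LeafVet, which is what the pre-order traversal relies on.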
I'm currently stuck on my project, a Fuseki triple store browser. I need to visualize all the data from a triple store and make the app browsable. The only problem is that QuerySolution leaves out the "< >" that are present in the triple store.
If I use the ResultSetFormatter.asText(ResultSet) it returns this:
-------------------------------------------------------------------------------------------------------------------------------------
| subject | predicate | object |
=====================================================================================================================================
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> | <urn:animals:lion> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> | <urn:animals:tarantula> |
| <urn:animals:data> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3> | <urn:animals:hippopotamus> |
-------------------------------------------------------------------------------------------------------------------------------------
Notice that some of the values contain the less-than/greater-than signs "<" and ">". As soon as I try to parse the data from the ResultSet, it removes those signs, so the data looks like this:
-------------------------------------------------------------------------------------------------------------------------------
| subject | predicate | object |
===============================================================================================================================
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq |
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#_1 | urn:animals:lion |
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#_2 | urn:animals:tarantula |
| urn:animals:data | http://www.w3.org/1999/02/22-rdf-syntax-ns#_3 | urn:animals:hippopotamus |
As you can see, the data no longer contains the "<" and ">" signs.
This is how I parse the data from the ResultSet:
while (rs.hasNext()) {
// Moves onto the next result
QuerySolution sol = rs.next();
// Return the value of the named variable in this binding.
// A return of null indicates that the variable is not present in
// this solution
RDFNode object = sol.get("object");
RDFNode predicate = sol.get("predicate");
RDFNode subject = sol.get("subject");
// Fill the table with the data
DefaultTableModel modelTable = (DefaultTableModel) this.getModel();
modelTable.addRow(new Object[] { subject, predicate, object });
}
It's quite hard to explain this problem, but is there a way to keep the "< >" signs after parsing the data?
The '<>' are used by the formatter to indicate that the value is a URI rather than a string: so "http://example.com/" is a literal text value, whereas <http://example.com/> is a URI.
You can do the same yourself:
RDFNode node; // subject, predicate, or object
if (node.isURIResource()) {
return "<" + node.asResource().getURI() + ">";
} else {
...
}
But it's much easier to use FmtUtils:
String nodeAsString = FmtUtils.stringForRDFNode(subject); // or predicate, or object
What you need to do is get that code invoked when the table cell is rendered: currently the table is using Object::toString().
In outline, the steps needed are:
modelTable.setDefaultRenderer(RDFNode.class, new MyRDFNodeRenderer());
Then see http://docs.oracle.com/javase/tutorial/uiswing/components/table.html#renderer about how to create a simple renderer. Note that value will be an RDFNode:
static class MyRDFNodeRenderer extends DefaultTableCellRenderer {
public MyRDFNodeRenderer() { super(); }
public void setValue(Object value) {
setText((value == null) ? "" : FmtUtils.stringForRDFNode((RDFNode) value));
}
}
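One caveat worth adding (this is an assumption about Swing's renderer lookup, not something in the original code): JTable picks the default renderer based on the model's getColumnClass, and DefaultTableModel reports Object.class for every column, so a renderer registered for RDFNode.class would never be consulted unless the model reports the real column class. A sketch of such a model (the class name is my own):

```java
import javax.swing.table.DefaultTableModel;

// DefaultTableModel.getColumnClass always returns Object.class, so a
// renderer registered with setDefaultRenderer(RDFNode.class, ...) would
// never be picked up. Reporting the real column class fixes the lookup.
class TypedTableModel extends DefaultTableModel {
    private final Class<?> columnClass;

    TypedTableModel(Object[] columnNames, Class<?> columnClass) {
        super(columnNames, 0);
        this.columnClass = columnClass;
    }

    @Override
    public Class<?> getColumnClass(int columnIndex) {
        return columnClass;
    }
}
```

Constructing the table with this model (passing RDFNode.class) makes the setDefaultRenderer call above take effect.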
I am writing my DSL's model inferrer, which extends AbstractModelInferrer. Until now I have successfully generated classes for some grammar constructs; however, when I try to generate an interface, type inference does not work and I get the following exception:
0 [Worker-2] ERROR org.eclipse.xtext.builder.BuilderParticipant - Error during compilation of 'platform:/resource/pascani/src/org/example/namespaces/SLA.pascani'.
java.lang.IllegalStateException: equivalent could not be computed
The Model inferrer code is:
def dispatch void infer(Namespace namespace, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(processNamespace(namespace, isPreIndexingPhase))
}
def JvmGenericType processNamespace(Namespace namespace, boolean isPreIndexingPhase) {
namespace.toInterface(namespace.fullyQualifiedName.toString) [
if (!isPreIndexingPhase) {
documentation = namespace.documentation
for (e : namespace.expressions) {
switch (e) {
Namespace: {
members +=
e.toMethod("get" + Strings.toFirstUpper(e.name), typeRef(e.fullyQualifiedName.toString)) [
abstract = true
]
members += processNamespace(e, isPreIndexingPhase);
}
XVariableDeclaration: {
members += processNamespaceVarDecl(e)
}
}
}
}
]
}
def processNamespaceVarDecl(XVariableDeclaration decl) {
val EList<JvmMember> members = new BasicEList();
val field = decl.toField(decl.name, inferredType(decl.right))[initializer = decl.right]
// members += field
members += decl.toMethod("get" + Strings.toFirstUpper(decl.name), field.type) [
abstract = true
]
if (decl.isWriteable) {
members += decl.toMethod("set" + Strings.toFirstUpper(decl.name), typeRef(Void.TYPE)) [
parameters += decl.toParameter(decl.name, field.type)
abstract = true
]
}
return members
}
I have tried using the lazy initializer after the acceptor.accept method, but it still does not work.
When I uncomment the line members += field, which adds a field to an interface, the model inferrer works fine; however, as you know, interfaces cannot have fields.
This seems like a bug to me. I have read tons of posts in the Eclipse forum but nothing seems to solve my problem. In case it is needed, this is my grammar:
grammar org.pascani.Pascani with org.eclipse.xtext.xbase.Xbase
import "http://www.eclipse.org/xtext/common/JavaVMTypes" as types
import "http://www.eclipse.org/xtext/xbase/Xbase"
generate pascani "http://www.pascani.org/Pascani"
Model
: ('package' name = QualifiedName ->';'?)?
imports = XImportSection?
typeDeclaration = TypeDeclaration?
;
TypeDeclaration
: MonitorDeclaration
| NamespaceDeclaration
;
MonitorDeclaration returns Monitor
: 'monitor' name = ValidID
('using' usings += [Namespace | ValidID] (',' usings += [Namespace | ValidID])*)?
body = '{' expressions += InternalMonitorDeclaration* '}'
;
NamespaceDeclaration returns Namespace
: 'namespace' name = ValidID body = '{' expressions += InternalNamespaceDeclaration* '}'
;
InternalMonitorDeclaration returns XExpression
: XVariableDeclaration
| EventDeclaration
| HandlerDeclaration
;
InternalNamespaceDeclaration returns XExpression
: XVariableDeclaration
| NamespaceDeclaration
;
HandlerDeclaration
: 'handler' name = ValidID '(' param = FullJvmFormalParameter ')' body = XBlockExpression
;
EventDeclaration returns Event
: 'event' name = ValidID 'raised' (periodically ?= 'periodically')? 'on'? emitter = EventEmitter ->';'?
;
EventEmitter
: eventType = EventType 'of' emitter = QualifiedName (=> specifier = RelationalEventSpecifier)? ('using' probe = ValidID)?
| cronExpression = CronExpression
;
enum EventType
: invoke
| return
| change
| exception
;
RelationalEventSpecifier returns EventSpecifier
: EventSpecifier ({RelationalEventSpecifier.left = current} operator = RelationalOperator right = EventSpecifier)*
;
enum RelationalOperator
: and
| or
;
EventSpecifier
: (below ?= 'below' | above ?= 'above' | equal ?= 'equal' 'to') value = EventSpecifierValue
| '(' RelationalEventSpecifier ')'
;
EventSpecifierValue
: value = Number (percentage ?= '%')?
| variable = QualifiedName
;
CronExpression
: seconds = CronElement // 0-59
minutes = CronElement // 0-59
hours = CronElement // 0-23
days = CronElement // 1-31
months = CronElement // 1-2 or Jan-Dec
daysOfWeek = CronElement // 0-6 or Sun-Sat
| constant = CronConstant
;
enum CronConstant
: reboot // Run at startup
| yearly // 0 0 0 1 1 *
| annually // Equal to #yearly
| monthly // 0 0 0 1 * *
| weekly // 0 0 0 * * 0
| daily // 0 0 0 * * *
| hourly // 0 0 * * * *
| minutely // 0 * * * * *
| secondly // * * * * * *
;
CronElement
: RangeCronElement | PeriodicCronElement
;
RangeCronElement hidden()
: TerminalCronElement ({RangeCronElement.start = current} '-' end = TerminalCronElement)?
;
TerminalCronElement
: expression = (IntLiteral | ValidID | '*' | '?')
;
PeriodicCronElement hidden()
: expression = TerminalCronElement '/' elements = RangeCronList
;
RangeCronList hidden()
: elements += RangeCronElement (',' elements +=RangeCronElement)*
;
IntLiteral
: INT
;
UPDATE
Using a field was just a way to keep working on other things until I find a solution. The actual code is:
def processNamespaceVarDecl(XVariableDeclaration decl) {
val EList<JvmMember> members = new BasicEList();
val type = if (decl.right != null) inferredType(decl.right) else decl.type
members += decl.toMethod("get" + Strings.toFirstUpper(decl.name), type) [
abstract = true
]
if (decl.isWriteable) {
members += decl.toMethod("set" + Strings.toFirstUpper(decl.name), typeRef(Void.TYPE)) [
parameters += decl.toParameter(decl.name, type)
abstract = true
]
}
return members
}
From the answer in the Eclipse forum:
I don't know if what you are doing is a good idea. The inferrer maps
your concepts to Java concepts, and this enables scoping for the
expressions. If you do not have a place for your expressions, it
won't work; their types will never be computed.
Thus I think you have a use case that is not possible with Xbase
without customizations. Your semantics are not quite clear to me.
Christian Dietrich
My answer:
Thanks Christian, I thought I was doing something wrong. If this is not a common use case, then there is no problem; I will make sure the user explicitly defines a variable type.
Just to clarify a little bit, a Namespace is intended to define variables that are used in Monitors. That's why a Namespace becomes an interface, and a Monitor becomes a class.
Read the Eclipse forum thread