Parsing text in parens using JParsec

I'm writing a parser for a DSL that uses the syntax (nodeHead: nodeBody). The problem is that nodeBody may contain parens in some cases.
The between operator of JParsec should have been a good solution, yet the following code fails:
public void testSample() {
Parser<Pair<String,String>> sut = Parsers.tuple(Scanners.IDENTIFIER.followedBy(Scanners.among(":")),
Scanners.ANY_CHAR.many().source()
).between(Scanners.among("("), Scanners.among(")"));
sut.parse("(hello:world)");
}
It does not fail when I change ANY_CHAR to IDENTIFIER, so I assume the issue here is that the second parser in the tuple is too greedy. Alternatively, can I make JParsec apply the between parsers before it applies the body?
Any ideas are very much appreciated.

At the time I asked, there seemed to be no way to do that. However, one GitHub fork-and-pull later, there is: reluctantBetween().
Big thanks to @abailly for the fast response.

If the syntax rule is that the last character will always be ")", you could probably do:
static <T> Parser<T> reluctantBetween(
Parser<?> begin, Parser<T> parser, Parser<?> end) {
Parser<?> terminator = end.followedBy(eof());
return between(begin, terminator.not().next(parser).many(), terminator);
}
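To illustrate what the terminator idea above amounts to, here is a minimal plain-Java sketch that does not use JParsec at all. The class name NodeParser and its parse method are made up for this illustration. It assumes, as the answer does, that the closing delimiter is the ")" directly followed by end of input, so any parens inside the body are kept as part of the body. The head is not validated as an identifier here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper, not part of JParsec: parses "(head:body)" where the
// body may itself contain parentheses.
class NodeParser {

    static Map.Entry<String, String> parse(String input) {
        // Only the ')' that is directly followed by end of input closes the node.
        if (input.length() < 2 || input.charAt(0) != '(' || !input.endsWith(")")) {
            throw new IllegalArgumentException("expected (head:body): " + input);
        }
        String inner = input.substring(1, input.length() - 1);
        // The first ':' separates head from body; later colons belong to the body.
        int colon = inner.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("missing node head: " + input);
        }
        String head = inner.substring(0, colon);
        String body = inner.substring(colon + 1);
        return new SimpleEntry<>(head, body);
    }

    public static void main(String[] args) {
        System.out.println(parse("(hello:world)"));
        System.out.println(parse("(hello:wor(l)d)"));
    }
}
```

A greedy body parser fails in the original test because it also consumes the final ")", leaving nothing for the closing delimiter to match. Fixing the terminator to "end delimiter at end of input" is what avoids that.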

Related

How to define ParameterType with code-dependent RegEx in Cucumber without TypeRegistry?

In Cucumber 7.4.1+ TypeRegistry is deprecated in favour of annotations.
Indeed, as of today, I have never used anything but @ParameterType to define my ParameterTypes. Searching for alternatives, TypeRegistry is the only one I have found - but if it is "on the way out", of course I'd rather not start using it now.
Given a construct like this I cannot use annotations because those cannot take static parameters:
enum SpecialDate implements Supplier<Date> {
TODAY { @Override public Date get() { return new Date(); } },
// YESTERDAY, etc.
;
static String typeSafeRegEx() {
return Arrays.stream(SpecialDate.values())
.map(SpecialDate::specName)
.collect(Collectors.joining("|"));
}
static SpecialDate from(final String specName) {
return valueOf(StringUtils.upperCase(specName));
}
String specName() {
return StringUtils.capitalize(StringUtils.lowerCase(this.name()));
}
}
public class ParameterTypes {
// does not compile: "Attribute value must be constant"
@ParameterType("(" + SpecialDate.typeSafeRegEx() + ")")
public Date specialDate(final String specName) {
return SpecialDate.from(specName).get();
}
}
A so-specified regEx is nice, because it will only match values guaranteed to be mappable, so I need no additional error handling code beyond Cucumber's own. The list of allowed values is also maintenance-free (compared to a "classic" switch which would silently grow incorrect when adding new values).
The alternative would be to use an unsafe switch + default: throw, strictly worse because it has to be maintained manually.
(Or, I guess, to just valueOf whatever + wrap into a more specific exception, when it eventually fails.)
To me, Cucumber's native UndefinedStepException appears to be the best outcome on a mismatch, because everyone familiar with Cucumber will immediately recognise it, unlike a project-specific one.
I see that e.g. the ParameterType class is not deprecated, but I cannot find any information on how to use it without TypeRegistry.
FWIW:
Updating the libraries or Java would not be an issue. (But downgrading is sadly not viable.)
Business Specialists want to write examples like [Today], [Today + 1], [Yesterday - 3], etc. If this can be realised more elegantly using a different approach, X/Y answers would also be welcome.
An example step looks like this:
And I enter into the field 'Start of Implementation' the <begin>
Examples:
| begin |
| [Today] |
| [Today + 1] |
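As a rough sketch of the matching side only, the enum-derived alternation can be compiled at runtime with plain java.util.regex and extended with an optional day offset. This does not show how to register such a pattern with Cucumber, which is the open part of the question. The class name SpecialDateExpression, the YESTERDAY and TOMORROW constants, the dayOffset field, and the use of java.time.LocalDate instead of Date are all assumptions made for the illustration.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: resolves "[Today]", "[Today + 1]", "[Yesterday - 3]".
class SpecialDateExpression {

    enum SpecialDate {
        TODAY(0), YESTERDAY(-1), TOMORROW(1);

        final int dayOffset;

        SpecialDate(int dayOffset) { this.dayOffset = dayOffset; }

        // "TODAY" -> "Today"
        String specName() {
            String lower = name().toLowerCase(Locale.ROOT);
            return Character.toUpperCase(lower.charAt(0)) + lower.substring(1);
        }
    }

    // Alternation built from the enum, so it cannot drift out of sync with it.
    static final String NAMES = Arrays.stream(SpecialDate.values())
            .map(SpecialDate::specName)
            .collect(Collectors.joining("|"));

    // Group 1: name, group 2: optional sign, group 3: optional number of days.
    static final Pattern EXPRESSION =
            Pattern.compile("\\[(" + NAMES + ")(?:\\s*([+-])\\s*(\\d+))?\\]");

    // "today" is passed in so that the result is deterministic in tests.
    static LocalDate resolve(String text, LocalDate today) {
        Matcher m = EXPRESSION.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a special date: " + text);
        }
        SpecialDate special = SpecialDate.valueOf(m.group(1).toUpperCase(Locale.ROOT));
        LocalDate base = today.plusDays(special.dayOffset);
        if (m.group(2) == null) {
            return base; // no "+ n" or "- n" part
        }
        long days = Long.parseLong(m.group(3));
        return m.group(2).equals("+") ? base.plusDays(days) : base.minusDays(days);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2024, 1, 10);
        System.out.println(resolve("[Today + 1]", today));
        System.out.println(resolve("[Yesterday - 3]", today));
    }
}
```

Because the name group only accepts values produced from the enum, the valueOf call cannot fail for any text the pattern matched.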

JavaCC creating custom Token class

I'm working on a school assignment for my compilers and interpreters course, and our current task is to create a scanner and a set of tokens using JavaCC. I have a pretty solid understanding of how JavaCC works, but my problem is finding resources online to help me out when I get stuck. I am working on creating a custom Token class; let's call it NewToken.java. I know that the base Token class has an image variable and a kind variable, but I want to add my own variable, "value", and figure out how to assign it. I want the value variable to hold the literal value of what I scan. For example, my NewToken is being matched to the following:
< IDENTIFIER:(< LETTER >)+ ( < LETTER > | < DIGIT >)* >
< #LETTER:["a" - "z"] >
< #DIGIT: ["0" - "9"] >
So something along the lines of Name123Name would get caught, and when it does I want to store the string "Name123Name" in the 'value' variable of my NewToken object. I hope this makes sense; I am still new to JavaCC and may be calling things by their wrong names here.
public NewToken(){}
public NewToken(int kind){
this(kind,null);
}
public NewToken(int kind, String image){
this.kind=kind;
this.image=image;
this.value=image;
}
public String toString(){
return image;
}
public static Token newToken(int ofKind, String image){
switch(ofKind){
default : return new Token(ofKind, image);
}
}
public static Token newToken(int ofKind){
return newToken(ofKind, null);
}
}
Above is part of my code for the NewToken class; it extends Token and implements java.io.Serializable. I created it from the code generated for Token.java. I also have my variable declarations and my getValue() function, which are not listed here to save space. I'm not looking for anyone to do my work for me; I just need some guidance on how to get this working. Thank you in advance.

First off, I think the newToken routine should return objects of type NewToken rather than Token.
public static Token newToken(int ofKind, String image){
return new NewToken(ofKind, image);
}
public static Token newToken(int ofKind){
return new NewToken(ofKind, null);
}
(I don’t think you need that second method. But, I’m not completely sure, so I’ll leave it.)
It’s a bit unclear to me how you want value to differ from image, but I’m going to assume that you can compute the desired value for value from the image and the kind. And I’ll further assume that you have implemented this function as a static method.
private static String computeValue(int kind, String image) {...}
Delete the first two constructors and the remaining one should be:
private NewToken(int kind, String image){
this.kind = kind;
this.image = image;
this.value = computeValue( kind, image );
}
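Put together, the pieces above could look roughly like the following sketch. The Token class here is only a minimal stand-in for the JavaCC-generated one so that the example compiles on its own, the IDENTIFIER constant is a made-up kind value, and computeValue simply returns the image because the question does not say how value should differ from image.

```java
// Minimal stand-in for the JavaCC-generated Token class (assumption: only
// the two fields used here; the real generated class has more members).
class Token {
    public int kind;
    public String image;
}

class NewToken extends Token {
    // Hypothetical token kind; real values come from the generated constants.
    static final int IDENTIFIER = 1;

    private final String value;

    // Private: instances are created through the newToken factory methods.
    private NewToken(int kind, String image) {
        this.kind = kind;
        this.image = image;
        this.value = computeValue(kind, image);
    }

    // Assumed rule for illustration: the value is simply the matched text.
    private static String computeValue(int kind, String image) {
        return image;
    }

    public String getValue() {
        return value;
    }

    // Factory methods as in the answer above, returning NewToken instances.
    public static Token newToken(int ofKind, String image) {
        return new NewToken(ofKind, image);
    }

    public static Token newToken(int ofKind) {
        return newToken(ofKind, null);
    }

    public static void main(String[] args) {
        NewToken t = (NewToken) newToken(IDENTIFIER, "Name123Name");
        System.out.println(t.getValue());
    }
}
```

Keeping the constructor private means every token goes through the factory, so value is always derived from kind and image in one place.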

The answer that Professor Norvell is giving you is based on using a very old, obsolete version of JavaCC. The way he's suggesting you go about things is probably about the best way of doing it, if you were going to use the legacy JavaCC.
However, the most advanced version of JavaCC is JavaCC 21 and it handles this sort of use case straight out of the box in a very clean, elegant manner. See here for information on this specifically.
As you can see, you can put annotations in your grammar file that cause the various Token subclasses to be generated and used.
Also, JavaCC 21 has code injection that allows you to inject code directly into any generated file, including the Token subclasses. That feature is not present in legacy JavaCC at all. Using it, you could inject your computeValue method right into the appropriate Token subclass:
INJECT NewToken :
{
private static String computeValue(int kind, String image) {...}
}
You put that in your grammar and the computeValue method just gets inserted into the generated NewToken.java file.
By the way, there is an article about JavaCC 21 that appeared recently on dzone.com.

How to reference attribute from .bnf parser in JFlex?

I'm using a .bnf parser to detect specific expressions and I'm using JFlex to detect the different sections of these expressions. My issue is that some of these expressions may contain nested expressions, and I don't know how to handle that.
I've tried to include the .bnf parser in my JFlex by using %include, then referencing the expression in the relative macro using PARAMETERS = ("'"[:jletter:] [:jletterdigit:]*"'") | expression. This fails as JFlex reports the .bnf to be malformed.
Snippet of JFlex:
%{
public Lexer() {
this((java.io.Reader)null);
}
%}
%public
%class Lexer
%implements FlexLexer
%function advance
%type IElementType
%include filename.bnf
%unicode
PARAMETERS= ("'"[:jletter:] [:jletterdigit:]*"'") | <a new expression element>
%%
<YYINITIAL> {PARAMETERS} {return BAD_CHARACTER;} // some random return
Snippet of .bnf parser:
{
//list of classes used.
}
expression ::= (<expression definition>)
Any input would be greatly appreciated. Thanks.

I've found the solution to my issue. In further depth, the problem was in both my grammar file and my flex file. To solve the issue, I recursively called the expression in the grammar file like so:
expression = (start value expression? end)
With JFlex, I declared numerous states until I found a way to chain together an endless number of expressions. It looks a little like this:
%state WAITING_EXPRESSION
<WAITING_NEXT> "<something which indicates start of nested expression>" { yybegin(WAITING_EXPRESSION); return EXPRESSION_START; }
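The recursive grammar rule is what makes unlimited nesting possible. A minimal plain-Java recursive-descent sketch of the same shape is shown below. This is neither Grammar-Kit nor JFlex code: the class name NestedExpression is made up, and '(' and ')' are hypothetical stand-ins for the start and end tokens, which the answer does not specify.

```java
// Hypothetical sketch of: expression ::= start value expression? end
// with '(' as start, ')' as end, and a run of letters or digits as value.
class NestedExpression {
    private final String input;
    private int pos = 0;

    private NestedExpression(String input) {
        this.input = input;
    }

    // Returns the nesting depth of a well-formed expression.
    static int depth(String input) {
        NestedExpression parser = new NestedExpression(input);
        int depth = parser.expression();
        if (parser.pos != input.length()) {
            throw new IllegalArgumentException("trailing input at " + parser.pos);
        }
        return depth;
    }

    private int expression() {
        expect('(');
        value();
        int inner = 0;
        // The optional nested expression: recurse only if a start token follows.
        if (pos < input.length() && input.charAt(pos) == '(') {
            inner = expression();
        }
        expect(')');
        return 1 + inner;
    }

    private void value() {
        int start = pos;
        while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) {
            pos++;
        }
        if (pos == start) {
            throw new IllegalArgumentException("value expected at " + pos);
        }
    }

    private void expect(char c) {
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new IllegalArgumentException("'" + c + "' expected at " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(depth("(a(b(c)))"));
    }
}
```

The lexer only has to emit the start and end tokens; counting or matching the nesting levels is left to the recursive rule in the parser.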

java regular expression partial replace

I am working with some legacy code that has a static method call which we need to remove from our source tree.
The existing code is as follows:
Logger.getInstance(JdkUtil.forceInit(SomeBusiness.class));
What we need to end up with is:
Logger.getInstance(SomeBusiness.class);
I've spent all day today trying to figure out how to do that replacement. Since I have very little experience with regular expressions, I have only been able to come up with a pattern that matches the source string.
The pattern JdkUtil.forceInit([a-zA-Z_0-9]*.class) finds matches on the input string I am providing. I've tested this at https://www.freeformatter.com/java-regex-tester.html
So if anyone can post a Java solution to this, I would really appreciate it.
Below is the Groovy code that I have so far. What I am missing is how to correctly do the replacement explained above.
String source = 'Logger.getInstance(JdkUtil.forceInit(RtpRuleEngineCompiledImpl.class))'
String regexpPattern = 'JdkUtil.forceInit\\([a-zA-Z_0-9\\)]*.class\\)'
String replaced = source.replaceFirst(regexpPattern, 'hello')
println replaced
When I run the above code I get the following output:
Logger.getInstance(hello)
Obviously 'hello' is just for testing.
Thanks in advance to anyone who can give me some suggestions.

You'll likely want to do something such as:
class StackOverflow {
public static void main(String[] args) {
String source = "Logger.getInstance(JdkUtil.forceInit(RtpRuleEngineCompiledImpl.class))";
String regexpPattern = "JdkUtil\\.forceInit\\(([a-zA-Z_0-9]*\\.class)\\)";
String replaced = source.replaceFirst(regexpPattern, "$1");
System.out.println(replaced);
}
}
Result:
Logger.getInstance(RtpRuleEngineCompiledImpl.class)
The replacement string $1 refers to the first capture group, so the whole match is replaced by just the text captured inside the inner parentheses (the class literal).
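If a source file contains several such calls, the same idea can be applied to all of them at once. The following is a small sketch under a few assumptions: the class name ForceInitRemover and the strip method are made up, the dots are escaped so that they match only a literal '.', and the class literal may be package-qualified.

```java
import java.util.regex.Pattern;

// Hypothetical helper that unwraps every JdkUtil.forceInit(X.class) call.
class ForceInitRemover {
    // Group 1 captures the class literal, e.g. "SomeBusiness.class".
    // Optional whitespace inside the parentheses is tolerated.
    private static final Pattern FORCE_INIT =
            Pattern.compile("JdkUtil\\.forceInit\\(\\s*([\\w.$]+\\.class)\\s*\\)");

    static String strip(String source) {
        // "$1" in the replacement inserts the text captured by group 1.
        return FORCE_INIT.matcher(source).replaceAll("$1");
    }

    public static void main(String[] args) {
        String source = "Logger.getInstance(JdkUtil.forceInit(SomeBusiness.class));";
        System.out.println(strip(source));
        // prints: Logger.getInstance(SomeBusiness.class);
    }
}
```

Compiling the pattern once and reusing it is worthwhile when the replacement runs over many files.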

Syntax error: insert "enum Identifier", insert "EnumBody", insert "}"

I coded an enum type which brings up the following Syntax errors when I run my created JUnit test for it:
java.lang.Error: Unresolved compilation problems:
Syntax error, insert "enum Identifier" to complete EnumHeaderName
Syntax error, insert "EnumBody" to complete EnumDeclaration
Syntax error, insert "}" to complete ClassBody
My enum type has static functions which for a particular String, returns an enum constant. Here is some of my code of the enum type:
public enum MusicType {
ACCIDENTAL, LETTER, OCTAVE, REST, DUR, CHORD, TUPLET;
public static MusicType is_accidental(String a){
if (a=="^" | a=="_"|a=="=")
return ACCIDENTAL;
else return null;
}
}
The rest of my static functions are very similar (i.e. is_letter, is_octave, etc.), although some use the input.matches(regex) function instead of checking whether the input equals a particular string.
Here is the beginning of the JUnit test which tests the function dealing with the accidental constant:
public class MusicTypeTest {
@Test
public void accidentalTest(){
String sharp = "^";
String flat = "_";
String natural = "=";
assertEquals(MusicType.ACCIDENTAL, MusicType.is_accidental(sharp));
assertEquals(MusicType.ACCIDENTAL, MusicType.is_accidental(flat));
assertEquals(MusicType.ACCIDENTAL, MusicType.is_accidental(natural));
}
}
The other functions in my JUnit test which test all the enum static functions are coded similarly. I cannot figure out why I have these syntax errors (this is my first time coding an enum type). I've been coding in Eclipse and have not found any missing "}"s as of yet. I don't know if this has anything to do with the way I've written the test or the way I've declared my variables. Does anyone know why I have these syntax errors?

I had this same problem with Eclipse. It was a misleading syntax error message. It was due to a misplaced ";" after an annotation.
Double check your code ignoring the message.

I was getting this error while writing an Android app. All my brackets were closed; I was following an example from a different site. I ended up selecting the entire text for my code, cutting, saving, and pasting the code back. The error went away. It's very possible that Eclipse got stuck...

Both the enum type and the class that you have just posted have two opening braces ({) and only one closing brace (}). If I had to guess, I'd say you need to put one more closing brace at the end of each of these files.
