In a CSV file I have something like this:
term
testing
I want to split testing into characters. I want something like this:
.feed(Feeders.search)
.foreach("${term}".toList, "search") {
  exec(http("Auto Complete")
    .get("${baseUrlHttps}/search/autocomplete")
    .queryParam("term", "${search}")
    .check(status is 200)
    .check(jsonPath("$..products[0].code").optional.saveAs("code"))).pause(MIN_PAUSE, MAX_PAUSE)
}
The above code is not working as I wanted: it splits the literal string "${term}" into characters, whereas I wanted to split the word "testing" from the CSV into characters. Is there any workaround for this?
That's not how autocomplete works. You're not posting character by character; you're reposting with one more character each time. E.g., you'll be posting "test", then "testi", then "testin" and finally "testing" (there's usually a minimum length).
exec { session =>
  val term = session("term").as[String]
  // every prefix of the term, starting at a minimum length of 3
  val parts = for (i <- 3 to term.size) yield term.substring(0, i)
  session.set("parts", parts)
}
.foreach("${parts}", "search") {
  exec(http("Auto Complete")
    .get("${baseUrlHttps}/search/autocomplete")
    .queryParam("term", "${search}")
    .check(status is 200)
    .check(jsonPath("$..products[0].code").optional.saveAs("code"))).pause(MIN_PAUSE, MAX_PAUSE)
}
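The prefix logic on its own, outside Gatling, can be sketched in plain Java. The class name AutocompletePrefixes and the minLength parameter are made up for this illustration; the minimum of 3 simply mirrors the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

final class AutocompletePrefixes {
    // Returns every prefix of `term`, from `minLength` characters up to the full term.
    // Hypothetical helper: assumes minLength >= 0.
    static List<String> prefixes(String term, int minLength) {
        List<String> parts = new ArrayList<>();
        for (int i = minLength; i <= term.length(); i++) {
            parts.add(term.substring(0, i));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(prefixes("testing", 3));
        // prints [tes, test, testi, testin, testing]
    }
}
```

A term shorter than the minimum length yields an empty list, so no autocomplete request would be sent for it.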
I have sample jsonNode data - Inputstr =
{
  "a.b.c.d.e":"123",
  "a[0].b.c.d[0].e":"123",
  "a[0].b.c.d[1].e":"123",
  "a[1].b.c.d[0].e":"123",
  "a[1].b.c.d[1].e":"123",
  "d.e.f":"789",
  "x.y.z":"789"
}
I want to extract the keys having data in format a[0-9*].b[0-9*].c[0-9*].d[0-9*].e[0-9*].
Basically, the output should return me, 0 or more occurrences
[ a.b.c.d.e , a[0].b.c.d[0].e, a[0].b.c.d[1].e, a[1].b.c.d[0].e, a[1].b.c.d[1].e ].
So, what I did was:
val json = ObjectMapper.readTree(Inputstr)
val itr = json.fieldNames
Now, on this iterator of keys, I want to create a generic regex which returns the above output.
I tried this, but it is not working:
val regex = """a\[[0-9\]]*.b\[[0-9\]]*.c\[[0-9\]]*.d\[[0-9\]]*.e\[[0-9\]]*""".r
while(itr.hasNext())
{
val str= itr.next()
regex.findAllIn(str)
}
I am basically stuck on creating the regex, which should accept [0-9]*: it should check for both brackets [ ] as well as the presence of a digit from 0 to 9 inside them. Even if none exist, it should still return a.b.c.d.e.
I hope it makes sense.
Please let me know if any questions.
a(?:\[\d])?\.b(?:\[\d])?\.c(?:\[\d])?\.d(?:\[\d])?\.e(?:\[\d])?
This should do the job. I wrapped the [0] part in a non-capturing group and made it optional with ?. Note that \d matches exactly one digit, so use \d+ instead if an index can have more than one digit.
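As a rough illustration, here is how that pattern could be used to filter a list of keys in plain Java. The KeyFilter class and its matchingKeys method are invented for this sketch, and the keys are passed in as a plain list rather than read from a Jackson node. It uses matches() so that the whole key has to fit the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class KeyFilter {
    // The pattern from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern KEY = Pattern.compile(
        "a(?:\\[\\d])?\\.b(?:\\[\\d])?\\.c(?:\\[\\d])?\\.d(?:\\[\\d])?\\.e(?:\\[\\d])?");

    // Keeps only the keys that match the pattern from start to end.
    static List<String> matchingKeys(List<String> keys) {
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            if (KEY.matcher(key).matches()) {
                result.add(key);
            }
        }
        return result;
    }
}
```

With the sample keys from the question, the five a...e keys are kept and d.e.f and x.y.z are dropped.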
I would like to split a string by spaces but keep the spaces inside the quotes (and the quotes themselves). The problem is that the quotes can be nested, and I need this to work for both single and double quotes. So, from the line this "'"is a possible option"'" and ""so is this"" and '''this one too''' and even ""mismatched quotes" I would like to get [this, "'"is a possible option"'", and, ""so is this"", and, '''this one too''', and, even, ""mismatched quotes"].
This question has already been asked, but not the exact question that I'm asking. Here are several solutions: one uses a matcher (in this case """x""" would be split into [""", x"""], so this is not what I need) and Apache Commons (which works with """x""" but not with ""x"", since it takes the first two double quotes and leaves the last two with x). There are also suggestions of writing a function to do so manually, but this would be the last resort.
You can achieve that with the following regex: ["']+[^"']+?["']+. Using that pattern you retrieve the indices where you want to split like this:
val indices = Regex(pattern).findAll(this).map{ listOf(it.range.start, it.range.endInclusive) }.flatten().toMutableList()
The rest is building the list out of substrings. Here is the complete function:
fun String.splitByPattern(pattern: String): List<String> {
    // flat list of [start, endInclusive, start, endInclusive, ...] for every match
    val indices = Regex(pattern).findAll(this).map{ listOf(it.range.start, it.range.endInclusive) }.flatten().toMutableList()
    var lastIndex = 0
    return indices.mapIndexed { i, ele ->
        // even entries are match starts (used as an exclusive end);
        // odd entries are inclusive match ends, so add 1
        val end = if(i % 2 == 0) ele else ele + 1
        substring(lastIndex, end).apply {
            lastIndex = end
        }
    }
}
Usage:
val str = """
this "'"is a possible option"'" and ""so is this"" and '''this one too''' and even ""mismatched quotes"
""".trim()
println(str.splitByPattern("""["']+[^"']+?["']+"""))
Output:
[this , "'"is a possible option"'", and , ""so is this"", and , '''this one too''', and even , ""mismatched quotes"]
Try it out on Kotlin's playground!
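A different way to use the same quoted-run pattern is to match tokens instead of computing split indices. The sketch below is in plain Java and is not the approach from the answer above: it adds a \S+ alternative so that unquoted words come out one by one. The class name QuoteAwareSplitter is made up, and the sketch assumes quoted sections start at a word boundary (a word glued to an opening quote, such as abc"x y", would not be handled).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteAwareSplitter {
    // Either: a run of quotes, some non-quote text, and a closing run of quotes;
    // or: a plain run of non-whitespace characters.
    private static final Pattern TOKEN =
        Pattern.compile("[\"']+[^\"']+?[\"']+|\\S+");

    // Collects every token in order; whitespace between tokens is discarded.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }
}
```

On the example line from the question this yields the nine tokens asked for, with no trailing spaces and with "and" and "even" as separate entries.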
I'm writing a pretty-printer for legacy code in an older language. The plan is for me to learn parsing and unparsing before I write a translator to output C++. I kind of got thrown into the deep end with Java and ANTLR back in June, so I definitely have some knowledge gaps.
I've gotten to the point where I'm comfortable writing methods for my custom listener, and I want to be able to pretty-print the comments as well. My comments are on a separate hidden channel. Here are the grammar rules for the hidden tokens:
/* Comments and whitespace -- Nested comments are allowed, each is redirected to a specific channel */
COMMENT_1 : '(*' (COMMENT_1|COMMENT_2|.)*? '*)' -> channel(1) ;
COMMENT_2 : '{' (COMMENT_1|COMMENT_2|.)*? '}' -> channel(1) ;
NEWLINES : [\r\n]+ -> channel(2) ;
WHITESPACE : [ \t]+ -> skip ;
I've been playing with the Cymbol CommentShifter example on p. 207 of The Definitive ANTLR 4 Reference and I'm trying to figure out how to adapt it to my listener methods.
public void exitVarDecl(ParserRuleContext ctx) {
    Token semi = ctx.getStop();
    int i = semi.getTokenIndex();
    List<Token> cmtChannel = tokens.getHiddenTokensToRight(i, CymbolLexer.COMMENTS);
    if (cmtChannel != null) {
        Token cmt = cmtChannel.get(0);
        if (cmt != null) {
            String txt = cmt.getText().substring(2);
            String newCmt = "// " + txt.trim(); // printing comments in original format
            rewriter.insertAfter(ctx.stop, newCmt); // at end of line
            rewriter.replace(cmt, "\n");
        }
    }
}
I adapted this example by using exitEveryRule rather than exitVarDecl, and it worked for the Cymbol example, but when I adapt it to my own listener I get a null pointer exception whether I use exitEveryRule or exitSpecificThing.
I'm looking at this answer and it seems promising but I think what I really need is an explanation of how the hidden channel data is stored and how to access it. It took me months to really get listener methods and context in the parse tree.
It seems like CommonTokenStream.LT(), CommonTokenStream.LA(), and consume() are what I want to be using, but why is the example in that SO answer using completely different methods from the ANTLR book example? What should I know about the token index or token types?
I'd like to better understand the logic behind this.
Okay, so I can't answer how ANTLR stores its data internally, but I can tell you how to access your hidden tokens. I have tested this on my computer using ANTLR v4.1 for C# .NET v4.5.2.
I have a rule that looks like this:
LineComment
    : '//' ~[\r\n]*
      -> channel(1)
    ;
In my code, I am getting the entire raw token stream like this:
IList<IToken> lTokenList = cmnTokenStream.Get( 0, cmnTokenStream.Size );
To test, I printed the token list using the following loop:
foreach ( IToken iToken in lTokenList )
{
    Console.WriteLine( "{0}[{1}] : {2}",
                       iToken.Channel,
                       iToken.TokenIndex,
                       iToken.Text );
}
Running on this code:
void Foo()
{
    // comment
    i = 5;
}
Yields the following output (for the sake of brevity, please assume I have a complete grammar that is also ignoring whitespace):
0[0] : void
0[1] : Foo
0[2] : (
0[3] : )
0[4] : {
1[5] : // comment
0[6] : i
0[7] : =
0[8] : 5
0[9] : ;
0[10] : }
You can see the channel index is 1 only for the single comment token. So you can use this loop to access only the comment tokens:
int lCommentCount = 0;
foreach ( IToken iToken in lTokenList )
{
    if ( iToken.Channel == 1 )
    {
        Console.WriteLine( "{0} : {1}",
                           lCommentCount++,
                           iToken.Text );
    }
}
Then you can do whatever you need with those tokens. This also works if you have multiple channels, though I would caution against using more than 65,536 of them. ANTLR gave the following error when I tried to compile a grammar with a token rule redirected to channel index 65536:
Serialized ATN data element out of range.
So I guess they're only using a 16-bit unsigned integer to index the channels. Weird.
I have a large file (~120 MB) which contains UTF-8 encoded strings, and I need to search for certain words in this file.
The format of the file looks like this:
[resource]<label>[resource]<label>[resource]<label>... including braces as one huge line so I can read it fast into memory.
I search only in the labels and return the labels and resources where a label contains one or more of the key words. Both the labels and the key words are in lower case.
Currently I load the whole file and create a list of Strings. Each entry in this list contains a pair of resource and label in the format [resource]<label>. The size of this list is approximately 3,000,000. I "iterate" through this list with a tail-recursive function and check whether each label contains one of the key words. This is quite fast (<800 ms), but the search needs a lot of memory and CPU power.
My search function looks like this:
import scala.annotation.tailrec

@tailrec
def search2(l: List[String], list: List[(String, String)]): List[(String, String)] = {
  l match {
    case Nil => list
    case a :: as => {
      val found = keyWords.foldRight(List.empty[(String, String)]) { (x, y) =>
        if (a.contains(x)) {
          val split = a.split("<")
          if (split.size == 2) { (split(0).replace("[", "").replace("]", ""), split(1)) :: y }
          else { y }
        } else { y }
      }
      search2(as, found ::: list)
    }
  }
}
search2(buffer, Nil) //buffer is the list with my 3,000,000 elements
The search needs to be really fast (< 2 seconds). I already tried a MappedByteBuffer, but the UTF-8 encoding made it quite difficult to search for a byte sequence and it was really slow (but maybe my search function was just bad).
If needed I could change the format or even split labels and resources into two different files.
You do not need to reparse the file every time you search for an element.
Read your file once and for all and put the words in a Map[String, Set[String]].
Something like:
import scala.io.Source

// Built once at startup: label -> resources carrying that label
val allWords: Map[String, Set[String]] =
  extractLabelResources(Source.fromFile(file).getLines().next())
    .groupBy { case (label, _) => label }
    .map { case (label, pairs) => label -> pairs.map(_._2).toSet }

def extractLabelResources(line: String): Array[(String, String)] = {
  // ...
}

def search(word: String): Set[String] = allWords.getOrElse(word, Set.empty)
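To make the "build once, look up many times" idea concrete, here is a minimal sketch in plain Java. It is an assumption-laden illustration rather than a drop-in replacement: the LabelIndex class is invented, it indexes each whitespace-separated word of a label (so it finds whole words only, unlike the substring check with contains in the question), and it assumes that resources never contain ']' and labels never contain '>'.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LabelIndex {
    // One [resource]<label> entry; group 1 is the resource, group 2 the label.
    private static final Pattern ENTRY = Pattern.compile("\\[([^\\]]*)\\]<([^>]*)>");

    // word -> all "[resource]<label>" entries whose label contains that word
    private final Map<String, Set<String>> entriesByWord = new HashMap<>();

    // Parses the whole data string once and builds the inverted index.
    LabelIndex(String data) {
        Matcher m = ENTRY.matcher(data);
        while (m.find()) {
            String entry = m.group();
            for (String word : m.group(2).split("\\s+")) {
                if (!word.isEmpty()) {
                    entriesByWord.computeIfAbsent(word, k -> new LinkedHashSet<>()).add(entry);
                }
            }
        }
    }

    // A single hash lookup per key word; no scan over the entries.
    Set<String> search(String word) {
        return entriesByWord.getOrDefault(word, Collections.emptySet());
    }
}
```

The index costs extra memory up front, but each search becomes a map lookup instead of a pass over all 3,000,000 entries. If substring matches are required, a different structure would be needed.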
I want to replace i# (look at the example inputs/outputs) with the value of an array, but I'm not sure how to do this with Java+Regex.
Assume you have an array with: [3,2,1,0]
Example inputs:
i0
i1^2
(i1+2)+5
2*5+i1
i1+i2-i3
1+2
Example output:
3 [why? input is i0 and index 0 = 3 in the array]
2^2
(2+2)+5
2*5+2
2+1-0
1+2
Regex is here:
http://rubular.com/r/KXbCQnbs8K
REGEX = i{1}(\d+)
Code:
private String replace(String input){
    StringBuffer s = new StringBuffer();
    Pattern regex = Pattern.compile(REGEX);
    Matcher m = regex.matcher(input);
    if( !m.find(0) ){
        return input;
    }else{
        m.reset();
    }
    while (m.find() ){
        m.appendReplacement(s, getRealValue(m.group(1)) );
    }
    return s.toString();
}
private String getRealValue(String val){
    int value = Integer.parseInt(val);
    return String.valueOf(array.get(value));
}
Assume i#s given are always valid. My code works for some cases, but fails in most. Any help? Thanks!
EDIT:
I'm not sure how to tell it to add the last part (for example: +5 in i0+5).
i0 -- works
i1^2 -- doesn't work
(i1+2)+5 -- doesn't work
2*5+i1 -- works
i1+i2-i3 -- doesn't work
1+2 -- works
1+i2 -- works
I want to modify the regex to "i{1}(\d+)(.*)"
if(lastMatch()){ //if last match is true
s += m.group(2) //concat the last group (ie. "+5" in "i0+5")
}
But I don't know the correct syntax for that.
So it fails... but what is wrong about the output that is being produced? That's a very useful bit of information that you've neglected to mention. In the future you should try to think more about why the output is wrong; this will help you figure out what the program is doing wrong.
But by the looks of it you've forgotten to use Matcher.appendTail(StringBuffer) after you've done all the replacements. appendTail appends any remaining characters after the last match, e.g. "i0[this bit]".
I assume the wrong output was
i1^2 -> 2
(i1+2)+5 -> (2
Looking at this output, it would have been much faster to figure out what was going wrong: the code forgets to add the last bit of the string at the end. From there you can either find a way to fix it yourself or read the API docs to see whether there's a method that does it in one simple step.
Example code
while (m.find() ){
    m.appendReplacement(s, getRealValue(m.group(1)) );
}
m.appendTail(s); // you missed out this line
return s.toString();
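Put together as a self-contained sketch, the corrected method could look like the following. The IndexReplacer class and its constructor are assumptions made so that the example compiles on its own; the array is the [3,2,1,0] from the question. Matcher.quoteReplacement is an extra precaution, not part of the original code, so that a replacement value is never interpreted as a group reference.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IndexReplacer {
    private static final Pattern REGEX = Pattern.compile("i(\\d+)");
    private final List<Integer> array;

    IndexReplacer(List<Integer> array) {
        this.array = array;
    }

    // Replaces every i<n> with array[n] and keeps all other text unchanged.
    String replace(String input) {
        StringBuffer s = new StringBuffer();
        Matcher m = REGEX.matcher(input);
        while (m.find()) {
            int index = Integer.parseInt(m.group(1));
            String value = String.valueOf(array.get(index));
            m.appendReplacement(s, Matcher.quoteReplacement(value));
        }
        m.appendTail(s); // copies whatever follows the last match
        return s.toString();
    }

    public static void main(String[] args) {
        IndexReplacer r = new IndexReplacer(Arrays.asList(3, 2, 1, 0));
        System.out.println(r.replace("(i1+2)+5")); // prints (2+2)+5
    }
}
```

Because appendTail also handles the case with no matches at all, the initial find(0)/reset() check from the question is no longer needed.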