I am trying to extract patterns from a String that contains Elasticsearch's structured bulk requests. Here is an example:
index {[event_20191209][event][null], source[{"haha":"haha","jaja":"jaja"}]}, update {[event_20191209][event][xxx], doc_as_upsert[false], doc[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]}, delete {[event_20191208][_doc][sjdos]}, update {[event_20191209][event][yyy], doc_as_upsert[false], upsert[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]}
My goal is to match each separate request in the bulk request string, i.e., to get strings like:
index {[event_20191209][event][null], source[{"haha":"haha","jaja":"jaja"}]},
update {[event_20191209][event][xxx], doc_as_upsert[false], doc[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]},
delete {[event_20191208][_doc][sjdos]},
update {[event_20191209][event][yyy], doc_as_upsert[false], upsert[index {[null][_doc][null], source[{"haha":"haha","jaja":"jaja"}]}], scripted_upsert[false], detect_noop[true]}
My pattern is [a-z]+\s\{.+?\}[,\w\t\r\n]+? which works fine in a JavaScript-based online regular expression tester.
However, when I copied this pattern into my Java code, the output was not what I expected.
So I realized there are some differences between the JavaScript and Java regular expression engines, but after a lot of coding and googling I still cannot figure out how to update my expression so that it works in Java.
I would be grateful if someone could give me a hint.
After a short nap, I had an epiphany. I was a fool this morning...
The workaround is easy to implement: Elasticsearch already overrides toString() for us.
At first glance, I wouldn't suggest using regex right away. It looks like those lines follow some kind of pattern that you could parse and split up first.
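As a rough illustration of the "parse and split up first" idea, here is a minimal sketch that cuts the bulk string at commas sitting outside every {...} and [...] pair. The class name BulkRequestSplitter and the method split are made up for this example, and it assumes brackets are balanced and never occur inside quoted JSON values.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkRequestSplitter {

    // Splits on commas that are not nested inside any {...} or [...] pair.
    // Assumption: brackets are balanced and do not appear inside quoted values.
    static List<String> split(String bulk) {
        List<String> requests = new ArrayList<>();
        int depth = 0;  // current nesting level of braces/brackets
        int start = 0;  // start index of the request being scanned
        for (int i = 0; i < bulk.length(); i++) {
            char c = bulk.charAt(i);
            if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A top-level comma separates two requests.
                requests.add(bulk.substring(start, i).trim());
                start = i + 1;
            }
        }
        String tail = bulk.substring(start).trim();
        if (!tail.isEmpty()) {
            requests.add(tail);
        }
        return requests;
    }

    public static void main(String[] args) {
        String bulk = "index {[event_20191209][event][null], source[{\"haha\":\"haha\"}]}, "
                + "delete {[event_20191208][_doc][sjdos]}";
        split(bulk).forEach(System.out::println);
    }
}
```

Each returned element is one request without the trailing comma; a regex can then be applied per request if it is still needed.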
After that, if you're talking about regex, I'd try:
Taking a look at the Java regex format: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
How about using an online Java regex tool instead?
I'm using the IntelliJ IDEA Community Edition search-and-replace-across-files dialog to run the regex below.
I'm cleaning up a Java project that has recently moved to Java 11, and I would like to replace the following Java code
= new Integer(varname);
where varname can be a variable name or an int literal, so that the whole match is replaced and I end up with the following
= varname;
I tried to use IntelliJ find and replace across files using regex and came up with the following
new\sInteger\(([a-zA-Z0-9]+)\)
This seems to work and finds the right matches, but when I use the backreference
$1
I actually get
= new varname;
where I would have expected the 'new ' keyword and space to be overwritten along with 'Integer('.
I have trawled through the answers to similar questions and experimented with grouping and boundaries, all to no avail. Can anyone help me remove the 'new ' characters?
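For comparison, here is a minimal sketch of the same pattern and the $1 backreference in plain Java (java.util.regex via String.replaceAll), where the replacement covers the entire match, including 'new '. The class name IntegerBoxingCleanup and the method unwrap are invented for this example, and it does not diagnose what the IntelliJ dialog is doing.

```java
public class IntegerBoxingCleanup {

    // Same pattern as in the question, written as a Java string literal.
    private static final String BOXING = "new\\sInteger\\(([a-zA-Z0-9]+)\\)";

    // Replaces every "new Integer(x)" with just "x" (group 1).
    static String unwrap(String source) {
        return source.replaceAll(BOXING, "$1");
    }

    public static void main(String[] args) {
        System.out.println(unwrap("Integer count = new Integer(varname);"));
        System.out.println(unwrap("Integer limit = new Integer(42);"));
    }
}
```

If the IDE result still keeps 'new ', it may be worth checking that the search field contains the full pattern starting at 'new' and that regex mode is enabled for both search and replace.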
I am trying to create a regular expression that finds all System.out.println or print calls in all my Java classes.
I need to delete all of this output.
Thanks.
"System\\.out\\.println"
Use this regex and replace all matches with an empty string.
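Note that the pattern above matches only the method name, so the argument list and the semicolon would be left behind. A possible variant that removes the whole statement is sketched below; the class name PrintlnStripper and the method strip are hypothetical, and the sketch assumes each call sits alone on a single line and ends with ");".

```java
public class PrintlnStripper {

    // (?m) lets ^ match at the start of every line.
    // Assumption: each print/println call is on one line by itself and ends with ");".
    private static final String PRINT_STATEMENT =
            "(?m)^[ \\t]*System\\.out\\.print(?:ln)?\\(.*\\);[ \\t]*\\r?\\n?";

    // Removes whole System.out.print / println statements, including their line break.
    static String strip(String source) {
        return source.replaceAll(PRINT_STATEMENT, "");
    }

    public static void main(String[] args) {
        String code = "int a = 1;\n"
                + "    System.out.println(\"a=\" + a);\n"
                + "return a;\n";
        System.out.print(strip(code));
    }
}
```

Calls that span several lines, or that share a line with other statements, are not handled by this sketch and would need a real parser or manual review.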
Not exactly what you asked for, but if you are using Eclipse for development, this would be your best approach.
If you want to search in files: press Ctrl+H and choose the File Search tab. Enter your search term and it shows all the files where sysout is used. Hope it helps...
I am working on a plugin that will parse HTML files. I have a naming convention like this:
<!--$include="a.html" -->
or
<!--$include="a.html"-->
(both forms should be treated the same)
According to this pattern (similar to server-side includes), I want to search an HTML file.
The question is:
How do I find that pattern and get its value (a.html in my example; it varies)?
It should work like this:
while (!finishedWholeFile) {
    fileName = findPatternFunc(htmlFile);
    replaceFunc(fileName, something);
}
PS: I don't know whether using regex in Java or implementing it differently (e.g. with .indexOf()) is better. If regex performs well in this situation, I want to use it.
Any ideas?
You mean like this?
<!--\$include=\"(?<htmlName>[a-z_-]*)\.html\"\s?-->
Read the file into a string; then
str = str.replaceAll("(?<=<!--\\$include=\")[^\"]+(?=\" ?-->)", something);
will replace the filenames with the string something, then the string can be written back to the file.
(Note: this replaces any text inside the double quotes, not just valid filenames.)
If you only want to replace filenames with the html extension, swap the [^\"]+ for [^.]+\\.html.
Using regex for this task is fine performance-wise, but see e.g. "How to use regular expressions to parse HTML in Java?" and "Java Regex performance".
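To extract the included file names rather than replace them, a Matcher loop is one option. The sketch below is only an illustration under the naming convention from the question; the class name IncludeFinder and the method findIncludes are made up, and the capture group accepts any text between the double quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IncludeFinder {

    // Matches <!--$include="name" --> and <!--$include="name"-->; group 1 is the name.
    private static final Pattern INCLUDE =
            Pattern.compile("<!--\\$include=\"([^\"]+)\"\\s?-->");

    // Returns every included file name, in document order.
    static List<String> findIncludes(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = INCLUDE.matcher(html);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<div><!--$include=\"a.html\" --></div><!--$include=\"b.html\"-->";
        findIncludes(html).forEach(System.out::println);
    }
}
```

Compiling the Pattern once and reusing it avoids recompiling the expression for every file.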
I ended up using this pattern:
"<!--\\$include=\"(.+)(.)(html|htm)\"-->"
Is there a tool to convert a regex from one popular language's syntax to another? For example, a Python-style regex to a Java-style regex?
Or at least, has someone put together a set of rules for doing these conversions?
Obviously, some constructs cannot be converted.
Go to this article and follow the link to "Regex info's comparison of Regex flavors". That led me to a tool called RegexBuddy, which sounds like it might do what you want.
Yes, there is a Windows tool that will do this: RegexBuddy.