Finding multiple groups in Java regex for simple option parser - java

I need to modify this regex to find multiple group matches:
(?:--)(?<key>[^\s=]+)(?:(?<assign> *[ =] *)(?! --)(?<value>"[^"]*"|\S+))?
In Java:
"(?:--)(?<key>[^\\s=]+)(?:(?<assign> *[ =] *)(?! --)(?<value>\"[^\"]*\"|\\S+))?"
This matches the following correctly:
--key=value
--key=--value
--key value
--flag
--key="--value"
--key "--value"
--key=value --foo=bar
--key=value --foo=bar --flag
But it fails if --flag comes before any other options:
--key=value --flag --foo=bar
I've been trying to modify the negative lookahead between the assign and value capture groups without success so far. The value captured for flag ends up being --foo=bar instead of null.
Any expert recommendations on how to solve this?

I managed to fix the regex. The website https://regexr.com/ was invaluable.
The fixed regex is:
(?<prefix>--)(?<key>[^\s=]+)(?:(?! --)(?<assign> *[ =] *)(?! --)(?<value>"[^"]*"|\S+))?
Here's the Java class and unit test:
https://gist.github.com/kirklund/845baf340a1999a57db9e59e6ba40ce0
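For readers who cannot open the gist, here is a minimal sketch of how the fixed regex might be applied with Matcher.find() in a loop. The class name OptionParserSketch and the method parse are made up for illustration and are not taken from the gist; a flag without a value is assumed to map to null, as described in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the class from the gist.
class OptionParserSketch {
    // The fixed regex from the answer, escaped for a Java string literal.
    static final Pattern OPTION = Pattern.compile(
        "(?<prefix>--)(?<key>[^\\s=]+)"
        + "(?:(?! --)(?<assign> *[ =] *)(?! --)(?<value>\"[^\"]*\"|\\S+))?");

    // Collects options in order of appearance; a bare flag maps to null.
    static Map<String, String> parse(String commandLine) {
        Map<String, String> options = new LinkedHashMap<>();
        Matcher m = OPTION.matcher(commandLine);
        while (m.find()) {
            // group("value") is null when the optional part did not participate.
            options.put(m.group("key"), m.group("value"));
        }
        return options;
    }

    public static void main(String[] args) {
        // prints {key=value, flag=null, foo=bar}
        System.out.println(parse("--key=value --flag --foo=bar"));
    }
}
```

Note that a quoted value keeps its surrounding quotes in the value group, so the caller would have to strip them if that is not wanted.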


Regular Expression : Multiline check problem

Hello, I have a problem with this regexp.
!
interface TenGigabitEthernet 1/49
description Uplink
no ip address
switchport
no shutdown
!
interface TenGigabitEthernet 1/50
no ip address
shutdown
!
interface TenGigabitEthernet 1/51
no ip address
shutdown
!
I tried this regexp (interface) ((.\s.)+) but it is not working, because it matches "interface" and all the rest of the text.
I need to capture "interface" in the first group and, in the second group, everything up to the first occurrence of "!".
so for example:
first group:
interface
second group:
TenGigabitEthernet 1/51
no ip address
shutdown
How can I do this?
Try this:
(interface)\s+([^!]+)
Use this:
(interface)\s*([^!]+) /g
The first group captures the hard-coded interface. The second group skips any leading whitespace and then captures everything up to the next !. The global flag /g returns all matches; Java has no /g flag, so there you call Matcher.find() in a loop instead.
If the content itself can contain a !, you could check for a ! at the start of the line and repeat matching all lines until you encounter a ! at the start.
^(interface)\s*(.*(?:\n(?!!).*)*)
In Java
String regex = "^(interface)\\s*(.*(?:\\n(?!!).*)*)";
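A minimal sketch of how that pattern might be used to collect every block. It assumes Pattern.MULTILINE is set, so that ^ matches at the start of each line rather than only at the start of the input, and that lines are separated by \n. The class name InterfaceBlocksSketch and the method bodies are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class InterfaceBlocksSketch {
    // MULTILINE lets ^ match at the beginning of every line.
    static final Pattern BLOCK = Pattern.compile(
        "^(interface)\\s*(.*(?:\\n(?!!).*)*)", Pattern.MULTILINE);

    // Returns group 2 of every match: the text after "interface"
    // up to, but not including, the next line that starts with "!".
    static List<String> bodies(String config) {
        List<String> result = new ArrayList<>();
        Matcher m = BLOCK.matcher(config);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String config = "!\n"
            + "interface TenGigabitEthernet 1/50\nno ip address\nshutdown\n!\n"
            + "interface TenGigabitEthernet 1/51\nno ip address\nshutdown\n!\n";
        for (String body : bodies(config)) {
            System.out.println("[" + body + "]");
        }
    }
}
```

Input with Windows line endings (\r\n) would need the pattern adjusted, because . does not match \r by default.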

Jenkins Console section: What Java regex will trigger on string ERROR but not on string %%ERRORLEVEL%%?

I am using the Jenkins console sections plugin [1] on a Windows server. It is excellent for building a nice left navbar for my logs.
On the positive side, I would like any error message to produce a section header, e.g.:
Assert-PathExstsNotTooLong : ERROR, The path does not exist: E:\P...
...
Oops! Error, please do not do that.
On the negative side, I would like to prevent spelled-out execution templates, such as the one below, from producing a new section header.
[workspace] $ cmd.exe /C " c:\Windows\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe /p:Configuration=Debug /p:VisualStudioVersion=12.0 "E:\Program Files (x86)\Jenkins\jobs\M.sln"
Using references here on SO [2] and on the tester you recommended [3], I came up with the following, but it is not working:
^(?=(.*([Ee][Rr][Rr][Oo][Rr] ).*))(?!(%%ERRORLEVEL%%))
Using Regex101's amazing tester, with JS flavor, I used the above as input and had these test strings and outputs. The second line of match info perhaps explains my issue but I do not understand it.
test-strings =
help error you should see me
i am %%errorlevel%% again
i am not a section
match-info;
1. `help error you should see me`
2. `error `
Any tips?
thank you!
[1] This plugin uses Java regex, per its docs: Collapsing Console Sections Plugin, Jenkins Wiki, https://wiki.jenkins-ci.org/display/JENKINS/Collapsing+Console+Sections+Plugin
[2] An example regex on characters, not strings, to avoid: "Regular expression include and exclude special characters", Stack Overflow
[3] Online regex tester and debugger (JavaScript, Python, PHP, and PCRE): https://www.regex101.com/#javascript
(I can't add comments yet, otherwise I'd ask directly, but your example of a spelled-out message template doesn't include the text %%ERRORLEVEL%%, but I assume that it's meant to be a string with %%ERRORLEVEL%% somewhere in the middle of it. Also, as the example isn't quite right, I can't tell exactly what you mean by "not working")
Your problem is that your regex matches ERROR_ (with a space) anywhere in the text, except where the text starts with %%ERRORLEVEL%%, because the negative lookahead is only tested at the start of the line. I think that instead you could write:
^(?=(.*([Ee][Rr][Rr][Oo][Rr])))(?!.*(%%ERRORLEVEL%%)).*
Do you really need to only match ERROR_ (with a space) as opposed to ERROR (whether or not it has a space)? If the former, then you are already excluding %%ERRORLEVEL%%, and you could just use .*(?i:ERROR ).* as the full regex.
The Collapsing Console Sections Plugin uses Java regular expressions, so you can use (?i:ERROR) to match ERROR case-insensitively.
You need a .* in front of %%ERRORLEVEL%% inside the negative lookahead; otherwise it only rejects lines that begin with %%ERRORLEVEL%%.
The documentation for the plugin doesn't say whether the pattern has to match a line completely, or if it just matches text within the line. If it matches the line completely, the leading ^ is unnecessary, but won't be doing any harm.
You've got capturing brackets around ERROR and %%ERRORLEVEL%%. If you're not doing anything with that text, then those brackets are unnecessary.
The following regex will match any line with any of ERROR, Error, error etc in it, except lines with any of %%ERRORLEVEL%%, %%ErrorLevel%%, %%errorlevel%% etc.
^(?=.*(?i:ERROR))(?!.*(?i:%%ERRORLEVEL%%)).*
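A quick way to check this regex outside Jenkins is a small Java sketch like the one below. It assumes the plugin tests the pattern against each console line as a whole (Matcher.matches()), which the plugin documentation does not confirm. The class name ErrorSectionSketch and the method startsSection are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical test harness; not part of the Jenkins plugin.
class ErrorSectionSketch {
    static final Pattern SECTION = Pattern.compile(
        "^(?=.*(?i:ERROR))(?!.*(?i:%%ERRORLEVEL%%)).*");

    // Assumption: the pattern has to match the whole line.
    static boolean startsSection(String line) {
        return SECTION.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] lines = {
            "help error you should see me",
            "i am %%errorlevel%% again",
            "i am not a section",
            "Oops! Error, please do not do that."
        };
        for (String line : lines) {
            System.out.println(startsSection(line) + "  " + line);
        }
    }
}
```

Because both lookaheads are anchored at the start of the line and the trailing .* consumes the rest, the result should be the same if the plugin uses find() instead.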

Regex exclude specific subdomain (Java)

I want to exclude a specific subdomain from a regex.
I have searched and tried out different regexes, but none worked for me.
The normal regex looks like this:
https?:\/\/((localhost(\:\d+)?)|([a-z\-\.]*\.)?(gaga.ch|gugus.ch))
To exclude a subdomain named admin in gugus.ch I added: (^(?!.*admin).*)
So the whole regex looks like:
https?:\/\/((localhost(\:\d+)?)|([a-z\-\.]*\.)?(gaga.ch|(^(?!.*admin).*)gugus.ch))
So it should let through http://www.gugus.ch
But NOT http://admin.gugus.ch
This does not work. What am I doing wrong?
thx Mike
Try this regex:
https?://((localhost(:\d+)?)|([a-z.-]*\.)?(gaga\.ch|(?<!\badmin\.)gugus\.ch))
(?<!\badmin\.) is a negative lookbehind to fail the match if gugus.ch is preceded by admin.
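A minimal Java sketch of the lookbehind approach, using the answer's regex with the outer group closed so that the parentheses balance. It assumes the whole URL string is tested with Matcher.matches(); the class name AllowedOriginSketch and the method isAllowed are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class AllowedOriginSketch {
    static final Pattern ALLOWED = Pattern.compile(
        "https?://((localhost(:\\d+)?)|([a-z.-]*\\.)?"
        + "(gaga\\.ch|(?<!\\badmin\\.)gugus\\.ch))");

    // Assumption: the input is just scheme and host, with no path.
    static boolean isAllowed(String url) {
        return ALLOWED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("http://www.gugus.ch"));    // true
        System.out.println(isAllowed("http://admin.gugus.ch"));  // false
    }
}
```

The \b inside the lookbehind means that only a label that is exactly admin is rejected; a host such as xadmin.gugus.ch still passes.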

Regular Expression in burp

I'm using a forward proxy called Burp and would like to see only results from google in my site scope.
What regex should I use if I want to see *.google.* in my results?
So sample output can be
www.google.com
drive.google.com
google.in
and so on
This should work for you:
^.*?google\..*$
This matches anything before and after google.
^.*\.domain\.com$
^.*\.test\.domain\.com$
^ -> anchors the match at the start of the string
.* -> matches any sequence of characters
\. -> an escaped dot, which matches a literal .
$ -> anchors the match at the end of the string
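Burp is written in Java, so a scope pattern like the first one can be tried out with java.util.regex before pasting it into the tool. This is only a sketch: the class name GoogleScopeSketch and the method inScope are made up, and it assumes the pattern is tested against the bare host name.

```java
import java.util.regex.Pattern;

// Hypothetical check; this is not Burp's own API.
class GoogleScopeSketch {
    static final Pattern GOOGLE_HOST = Pattern.compile("^.*?google\\..*$");

    static boolean inScope(String host) {
        return GOOGLE_HOST.matcher(host).matches();
    }

    public static void main(String[] args) {
        System.out.println(inScope("www.google.com"));   // true
        System.out.println(inScope("google.in"));        // true
        System.out.println(inScope("example.com"));      // false
    }
}
```

The pattern does not require a dot in front of google, so a host such as notgoogle.com would also be in scope.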

RegEx to match ends with

I need to write regex in java to match domain and subdomain(.domain.com).
Regex should return true for
domain.com
m.domain.com
abc.domain.com
www.domain.com
but returns false for
abcdomain.com
1domain.com
I am trying to match domain.com and, if a preceding character is present, it must be a dot (.).
I tried various options but it is failing in one or other test cases.
(^|.*?\.)domain\.com
Try this. See demo.
http://regex101.com/r/lB2sH2/1
Try this:
(\.|^)domain\.com$
The first part means that domain.com must be preceded by a . or by the start of the string,
and the $ means "ends with".
You can try:
(^|\.)domain\.com$
but String.matches() and Matcher.matches() in Java require the whole input to match (Matcher.find() does not), so with those use:
(.+\.)?domain\.com
or you can use the endsWith() method in Java code:
if (domain.equals("domain.com") || domain.endsWith(".domain.com")) {
// do something...
}
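The three variants from this answer can be compared side by side with a small sketch like the following. The class name DomainCheckSketch and its method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the three approaches.
class DomainCheckSketch {
    // Used with matches(): the whole input has to match.
    static final Pattern WHOLE = Pattern.compile("(.+\\.)?domain\\.com");
    // Used with find(): the anchors are part of the pattern itself.
    static final Pattern SUFFIX = Pattern.compile("(^|\\.)domain\\.com$");

    static boolean byMatches(String host) {
        return WHOLE.matcher(host).matches();
    }

    static boolean byFind(String host) {
        return SUFFIX.matcher(host).find();
    }

    static boolean byEndsWith(String host) {
        return host.equals("domain.com") || host.endsWith(".domain.com");
    }

    public static void main(String[] args) {
        String[] hosts = {"domain.com", "m.domain.com", "abc.domain.com",
                          "www.domain.com", "abcdomain.com", "1domain.com"};
        for (String host : hosts) {
            System.out.println(host + ": " + byMatches(host) + " "
                + byFind(host) + " " + byEndsWith(host));
        }
    }
}
```

All three agree on the six hosts from the question; the endsWith() version avoids regex escaping entirely.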
I think you want something like this,
(?:\\w+\\.?)?domain\\.com
try this regex
\bdomain\.com$
http://rubular.com/r/QG0FtVWtm6
If you don't know what "domain.com" is going to be, this regex below should give you just the subdomain of whatever domain you are looking for. Matches your specifications, including domains that look like abc.net
([a-z]+)(?=\.[a-z]+\.)
