I am totally new to regex and I want to validate that a string uses only valid combinations of the logical operators ! , & , ( , ) and | . For example, if & and | appear together the string should be invalid, because AND and OR should not come together. Likewise, other invalid combinations are &| , |& , () , !& , &! etc.
For example, given the strings below:
1. (ABC)&(DFG)|!(ZXC) - pass because all operators are correctly combined
2. !(ABC|DKJ)&VBN - pass
3. !(ADF&(!&(BER|CTY))|DGH) - fail because !& is combined
4. !(ABC&DKJ)&|VBN - fail because &| is combined
I know there are several ways to do this; for example, I can use String's contains method to check the input and reject it if it does not pass validation. But I am looking for a solution through regex in Java.
To reject invalid operator combinations you can use a negative lookahead regex like this:
^(?!.*?(&\\||\\|&|\\(\\)|!&|&!))
For multiline input, use it with the MULTILINE option like this:
Pattern p = Pattern.compile( "(?m)^(?!.*?(&[!|]|[!(|]&|\\(\\)))" );
RegEx Demo
To use it with a single string input you can do:
boolean value = input.matches( "(?!.*?(&[!|]|[!(|]&|\\(\\))).+" );
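As a quick cross-check outside Java, the first pattern above can be tried against the four sample strings from the question. This is only a minimal Python sketch: the helper name is_valid is hypothetical, and the pattern is written without Java's doubled backslashes, with .+$ appended so that it consumes the whole input.

```python
import re

# First pattern from the answer, minus Java string escaping.
# The negative lookahead fails as soon as any forbidden pair occurs anywhere.
INVALID_COMBO = re.compile(r"^(?!.*?(&\||\|&|\(\)|!&|&!)).+$")

def is_valid(expr: str) -> bool:
    """Hypothetical helper: True when no forbidden operator pair is present."""
    return INVALID_COMBO.match(expr) is not None

samples = [
    "(ABC)&(DFG)|!(ZXC)",        # True
    "!(ABC|DKJ)&VBN",            # True
    "!(ADF&(!&(BER|CTY))|DGH)",  # False, contains !&
    "!(ABC&DKJ)&|VBN",           # False, contains &|
]
for s in samples:
    print(s, is_valid(s))
```

Note that this only rejects the listed adjacent pairs; it does not check that parentheses are balanced.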
I'm using Grails for my web app project. I know the createCriteria method can perform search on existing entries in database. Let's say I have a domain "some_domain" which includes a string variable "domain_string". I want to find out all "domain_strings" that contain either a 7-digit or 10-digit number starting with "1" or "7". (e.g. domain_string1 = ".........1234567.......", domain_string2 = ".......7192839265......", etc)
In my code:
some_domain.createCriteria().list() {
rlike("domain_string", "%/^(1|7){7,10}/%")
}
I've used a Java regex here, and the Grails docs tell me that rlike is for regex input. But I can't get the expected output from this code because I'm not familiar with the Groovy syntax. Any suggestions? Thanks a lot in advance.
You can use
rlike("domain_string", /([^0-9]|^)[17][0-9]{6}([0-9]{3})?([^0-9]|$)/)
See the regex demo.
Details:
([^0-9]|^) - either a non-digit char or start of string
[17] - 1 or 7
[0-9]{6} - any six digits
([0-9]{3})? - an optional occurrence of three digits
([^0-9]|$) - either a non-digit char or end of string.
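To see how the pieces above fit together, here is a small Python sketch that runs the same pattern over a few strings. The helper name has_target_number and the 8-digit sample are made up for illustration, and the database's own regex engine may behave differently from Python's re module.

```python
import re

# Same pattern as in the rlike call above.
PATTERN = re.compile(r"([^0-9]|^)[17][0-9]{6}([0-9]{3})?([^0-9]|$)")

def has_target_number(s: str) -> bool:
    """Hypothetical helper: True if s contains a standalone 7- or 10-digit
    number that starts with 1 or 7."""
    return PATTERN.search(s) is not None

for s in [
    ".........1234567.......",     # 7 digits starting with 1
    ".......7192839265......",     # 10 digits starting with 7
    ".......3192839265......",     # starts with 3
    ".......12345678.......",      # 8 digits, neither 7 nor 10
]:
    print(s, has_target_number(s))
```

The non-digit guards on both sides are what keep an 8-digit run from being accepted as a 7-digit number followed by one extra digit.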
A Groovy regex following native Java rules would look like:
def RE = /\D*[17]\d+\D*/
def domain_strings = [ ".........1234567.......", ".......7192839265......", ".......3192839265......", ".......4192839265......" ]
domain_strings.each{
boolean match = it ==~ RE
println "$it matches? -> $match"
}
prints:
.........1234567....... matches? -> true
.......7192839265...... matches? -> true
.......3192839265...... matches? -> false
.......4192839265...... matches? -> false
You should check whether your database's SQL dialect can consume such expressions as-is.
I have a problem matching groups that contain lookahead expressions. I don't know why this expression doesn't work:
"""((?<=^)(.*)(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%))((?<=[\w:]\s)(\w+)(?=\s[cr]))"""
When I compile them separately, for example:
"""(?<=^)(.*)(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%)"""
I get the correct result
My sample text:
May 5 23:00:01 10.14.3.10 %ASA-6-302015: Built inbound UDP connection
Expressions have been checked with this tool: http://regex-testdrive.com/en/dotest
My Scala code:
import scala.util.matching.Regex
val text = "May 5 23:00:01 10.14.3.10 %ASA-6-302015: Built inbound UDP connection"
val regex = new Regex("""((?<=^)(.*)(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%))((?<=[\w:]\s)(\w+)(?=\s[cr]))""")
val result = regex.findAllIn(text)
Does anyone know a solution to this problem?
Multiple matching
You may fix the pattern as
^.*?(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%)|(?<=[\w:]\s)\w+(?=\s[cr])
See the regex demo. The main point is to introduce the | alternation operator to match either of the 2 subpatterns. Note that you do not need to put the ^ start of string anchor into a lookbehind, as ^ is already a zero-width assertion. Also, there are too many groupings that you do not seem to use anyway. Also, to match a literal dot you need to escape it (. -> \.).
To obtain the multiple matches, you may use the following code snippet:
val text = "May 5 23:00:01 10.14.3.10 %ASA-6-302015: Built inbound UDP connection"
val regex = """^.*?(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%)|(?<=[\w:]\s)\w+(?=\s[cr])""".r
val result = regex.findAllIn(text)
result.foreach { x => println(x) }
// => May 5 23:00:01
// UDP
See the Scala online demo.
Note that a pattern used with .findAllIn is not anchored by default, so you will get all the matches there are in the input string.
Capturing groups
Another approach you may use is matching the whole line while capturing the necessary bits with capturing groups:
val text = "May 5 23:00:01 10.14.3.10 %ASA-6-302015: Built inbound UDP connection"
val regex = """^(.*?)\s+\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%.*[\w:]\s+(\w+)\s+[cr].*""".r
val results = text match {
case regex(date, protocol) => Array(date, protocol)
case _ => Array[String]()
}
// Demo printing
results.foreach { m =>
println(m)
}
See another Scala demo. Since match requires a full string match, .* is added at the end of the pattern, and only relevant pairs of unescaped (...) are kept in the pattern. See the regex demo here.
Your matches are not next to each other in the text, so the pattern needs something to consume the characters between them.
Try this:
"""((?<=^)(.*)(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%)).*((?<=[\w:]\s)(\w+)(?=\s[cr]))"""
I just added the .* between them, it works on the link you sent :)
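To illustrate why the extra .* helps, here is a minimal sketch of the same pattern in Python (not the Scala from the question). The variable names date and protocol are only labels chosen for this example; groups 2 and 4 are the inner (.*) and (\w+) groups.

```python
import re

text = "May 5 23:00:01 10.14.3.10 %ASA-6-302015: Built inbound UDP connection"

# Pattern from the answer above, split over three literals for readability.
pattern = re.compile(
    r"((?<=^)(.*)(?=\s\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s%))"
    r".*"  # consumes the text between the two pieces of interest
    r"((?<=[\w:]\s)(\w+)(?=\s[cr]))"
)

m = pattern.search(text)
date, protocol = m.group(2), m.group(4)
print(date)      # May 5 23:00:01
print(protocol)  # UDP
```

Without the middle .*, the second group would have to start exactly where the first one ends, which never happens in this log line.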
I have a string which I want to parse via a Java or Python regexp:
something (\var1 \var2 \var3 $var4 #var5 $var6 *fdsfdsfd #uytuytuyt fdsgfdgfdgf aaabbccc)
The number of variables is unknown, and so are their exact names. A name may or may not start with "\", "$", "*" or "#", and the names are delimited by whitespace.
I'd like to parse them separately, that is, in capture groups, if possible. How can I do that? The output I want is a list of:
[\var1 , \var2 , \var3 , $var4 , #var5 , $var6 , *fdsfdsfd , #uytuytuyt , fdsgfdgfdgf , aaabbccc]
I don't need the java or python code, I just need the regexp. My incomplete one is:
something\s\(.+\)
something\s\((.+)\)
In this regex you are capturing the string containing all the variables. Split it on whitespace, since you are sure that they are delimited by whitespace.
import re

m = re.search(r'something\s\((.+)\)', input_string)
if m:
    # group(1) holds everything between the parentheses
    list_of_vars = m.group(1).split()
My string is huge and keeps changing as I read each string in a loop. It can contain any characters, such as " , / , \ , . , $ , ? , [ , & , ' , ) , % , ^ , + , * etc. I would like to escape all characters that might cause a regex built from this string to fail in Java. One post shows something like this for JavaScript:
return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
Is there something similar for Java? I'm not sure what should be the character set to escape. Would something like str.replaceAll("[^\u0000-\u00ff]+", " ") do that? (But I'm losing data here if I'm replacing ALL of them with a space, which I want to avoid)
Use this:
String myEscapedString = Pattern.quote(myRawString);
I need a valid regexp for emails separated by " " that end with #a.com or #b.com,
for example:
valid email string: "email1#a.com email2#b.com email3#a.com"
invalid email string: "email1#a.com email2#b.com email3#c.com"
I don't necessarily think a regexp is an extensible and maintainable solution here. I would rather:
split the list on whitespace (perhaps on whitespace preceded by a .com/.org etc.)
extract the domain name post-#
compare this vs. a whitelist (or blacklist)
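The three steps above can be sketched roughly as follows. This is only an illustration in Python: the function name all_domains_allowed and the ALLOWED_DOMAINS set are hypothetical, and the # separator is taken verbatim from the examples in the question.

```python
# Assumed whitelist, based on the two domains named in the question.
ALLOWED_DOMAINS = {"a.com", "b.com"}

def all_domains_allowed(email_string: str, separator: str = "#") -> bool:
    """Hypothetical helper: True if every whitespace-separated address
    has a non-empty local part and a whitelisted domain."""
    addresses = email_string.split()          # step 1: split on whitespace
    if not addresses:
        return False
    for address in addresses:
        # step 2: everything after the first separator is the domain
        local, sep, domain = address.partition(separator)
        # step 3: compare against the whitelist
        if not local or not sep or domain not in ALLOWED_DOMAINS:
            return False
    return True

print(all_domains_allowed("email1#a.com email2#b.com email3#a.com"))  # True
print(all_domains_allowed("email1#a.com email2#b.com email3#c.com"))  # False
```

Adding or removing a domain is then a change to the set, not to a pattern.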
I like regexps a lot, but I don't always think they're the solution. See here for a discussion on this, and note the below!
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
You can try this expression:
^(( |^)[^ #]+#[ab]\.com)+$
//  ^    ^   ^ ^    ^
//  |    |   | |    +- The mandatory .com
//  |    |   | +------ Either a or b
//  |    |   +-------- An # sign
//  |    +------------ Anything but space or # repeated at least once
//  +----------------- Preceded by a space or the beginning of line
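For a quick sanity check, the expression above can be run against the two strings from the question. This is a minimal sketch in Python; the helper name is_valid_list is made up for the example.

```python
import re

# Same expression as above: one or more addresses, each preceded by a
# space or the start of the string, each ending in #a.com or #b.com.
EMAIL_LIST = re.compile(r"^(( |^)[^ #]+#[ab]\.com)+$")

def is_valid_list(s: str) -> bool:
    """Hypothetical helper wrapping the pattern."""
    return EMAIL_LIST.search(s) is not None

print(is_valid_list("email1#a.com email2#b.com email3#a.com"))  # True
print(is_valid_list("email1#a.com email2#b.com email3#c.com"))  # False
```

Because the whole string is anchored with ^ and $, a single bad address makes the entire list fail.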
Try this:
^([^\s#]+#(a|b)\.com(\s|$))+$
Permalink - try entering an invalid string like "c.com" and see that it works too
Regexpal is a nice easy tool to start working on making regex for whatever problem you are trying to solve!
(email[1-3]\#[ab]\.com )*email[1-3]\#[ab]\.com ?
(replace [1-3] and [ab] with whatever really suits you).
[A-Za-z0-9_.-]+#[ab]\.com( [A-Za-z0-9_.-]+#[ab]\.com)*
You can change the [A-Za-z0-9_.-]+ part if you want to be more restrictive.