Regex to match a text with special chars in it - java

I need a regex to match a text with the special chars -,.+\/& in it. There must not be more than 2 special chars in a row, and a special char cannot be followed by a space. More specifically, I have to cover these cases:
some text/
/some text
some /text
I came up with this regex:
^[-\/,\.+\&]{0,1}[\p{L}]+[-\/,\.+\&]{0,1}([\s\-']?[-\/,\.+\&]{0,1}[\p{L}]+)([-\/,\.+\&]{0,1})$
It matches most of the cases that I need, but fails to match, for instance:
some te&xt
Any help will be appreciated. Thanks.

You can use
"^(?!.*(?:[-,.+/&]\\s|[-,.+/&]{2}))[^\\s\\d]+(?:\\s+[^\\s\\d]+)*$"
Explanation:
^ - start of string
(?!.*(?:[-,.+/&]\\s|[-,.+/&]{2})) - a negative lookahead that fails the match if there is a special char [-,.+/&] followed with a whitespace \s, or 2 consecutive special chars from [-,.+/&] set
[^\\s\\d]+ - 1 or more characters other than digit and whitespace
(?:\\s+[^\\s\\d]+)* - 0+ sequences of:
\\s+ - 1+ whitespaces
[^\\s\\d]+ - 1 or more characters other than digit and whitespace
$ - end of string
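
As a minimal sketch of how the pattern above could be checked from Java (the class name SpecialCharTextCheck and the helper isValid are made up for illustration, and the last three sample inputs are assumptions chosen to exercise the rejected cases):

```java
import java.util.regex.Pattern;

final class SpecialCharTextCheck {
    // Pattern from the answer above, written as a Java string literal.
    private static final Pattern TEXT = Pattern.compile(
            "^(?!.*(?:[-,.+/&]\\s|[-,.+/&]{2}))[^\\s\\d]+(?:\\s+[^\\s\\d]+)*$");

    // Hypothetical helper: true when the whole input satisfies the pattern.
    static boolean isValid(String input) {
        return TEXT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "some text/", "/some text", "some /text", "some te&xt", // cases from the question
            "some, text",  // special char followed by whitespace
            "some//text",  // two consecutive special chars
            "some text1"   // contains a digit
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

Note that this pattern rejects any 2 consecutive special chars and any digit, which is how the answer reads the requirement; if 2 in a row should be allowed, the {2} quantifier would need to become {3}.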

I found the solution:
^[-\/,\.+\&\s]{0,1}([\p{L}][-\/,\.+\&\s]{0,1})+([-\/,\.+\&\s]{0,1}([\p{L}][-\/,\.+\&\s]{0,1})+)([\p{L}][-\/,\.+\&\s]{0,1})([-\/,\.+\&\s]{0,1})$

Related

Plus quantifier does not check the last character if last character has negated set

I need to match a string with the following constraints:
At least one alphanumeric character
Forbid specific characters (^*#:;)
Forbid dot at the end
I have the next pattern:
^[^*#:;]*[\p{Alnum}]+[^*#:;]*[^.*#:;]$
The problem is that when I have an alphanumeric character at the end, the string will not match the pattern.
For example:
$$$....1$ will match the pattern.
$$$....$1 will not.
As far as I understand, the problem is that [\p{Alnum}]+ does not check the last character.
Is there any possible way to do this with one regexp?
It seems the following should tick your boxes:
^(?=.*\p{Alnum})(?!.*[*#:;]).+(?<!\.)$
Where:
^ - Start string anchor.
(?=.*\p{Alnum}) - Positive lookahead to require at least one alphanumeric character.
(?!.*[*#:;]) - Negative lookahead to prevent any of the characters mentioned in the character class.
.+ - 1+ characters other than newline.
(?<!\.) - Negative lookbehind to make sure the last character is not a dot.
$ - End string anchor.
Alternatively use a negated character class as you were doing instead of the negative lookahead:
^(?=.*\p{Alnum})[^*#:;\n]+(?<!\.)$
^ - Start string anchor.
(?=.*\p{Alnum}) - Positive lookahead to require at least one alphanumeric character.
[^*#:;\n]+ - 1+ characters other than those mentioned in the character class.
(?<!\.) - Negative lookbehind to make sure the last character is not a dot.
$ - End string anchor.
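
Both variants can be tried from Java along these lines. This is only a sketch: the class and method names are invented for illustration, and the inputs beyond the two from the question are assumptions.

```java
import java.util.regex.Pattern;

final class AlnumNoTrailingDotCheck {
    // Variant 1: forbidden characters excluded with a negative lookahead.
    private static final Pattern WITH_LOOKAHEAD =
            Pattern.compile("^(?=.*\\p{Alnum})(?!.*[*#:;]).+(?<!\\.)$");

    // Variant 2: forbidden characters excluded with a negated character class.
    private static final Pattern WITH_NEGATED_CLASS =
            Pattern.compile("^(?=.*\\p{Alnum})[^*#:;\\n]+(?<!\\.)$");

    static boolean matchesLookahead(String s) {
        return WITH_LOOKAHEAD.matcher(s).matches();
    }

    static boolean matchesNegatedClass(String s) {
        return WITH_NEGATED_CLASS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // The first two inputs come from the question; the rest are illustrative.
        String[] samples = {"$$$....1$", "$$$....$1", "abc.", "a#b", "$$$"};
        for (String s : samples) {
            System.out.println(s + " -> " + matchesLookahead(s) + " / " + matchesNegatedClass(s));
        }
    }
}
```

In Java, \p{Alnum} covers only ASCII letters and digits unless Pattern.UNICODE_CHARACTER_CLASS is set.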

Regular expression for a string of two parts separated by dot

I need to write a regular expression for a string made of two parts separated by '.'. The conditions are:
<<1st part>>.<<2nd part>> : Example - Time01.Sheet
The 1st part should contain alphanumeric characters and must contain at least 1 uppercase letter, 1 lowercase letter, and 1 digit.
The 2nd part should contain alphanumeric characters.
My code : ((?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=\\S+$).*)[.]([\\w]+)$
Input : Vijay.hello876IUY
Actual Output : Valid data
Expected Output : Invalid data (Because 1st part doesn’t contain any numbers)
Can anyone please help me solve this?
You may use
^(?=[^.]*[a-z])(?=[^.]*[A-Z])(?=[^.]*[0-9])[a-zA-Z0-9]+\.[a-zA-Z0-9]+$
Details
^ - start of string
(?=[^.]*[a-z]) - there must be a lowercase ASCII letter after 0+ chars other than .
(?=[^.]*[A-Z]) - there must be an uppercase ASCII letter after 0+ chars other than .
(?=[^.]*[0-9]) - there must be a digit after 0+ chars other than .
[a-zA-Z0-9]+ - 1+ alphanumeric chars
\. - a dot
[a-zA-Z0-9]+ - 1+ alphanumeric chars
$ - end of string.
In Java:
s.matches("(?=[^.]*[a-z])(?=[^.]*[A-Z])(?=[^.]*[0-9])[a-zA-Z0-9]+\\.[a-zA-Z0-9]+")
Since matches() requires a full string match, you need no ^ at the beginning and $ anchor at the end.
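
A small sketch of that call in context is shown below. The class name DotSeparatedCheck and the helper isValid are hypothetical; the first two inputs are the ones from the question, and the others are assumed examples.

```java
final class DotSeparatedCheck {
    // Same pattern as above; the [^.]* in each lookahead keeps the check inside the 1st part.
    private static final String REGEX =
            "(?=[^.]*[a-z])(?=[^.]*[A-Z])(?=[^.]*[0-9])[a-zA-Z0-9]+\\.[a-zA-Z0-9]+";

    // Hypothetical helper: String#matches requires the whole string to match.
    static boolean isValid(String s) {
        return s.matches(REGEX);
    }

    public static void main(String[] args) {
        String[] samples = {
            "Time01.Sheet",      // valid example from the question
            "Vijay.hello876IUY", // 1st part has no digit
            "time01.Sheet",      // 1st part has no uppercase letter
            "Time01."            // 2nd part is empty
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (isValid(s) ? "Valid data" : "Invalid data"));
        }
    }
}
```

The original attempt used (?=.*[0-9]), where .* can run past the dot, which is why digits in the 2nd part satisfied the check.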

Java regular expression for digits and dashes

I need a regular expression which matches lines with exactly 4 (four) hyphens and 13 digits (0-9). The order is undefined.
I have regex like:
^([0-9\u2013-]{17})$
But when I receive strings such as
----123456789---- or 1-2-3-4-5-6-7-8-9
the match succeeds, although it should fail.
Could you please explain what I need to use so that only strings like 123-345-565-45-67 or 123-1-34-5435-45- or ----1234567890123 etc. match?
Try this regex:
^(?=(?:[^-]*-){4}[^-]*$)(?=(?:\D*\d){13}\D*$).*$
Explanation:
^ - asserts the start of the line
(?=(?:[^-]*-){4}[^-]*$) - positive lookahead to make sure that there are exactly 4 occurrences of - in the string
(?=(?:\D*\d){13}\D*$) - positive lookahead to make sure that there are exactly 13 digits in the string
.* - once the above 2 lookaheads are satisfied, match 0+ occurrences of any character except a newline character
$ - asserts the end of the line
In Java, escape each \ with another \ when writing the pattern as a string literal.
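
A sketch of the pattern as a Java string literal follows. The class and method names are hypothetical. The STRICT variant is not from the answer: it is an assumption added here because the answer's pattern counts hyphens and digits but does not reject other characters.

```java
import java.util.regex.Pattern;

final class HyphenDigitCheck {
    // Pattern from the answer, with every backslash doubled for Java.
    private static final Pattern ANSWER = Pattern.compile(
            "^(?=(?:[^-]*-){4}[^-]*$)(?=(?:\\D*\\d){13}\\D*$).*$");

    // Assumed stricter variant: exactly 4 hyphens and 17 chars from [0-9-],
    // which leaves exactly 13 digits and nothing else.
    private static final Pattern STRICT = Pattern.compile(
            "^(?=(?:[^-]*-){4}[^-]*$)[0-9-]{17}$");

    static boolean matchesAnswer(String s) {
        return ANSWER.matcher(s).matches();
    }

    static boolean matchesStrict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "123-345-565-45-67", "----1234567890123", // 4 hyphens, 13 digits
            "----123456789----", "1-2-3-4-5-6-7-8-9", // 8 hyphens, 9 digits
            "----1234567890123abc"                    // extra letters (illustrative)
        };
        for (String s : samples) {
            System.out.println(s + " -> " + matchesAnswer(s) + " / " + matchesStrict(s));
        }
    }
}
```

The last sample is where the two variants differ: the answer's pattern accepts it because .* matches any characters, while the stricter variant does not.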

How to find special pattern via regex

I have the following regex, which matches AB-434 but doesn't match TEMS-54534:
([a-zA-Z][a-zA-Z0-9_]+-[1-9][0-9]*)([^.]|\.[^0-9]|\.$|$)
Here are the sample inputs:
TEMS-54534
TEMS-5453
TEMS-1233
TEMS-12
CB-213
CB-2135
CB-12
ABC-2223
ABC-223
ABC-12
You seem to be looking for a pattern that starts with 1 ASCII letter followed with 1 or more alphanumeric or underscore characters followed with a - followed with one or more digits not starting with 0.
You can use
^[a-zA-Z][a-zA-Z0-9_]+-[1-9][0-9]*$
or
^[a-zA-Z]\w+-(?!0)\d+$
Explanation:
^ - start of string
[a-zA-Z][a-zA-Z0-9_]+ / [a-zA-Z]\w+ - an ASCII letter followed with 1+ alphanumerics/underscore chars
- - a hyphen
[1-9][0-9]* / (?!0)\d+ - a digit from the 1-9 range followed with 0+ digits of any kind (you can restrict it with a {min,max} limiting quantifier if need be)
$ - end of string
More details:
[a-zA-Z0-9_] can be written as \w (if no Pattern.UNICODE_CHARACTER_CLASS is used)
In Java, do not forget to use double backslashes to escape metacharacters and shorthand character classes
If the pattern is used with String#matches(), the ^ at the start and $ at the end of the pattern are redundant.
And a Java demo:
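// Requires: import java.util.Arrays; import java.util.List;
List<String> strs = Arrays.asList("TEMS-54534", "TEMS-5453", "TEMS-1233", "TEMS-12", "CB-213",
        "CB-2135", "CB-12", "ABC-2223", "ABC-223", "ABC-12");
for (String str : strs) {
    // matches() requires a full-string match, so no ^ and $ are needed
    System.out.println(str.matches("[a-zA-Z]\\w+-(?!0)\\d+")); // prints true for every input
}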
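
The "More details" notes can also be illustrated with a short sketch. The class IssueKeyCheck and its helpers are hypothetical names, and the inputs are assumed examples. It contrasts String#matches with Matcher#find, and \w with and without Pattern.UNICODE_CHARACTER_CLASS.

```java
import java.util.regex.Pattern;

final class IssueKeyCheck {
    private static final String KEY = "[a-zA-Z]\\w+-(?!0)\\d+";

    // Full-string check: String#matches behaves as if the pattern were anchored.
    static boolean isKey(String s) {
        return s.matches(KEY);
    }

    // Substring check: Matcher#find looks for the pattern anywhere in the input.
    static boolean containsKey(String s) {
        return Pattern.compile(KEY).matcher(s).find();
    }

    // \w+ against the whole input, with or without Unicode-aware character classes.
    static boolean isWord(String s, boolean unicode) {
        int flags = unicode ? Pattern.UNICODE_CHARACTER_CLASS : 0;
        return Pattern.compile("\\w+", flags).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isKey("TEMS-12"));                // whole string is a key
        System.out.println(isKey("see TEMS-12 here"));       // extra text around the key
        System.out.println(containsKey("see TEMS-12 here")); // key found inside the text
        System.out.println(isKey("AB-0123"));                // number starts with 0
        System.out.println(isWord("\u00e9", false));         // e-acute, ASCII-only \w
        System.out.println(isWord("\u00e9", true));          // e-acute, Unicode-aware \w
    }
}
```

When a key has to be located inside longer text, find() with explicit boundaries is usually the closer fit, whereas matches() suits validating a whole field.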

How to match String only contain Alphanumeric characters, a dash and an underscore using Regex

All:
I want to use a regex to match a string that only allows [A-Za-z0-9_-], in the following format:
It starts with [A-Za-z0-9], is followed by [A-Za-z0-9_-], and ends with [A-Za-z0-9]. There may be [_-] in the middle, but each of _ and - is allowed at most once (both can be present, but each only once).
I only know how to match alphanumeric characters, a dash and an underscore, but have no idea how to limit how many times they occur.
Thanks
You can use negative lookahead:
^(?!.*(-[^-]*-|_[^_]*_))[A-Za-z0-9][\w-]*[A-Za-z0-9]$
Explanation:
^ - Line start
(?!.*(-[^-]*-|_[^_]*_)) - Negative lookahead which fails the match if there are 2 underscores or 2 hyphens ahead
[A-Za-z0-9] - Match 1 alphanumeric character
[\w-]* - Match 0 or more of [A-Za-z0-9_-] characters
[A-Za-z0-9] - Match 1 alphanumeric character
$ - Match line end
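
A minimal Java sketch of this check is shown below. The class name SingleSeparatorCheck and the helper isValid are invented for illustration, and all sample inputs are assumptions.

```java
import java.util.regex.Pattern;

final class SingleSeparatorCheck {
    // Pattern from the answer above, with backslashes doubled for Java.
    private static final Pattern NAME = Pattern.compile(
            "^(?!.*(-[^-]*-|_[^_]*_))[A-Za-z0-9][\\w-]*[A-Za-z0-9]$");

    // Hypothetical helper: true when the whole input satisfies the pattern.
    static boolean isValid(String s) {
        return NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "ab-cd_ef", // one hyphen and one underscore
            "ab",       // no separators at all
            "a-b-c",    // two hyphens
            "a_b_c",    // two underscores
            "-abc",     // starts with a separator
            "abc_",     // ends with a separator
            "a"         // single character
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because the pattern has a separate first and last alphanumeric character, it needs at least 2 characters; a one-character string such as "a" is rejected.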
