A period must not appear consecutively in a String (Java)

I have code that checks whether the user input matches a regular expression pattern. The pattern is shown below; the problem is how to check whether the character . appears consecutively.
[a-z|A-Z|0-9|[.]{1}]+#[a-z|A-Z|0-9]+
I've tried this pattern so far.
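import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Class name is a placeholder; the original declaration was not shown.
public class Main {
    public static void main(String[] args) {
        System.out.print("Enter your Email: ");
        String userInput = new Scanner(System.in).nextLine();
        // Pattern as tried in the question.
        Pattern pat = Pattern.compile("[a-z|A-Z|0-9|[.]{1}]+#[a-z|A-Z|0-9]+");
        Matcher mat = pat.matcher(userInput);
        if (mat.matches()) {
            System.out.print("Valid");
        } else {
            System.out.print("Invalid");
        }
    }
}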
If the input is een..123#asd123, I expect the output to be Invalid, but if the input is een.123#asd123, the output should be Valid.

A character class matches any one of the listed characters. A pipe | inside a character class does not mean OR; it is a literal, so the class would also match a | character.
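To see the literal pipe in action, here is a minimal sketch. The class name PipeInClassDemo and its method names are made up for illustration; it only compares a class that lists | with one that does not.

```java
import java.util.regex.Pattern;

class PipeInClassDemo {
    // Class written with pipes, as in the question: | is just another listed character.
    static final Pattern WITH_PIPE = Pattern.compile("[a-z|A-Z|0-9]+");
    // The same class without the pipes.
    static final Pattern WITHOUT_PIPE = Pattern.compile("[a-zA-Z0-9]+");

    static boolean matchesWithPipe(String s) {
        return WITH_PIPE.matcher(s).matches();
    }

    static boolean matchesWithoutPipe(String s) {
        return WITHOUT_PIPE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWithPipe("a|b"));    // true: the pipe is accepted
        System.out.println(matchesWithoutPipe("a|b")); // false
    }
}
```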
If you don't want to match consecutive dots, you could use a character class that does not contain a dot, and then use a quantifier to repeat a group that starts with a dot.
^[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*#[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*$
That will match
^ Start of string
[a-zA-Z0-9]+ Match 1+ times any character that is listed in the character class
(?:\.[a-zA-Z0-9]+)* Repeat 0+ times a group which starts with a dot and matches 1+ times what is listed in the character class to prevent consecutive dots.
# Match # char
[a-zA-Z0-9]+ Match again 1+ chars
(?:\.[a-zA-Z0-9]+)* Match again repeating group
$ End of string
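Plugged into Java, the pattern above could be used roughly like this. This is only a sketch: the class name NoConsecutiveDots and the helper isValid are hypothetical, and the test strings other than the two from the question are my own.

```java
import java.util.regex.Pattern;

class NoConsecutiveDots {
    // Pattern from the answer; the anchors are kept although matches() already anchors.
    static final Pattern VALID = Pattern.compile(
        "^[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*#[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*$");

    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"een.123#asd123", "een..123#asd123", ".een#asd", "een.#asd"};
        for (String s : samples) {
            System.out.println(s + " -> " + (isValid(s) ? "Valid" : "Invalid"));
        }
    }
}
```

Only the first sample should print Valid; the others either contain consecutive dots or start or end a part with a dot.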

If you don't want consecutive periods, use [a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*
Explanation
[a-zA-Z0-9]+ Match one or more letters or digits
(?: Start non-capturing group
\. Match exactly one period
[a-zA-Z0-9]+ Match one or more letters or digits
)* End optional repeating group
With this pattern, the value cannot start or end with a period, and cannot have consecutive periods.
Alternatively, use a negative lookahead: (?!.*[.]{2})[a-zA-Z0-9.]+#[a-zA-Z0-9]+
Explanation
(?!.*[.]{2}) Fail match if 2 consecutive periods are found
[a-zA-Z0-9.]+#[a-zA-Z0-9]+ Match normally
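A small sketch of the lookahead variant in Java, assuming matches() is used as in the question. The class name LookaheadDots and the helper isValid are hypothetical. Note that, unlike the grouped pattern, this one still lets the part before # start or end with a period.

```java
import java.util.regex.Pattern;

class LookaheadDots {
    // Negative lookahead rejects any input that contains two periods in a row.
    static final Pattern VALID = Pattern.compile(
        "(?!.*[.]{2})[a-zA-Z0-9.]+#[a-zA-Z0-9]+");

    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("een.123#asd123"));  // true
        System.out.println(isValid("een..123#asd123")); // false
        System.out.println(isValid(".een#asd"));        // true: a leading period is not excluded
    }
}
```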

Related

REGEX - Allowing multiple spaces and cannot have spaces before or after hyphen(-)

For name validation using the Regex, I need to have the below requirements.
Can include letters, spaces, apostrophe(') and hyphen(-) only
multiple spaces are allowed between the characters (example: " asd asd asd asd ")
Cannot have spaces before or after hyphen(-) (example: "abc-abc" is allowed, "abc -abc" and "abc- abc" is not allowed)
I'm currently using this Regex:
^[a-zA-Z' ]+(?:[- ][a-zA-Z']+)*$
(https://regexr.com/6umck)
This regex meets every requirement except that it allows spaces before a hyphen.
How can I disallow spaces before a hyphen while still meeting every other requirement?
You may use this regex:
^ *[a-zA-Z']+(?:(?:-| +)[a-zA-Z']+)* *$
RegEx Breakup:
^: Start
A space followed by *: Match 0 or more leading spaces
[a-zA-Z']+: Match 1+ of letter or '
(?:: Start non-capture group
(?:-| +): Match a - or 1+ spaces
[a-zA-Z']+: Match 1+ of letter or '
)*: End non-capture group. Repeat this group 0 or more times
A space followed by *: Match 0 or more trailing spaces
$: End
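As a quick check of the breakup above, here is a sketch in Java. The question does not say which language is used, so Java is an assumption here, and the class name NameCheckGrouped and helper isValid are made up.

```java
import java.util.regex.Pattern;

class NameCheckGrouped {
    // Letters/apostrophes separated by either one hyphen or one or more spaces.
    static final Pattern NAME = Pattern.compile(
        "^ *[a-zA-Z']+(?:(?:-| +)[a-zA-Z']+)* *$");

    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc-abc"));        // true
        System.out.println(isValid("abc -abc"));       // false: space before hyphen
        System.out.println(isValid("abc- abc"));       // false: space after hyphen
        System.out.println(isValid(" asd  asd asd ")); // true: multiple spaces are fine
    }
}
```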
You can use a negative lookahead to reject a space followed by a hyphen, or a hyphen followed by a space.
Note that your pattern can also match just a space as the space is in the character class.
^(?!.*(?: -|- ))[a-zA-Z' ]+(?:[- ][a-zA-Z']+)*$
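A sketch of the lookahead version, again assuming Java (the question does not name a language); the class name NameCheckLookahead and helper isValid are hypothetical. It also shows the point about a lone space still being accepted.

```java
import java.util.regex.Pattern;

class NameCheckLookahead {
    // The lookahead fails the match if " -" or "- " occurs anywhere in the input.
    static final Pattern NAME = Pattern.compile(
        "^(?!.*(?: -|- ))[a-zA-Z' ]+(?:[- ][a-zA-Z']+)*$");

    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc-abc"));  // true
        System.out.println(isValid("abc -abc")); // false
        System.out.println(isValid("abc- abc")); // false
        System.out.println(isValid(" "));        // true: the space is in the character class
    }
}
```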

How would I match regex with a String of unknown length, but in a specific pattern?

I'm having trouble figuring out what's wrong with my regex for a string that can be any length but always follows the pattern below
"(A,B)" = valid
"(A,B) (B,C)" = valid
Basically: an opening parenthesis "(", an A-Z letter, a comma, an A-Z letter, a closing parenthesis ")", then a whitespace if and only if the sequence begins again, otherwise no whitespace.
So far I've gotten to "^\\([A-Z]?,[A-Z]?\\)*.\\([A-Z]?,[A-Z]?\\).*$" but I'm not sure what about my regex is not working right.
You can use the regex,
^(?:\s?\([A-Z],[A-Z]\)(?:\s\([A-Z],[A-Z]\))?)*$
Description:
^: Asserts position at start of a line
(?:: Start of non-capturing group
\s?: Optional whitespace character
\(: The character (
[A-Z]: A character from A to Z
,: The character ,
[A-Z]: A character from A to Z
\): The character )
(?:: Start of non-capturing group
\s\([A-Z],[A-Z]\): Pattern already explained above
): End of non-capturing group
?: Makes the last non-capturing group optional
): End of non-capturing group
*: Matches the previous token between zero and unlimited times
$: Asserts position at the end of a line
A better solution (courtesy: The fourth bird):
You can also optionally repeat n groups after the first match i.e.
^\([A-Z],[A-Z]\)(?: \([A-Z],[A-Z]\))*$
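To compare the two patterns, here is a rough Java sketch. The class name PairSequence and the method names are hypothetical, and the extra test strings are mine. As far as I can tell, the first pattern also accepts an empty string and pairs without a separating space, which the second one rejects.

```java
import java.util.regex.Pattern;

class PairSequence {
    // First pattern from the answer: optional whitespace inside a repeated group.
    static final Pattern FIRST = Pattern.compile(
        "^(?:\\s?\\([A-Z],[A-Z]\\)(?:\\s\\([A-Z],[A-Z]\\))?)*$");
    // Second pattern: one pair, then zero or more " (X,Y)" repetitions.
    static final Pattern STRICT = Pattern.compile(
        "^\\([A-Z],[A-Z]\\)(?: \\([A-Z],[A-Z]\\))*$");

    static boolean matchesFirst(String s) {
        return FIRST.matcher(s).matches();
    }

    static boolean matchesStrict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesStrict("(A,B)"));       // true
        System.out.println(matchesStrict("(A,B) (B,C)")); // true
        System.out.println(matchesStrict("(A,B)(B,C)"));  // false: missing space
        System.out.println(matchesFirst("(A,B)(B,C)"));   // true: the looser pattern lets it through
    }
}
```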
Try this one..
([(]([A-Z])\,([A-Z])[)])(\s[(]([A-Z])\,([A-Z])[)])*
final String regex = "([(]([A-Z])\\,([A-Z])[)])(\\s[(]([A-Z])\\,([A-Z])[)])*";

Java regex: find sequence of letter-digit combinations, allowing certain symbols

I am trying to arrive at a regex to detect tokens from a sentence. These tokens should be a combination of letters and digits (mandatory), with optional chars like , or .
Given the sentence:
M5 x 35mm Full Thread Hexagon Bolts (DIN 933) - PEEK DescriptionThe M5 x 0.035mm, and 6NB7 plus a Go9IuN.
It should find six tokens:
M5, 35mm, M5, 0.035mm, 6NB7, Go9IuN
I have tried the following which does not work:
Pattern alphanum=Pattern.compile("\\b(([A-Za-z].*[0-9])|([0-9].*[A-Za-z]))\\b");
Any suggestions please?
Thanks
You could use a positive lookahead to assert at least 1 digit and then match at least 1 char from a-zA-Z.
The .* part in your pattern overmatches, because it matches any char except a newline 0+ times.
\b(?=[a-zA-Z0-9.,]*[0-9])[a-zA-Z0-9.,]*[a-zA-Z][a-zA-Z0-9.,]*\b
Explanation
\b Word boundary
(?=[a-zA-Z0-9.,]*[0-9]) Assert at least 1 digit
[a-zA-Z0-9.,]*[a-zA-Z][a-zA-Z0-9.,]* Match at least 1 char a-zA-Z
\b Word boundary
In Java
final String regex = "\\b(?=[a-zA-Z0-9.,]*[0-9])[a-zA-Z0-9.,]*[a-zA-Z][a-zA-Z0-9.,]*\\b";
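A minimal sketch of collecting the tokens with Matcher.find(), using the sentence from the question. The class name TokenFinder and the helper findTokens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenFinder {
    // Lookahead asserts a digit in the run; the body requires at least one letter.
    static final Pattern TOKEN = Pattern.compile(
        "\\b(?=[a-zA-Z0-9.,]*[0-9])[a-zA-Z0-9.,]*[a-zA-Z][a-zA-Z0-9.,]*\\b");

    static List<String> findTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sentence = "M5 x 35mm Full Thread Hexagon Bolts (DIN 933) - PEEK "
            + "DescriptionThe M5 x 0.035mm, and 6NB7 plus a Go9IuN.";
        // prints [M5, 35mm, M5, 0.035mm, 6NB7, Go9IuN]
        System.out.println(findTokens(sentence));
    }
}
```

The trailing \b is what drops the comma after 0.035mm and the period after Go9IuN.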
Perhaps the following regex will do the job
(?=[A-Za-z,.]*\d)(?=[\d,.]*[A-Za-z])[A-Za-z\d,.]{2,}(?<![,.])
It starts with two positive lookaheads which together form an AND condition.
The first lookahead (?=[A-Za-z,.]*\d) checks if a token contains at least one digit.
The second lookahead (?=[\d,.]*[A-Za-z]) checks if it contains at least one letter.
The actual match [A-Za-z\d,.]{2,} consumes at least two characters, each a letter, a digit, a comma, or a period.
Finally, the negative lookbehind (?<![,.]) checks that the match does not end with a comma or a period.
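The same idea as a Java sketch, run against the sentence from the question. The class name TwoLookaheadTokens and the helper findTokens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoLookaheadTokens {
    // Two lookaheads (needs a digit AND a letter), then a lookbehind to trim , or .
    static final Pattern TOKEN = Pattern.compile(
        "(?=[A-Za-z,.]*\\d)(?=[\\d,.]*[A-Za-z])[A-Za-z\\d,.]{2,}(?<![,.])");

    static List<String> findTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sentence = "M5 x 35mm Full Thread Hexagon Bolts (DIN 933) - PEEK "
            + "DescriptionThe M5 x 0.035mm, and 6NB7 plus a Go9IuN.";
        System.out.println(findTokens(sentence));
    }
}
```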

How to restrict occurrence of a character in regex?

I want to check if a string consists of letters and digits only, and allow a - separator:
^[\w\d-]*$
Valid: TEST-TEST123
Now I want to check that the separator occurs only once at a time. Thus the following examples should be invalid:
Invalid: TEST--TEST, TEST------TEST, TEST-TEST--TEST.
Question: how can I restrict the repeated occurrence of a character?
You may use
^(?:[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)?$
Or, in Java, you may use an alphanumeric \p{Alnum} character class to denote letters and digits:
^(?:\p{Alnum}+(?:-\p{Alnum}+)*)?$
Details
^ - start of the string
(?: - start of an optional non-capturing group (it will ensure the pattern matches an empty string, if you do not need it, remove this group!)
\p{Alnum}+ - 1 or more letters or digits
(?:-\p{Alnum}+)* - zero or more repetitions of
- - a hyphen
\p{Alnum}+ - 1 or more letters or digits
)? - end of the optional non-capturing group
$ - end of string.
In code, you do not need the ^ and $ anchors if you use the pattern in the matches method since it anchors the match by default:
boolean valid = s.matches("(?:\\p{Alnum}+(?:-\\p{Alnum}+)*)?");
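Wrapped in a small helper, the check could look like the sketch below, tried against the examples from the question. The class name HyphenSeparated and the method isValid are made up for illustration.

```java
class HyphenSeparated {
    // String.matches() anchors at both ends, so ^ and $ are not needed.
    static boolean isValid(String s) {
        return s.matches("(?:\\p{Alnum}+(?:-\\p{Alnum}+)*)?");
    }

    public static void main(String[] args) {
        System.out.println(isValid("TEST-TEST123"));    // true
        System.out.println(isValid("TEST--TEST"));      // false
        System.out.println(isValid("TEST-TEST--TEST")); // false
        System.out.println(isValid(""));                // true: the outer group is optional
    }
}
```

If the empty string should be rejected, drop the outer optional group as noted in the details above.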

How to match String only contain Alphanumeric characters, a dash and an underscore using Regex

All:
What I want to do is use a regex to match a string which only allows [A-Za-z0-9_-], and the format should be:
It starts with a single [A-Za-z0-9], followed by [A-Za-z0-9_-], and ends with [A-Za-z0-9]. There could be [_-] in the middle, but if there is any, each is allowed only once (both _ and - can appear, but at most once each).
I only know how to match alphanumeric characters, a dash and an underscore, but have no idea how to limit how many times they occur.
Thanks
You can use negative lookahead:
^(?!.*(-[^-]*-|_[^_]*_))[A-Za-z0-9][\w-]*[A-Za-z0-9]$
Explanation:
^ - Line start
(?!.*(-[^-]*-|_[^_]*_)) - Negative lookahead which means fail the match if there are 2 underscore or 2 hyphens ahead
[A-Za-z0-9] - Match 1 alphanumeric character
[\w-]* - Match 0 or more of [A-Za-z0-9_-] characters
[A-Za-z0-9] - Match 1 final alphanumeric character
$ - Match line end
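A short sketch of this pattern in Java. The question does not name a language, so Java is an assumption, and the class name SingleDashUnderscore and helper isValid are hypothetical. Because the pattern needs a first and a last alphanumeric character, the shortest accepted input is two characters long.

```java
import java.util.regex.Pattern;

class SingleDashUnderscore {
    // Lookahead rejects a second hyphen or a second underscore anywhere in the input.
    static final Pattern VALID = Pattern.compile(
        "^(?!.*(-[^-]*-|_[^_]*_))[A-Za-z0-9][\\w-]*[A-Za-z0-9]$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc-def_ghi")); // true: one hyphen, one underscore
        System.out.println(isValid("a-b-c"));       // false: two hyphens
        System.out.println(isValid("a_b_c"));       // false: two underscores
        System.out.println(isValid("abc_"));        // false: must end alphanumeric
    }
}
```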
