This task is from CodingBat (https://codingbat.com/prob/p194491):
Returns true if for every '*' (star) in the string, if there are chars both immediately before and after the star, they are the same.
sameStarChar("xy*yzz") → true
sameStarChar("xy*zzz") → false
sameStarChar("*xa*az") → true
How can I fix my solution in order to solve this task using regular expressions?
My attempt:
public boolean sameStarChar(String str) {
return str.matches(".*([*])*.*");
}
My solution is correct for more than half the tests, but not for all.
It can be done, but it is quite complicated. Your regexp just scans for 'anything, then any number of stars, then anything', which matches everything. Your code returns true for literally every string imaginable.
Your approach is problematic. Trying to positively match is quite complicated. The question is asking the negative: "This construct is invalid; any string that doesn't have an invalid construct is valid" is what it boils down to, with the invalid construct being: "A*B", where A and B are not identical. After all, *a is valid (example 3). Presumably *** is also valid (the first and last star aren't a problem because they don't have characters on both sides, the middle one is fine because the character on each side is identical).
Thus, what you want is to write a regexp that finds the invalid construct, and return the inverse.
To find the invalid construct you need something called a backreference: you want to search for a thing and then refer to it.
".\*." - we start here: a character, a star, and a character. But now we need the second character (the second .) to actually be something OTHER than the first character. After all, ".\*." would also match "A*A", and that is a valid construct, so we don't want it to match.
Enter backrefs. () in regexpese makes a 'group' - a thing you can refer to later. \1 is a backref - it's "whatever matched for the first set of parentheses".
But we need more - we need to negate: Match if NOT this. Within chargroups there's ^ - [^fo] means: "Any character that is NOT an 'f' or an 'o'". But a backref isn't a chargroup.
As far as I can tell (another SO question backs me up on this), the only way is negative lookahead. Lookahead means you don't actually consume characters; you merely check whether they WOULD match. With negative lookahead, if they would, the match fails. It's... complicated. Search the web for tutorials that explain 'positive lookahead' and 'negative lookahead'.
Thus:
Pattern p = Pattern.compile("(.)\\*(?!\\1|$)");
return !p.matcher(str).find();
All sorts of things going on here:
(?!X) is negative lookahead.
\1|$ means: Either 'group 1' or 'end of string'. Given that our input contains X*, the next thing after that star must either be an X or the end of the string - if it is anything else, we should return false.
We don't want to match the entire string. We just want to ask: Is the 'invalid construct' anywhere in this string? - hence, find(), not matches().
To be clear, using regexp for this is probably a bad idea. Sure, the code will be extremely short, but it's not exactly readable, is it.
Without regexps, it becomes much easier to follow:
for (int i = 1; i < str.length() -1; i++) {
if (str.charAt(i) != '*') continue;
if (str.charAt(i - 1) != str.charAt(i + 1)) return false;
}
return true;
I'd strongly prefer the above over a regexp that doesn't readily show what it is actually accomplishing, and this regexp certainly doesn't make it remotely feasible to understand what it does just by looking at it.
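As a minimal sketch, the regex version and the loop version above can be put side by side and run against the examples from the question. The class name StarCheck and the two method names are made up for illustration; they are not part of the CodingBat task.

```java
import java.util.regex.Pattern;

final class StarCheck {
    // The invalid construct: a char, a star, then something that is
    // neither that same char nor the end of the string.
    private static final Pattern INVALID = Pattern.compile("(.)\\*(?!\\1|$)");

    // Regex version: valid if the invalid construct is found nowhere.
    static boolean sameStarCharRegex(String str) {
        return !INVALID.matcher(str).find();
    }

    // Loop version: only stars with a char on both sides are checked.
    static boolean sameStarCharLoop(String str) {
        for (int i = 1; i < str.length() - 1; i++) {
            if (str.charAt(i) != '*') continue;
            if (str.charAt(i - 1) != str.charAt(i + 1)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"xy*yzz", "xy*zzz", "*xa*az", "***"}) {
            System.out.println(s + " -> regex: " + sameStarCharRegex(s)
                    + ", loop: " + sameStarCharLoop(s));
        }
    }
}
```

Both methods should agree on every input without line breaks; the regex version differs only in that . does not match line terminators.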
You can use
public boolean sameStarChar(String str) {
return str.matches("^(?!.*(.)\\*(?!\\1|$)).*");
}
Details:
^ - start of string
(?!.*(.)\*(?!\1|$)) - a negative lookahead that fails the match if there are
.* - any zero or more chars other than line break chars as many as possible
(.) - Group 1 (\1): any one char other than line break chars
\* - an asterisk
(?!\1|$) - not immediately followed with the same value as in Group 1 or end of string
.* - the rest of the string is consumed (as .matches requires a full string match).
I am trying to write a regex which should return true if a character from [A-Za-z] occurs between 1 and 3 times, but I am not able to do this.
public static void main(String[] args) {
String regex = "(?:([A-Za-z]*){3}).*";
String regex1 = "(?=((([A-Za-z]){1}){1,3})).*";
Pattern pattern = Pattern.compile(regex);
System.out.println(pattern.matcher("AD1CDD").find());
}
Note: for consecutive 3 characters I am able to write it, but what I want to achieve is the occurrence should be between 1 and 3 only for the entire string. If there are 4 characters, it should return false. I have used look-ahead to achieve this
If I understand your question correctly, you want to check if
1 to 3 characters of the range [a-zA-Z] are in the string
Any other character can occur arbitrarily often?
First of all, just counting the characters without a regular expression is more efficient, because this is a trivial counting problem rather than a pattern-matching one. There is nothing wrong with using a for loop for this problem (except that interpreted languages such as Python and R can be fairly slow at explicit loops).
Nevertheless, you can (ab-) use extended regular expressions:
^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$
This is fairly straightforward, once you also model the "other" characters. And that is what you should do to define a pattern: model all accepted strings (i.e. the entire "language"), not only those characters you want to find.
Alternatively, you can "findAll" matches of ([A-Za-z]), and look at the length of the result. This may be more convenient if you also need the actual characters.
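A rough Java sketch of both regex options mentioned above: the whole-string pattern, and counting individual matches of [A-Za-z]. Java has no built-in "findAll", so the sketch assumes a Matcher.find() loop as the equivalent; the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LetterCount {
    // Models the entire accepted string: one to three letters,
    // each optionally preceded by non-letters, then trailing non-letters.
    private static final Pattern ONE_TO_THREE =
            Pattern.compile("^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$");

    // Matches a single letter; used for counting occurrences.
    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");

    static boolean viaWholeStringRegex(String s) {
        return ONE_TO_THREE.matcher(s).matches();
    }

    static boolean viaCounting(String s) {
        Matcher m = LETTER.matcher(s);
        int count = 0;
        while (m.find()) {
            count++; // m.group() would give the actual letter if it is needed
        }
        return count >= 1 && count <= 3;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"AD1CDD", "A1B2C3", "123", "1a"}) {
            System.out.println(s + " -> " + viaWholeStringRegex(s) + " / " + viaCounting(s));
        }
    }
}
```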
The for loop would look something like this:
public static boolean containsOneToThreeAlphabetic(String str) {
int matched = 0;
for(int i=0; i<str.length(); i++) {
char c = str.charAt(i);
if ((c>='A' && c<='Z') || (c>='a' && c<='z')) matched++;
}
return matched >=1 && matched <= 3;
}
This is straightforward, readable, extensible, and efficient (in compiled languages). You can also add an if (matched>=4) return false; (or break) inside the loop to stop early.
Please, stop playing with regex, you'll complicate not only your own life, but the life of the people, who have to handle your code in the future. Choose a simpler approach, find all [A-Za-z]+ strings, put them into the list, then check every string, if the length is within 1 and 3 or beyond that.
Regex
/([A-Za-z])(?=(?:.*\1){3})/s
Looking for a char and for 3 repetitions of it. So if it matches there are 4 or more equal chars present.
I need a regular expression to evaluate if the first character of a word is a lowercase letter or not.
I have this java code: Character.toString(charcter).matches("[a-z?]")
For example if I have those words the result would be:
a13 => true
B54 => false
&32 => false
I want to match only one letter and I don't know if I need to use "?", "." or "{1}" after or inside "[a-z]"
There is a built in way to do this without regexes.
Character.isLowerCase(string.charAt(0))
Please use this for your needs: /^[a-z]/
You want to match if there's exactly one lowercase letter. As @Luiggi Medonza stated, you really do not need regular expressions for this, but if you want to use them, you most likely want this pattern:
[a-z]{1}
What ? does is make the match optional. You want a strict match of length 1, so you need {1}.
@Ted Hopp mentioned that you don't need the {1}. Your entire match should look like this:
entire_string.matches("^[a-z].*$")
Again, the built-in string methods will be faster and simpler to use.
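A short sketch comparing the built-in check with the regex check from the answers above. The class and method names are made up for illustration. One difference worth knowing: Character.isLowerCase also accepts non-ASCII lowercase letters, while [a-z] does not.

```java
final class FirstChar {
    // Built-in check; the isEmpty() guard avoids an exception on "".
    static boolean startsWithLowercase(String word) {
        return !word.isEmpty() && Character.isLowerCase(word.charAt(0));
    }

    // Regex check; String.matches must match the whole string,
    // so the rest of the word is consumed by ".*".
    static boolean startsWithLowercaseRegex(String word) {
        return word.matches("[a-z].*");
    }

    public static void main(String[] args) {
        for (String w : new String[] {"a13", "B54", "&32", ""}) {
            System.out.println("'" + w + "' -> " + startsWithLowercase(w)
                    + " / " + startsWithLowercaseRegex(w));
        }
    }
}
```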
I had a similar requirement: the first character of the string should be a letter (a-z or A-Z), and after that the user can type anything, such as digits or a limited set of symbols.
Solution
public static boolean designationValidate(String n) {
int l = n.length();
if (l >= 4) {
Pattern pattern = Pattern.compile("^[a-zA-Z][a-zA-Z0-9-() ]*$");
Matcher matcher = pattern.matcher(n);
return (matcher.find() && matcher.group().equals(n));
} else
return false;
}
In the example above I validate that the string is at least 4 characters long and starts with a letter. If you want to allow any other symbols, add them to the character class.
The method returns true if the expression matches, otherwise false.
I hope this is helpful.
I have a textbox where I get the last name of a user. How do I allow only one dash (-) in a regular expression? And it's not supposed to be in the beginning or at the end of the string.
I have this code:
Pattern p = Pattern.compile("[^a-z-']", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(name);
Try to rephrase the question in more regexy terms. Rather than "allow only one dash, and it can't be at the beginning" you could say, "the string's beginning, followed by at least one non-dash, followed by one dash, followed by at least one non-dash, followed by the string's end."
the string's beginning: ^
at least one non-dash: [^-]+
followed by one dash: -
followed by at least one non-dash: [^-]+
followed by the string's end: $
Put those all together, and there you go. If you're using this in a context that matches against the complete string (not just any substring within it), you don't need the anchors -- though it may be good to put them in anyway, in case you later use that regex in a substring-matching context and forget to add them back in.
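Putting the pieces above together in Java might look like the sketch below. The class and method names are hypothetical. Note that this pattern requires exactly one dash, so a last name without any dash is rejected.

```java
import java.util.regex.Pattern;

final class LastNameDash {
    // ^ start, one or more non-dashes, a single dash, one or more non-dashes, $ end
    private static final Pattern ONE_INNER_DASH = Pattern.compile("^[^-]+-[^-]+$");

    static boolean hasSingleInnerDash(String name) {
        return ONE_INNER_DASH.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String n : new String[] {"Smith-Jones", "-Smith", "Smith-", "A-B-C", "Smith"}) {
            System.out.println(n + " -> " + hasSingleInnerDash(n));
        }
    }
}
```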
Why not just use indexOf() in String?
String s = "last-name";
int first = s.indexOf('-');
int last = s.lastIndexOf('-');
if(first == 0 || last == s.length()-1) // Checks if a dash is at the beginning or end
System.out.println("BAD");
if(first != last) // Checks if there is more than one dash
System.out.println("BAD");
Any speed difference compared with a regex should not be noticeable in the least for strings as short as last names. Also, it will make debugging and future maintenance MUCH easier.
It looks like your regex represents a fragment of an invalid value, and you're presumably using Matcher.find() to find if any part of your value matches that regex. Is that correct? If so, you can change your pattern to:
Pattern p = Pattern.compile("[^a-zA-Z'-]|-.*-|^-|-$");
which will match a non-letter-non-hyphen-non-apostrophe character, or a sequence of characters that both starts and ends with hyphens (thereby detecting a value that contains two hyphens), or a leading hyphen, or a trailing hyphen.
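A hedged sketch of how this "find the invalid fragment" pattern could be used: the value is valid when find() reports no invalid fragment. The class and method names are invented for the example. Unlike a pattern that demands one dash, this one also accepts names with no dash at all.

```java
import java.util.regex.Pattern;

final class LastNameInvalid {
    // Any one of: a disallowed character, two hyphens anywhere,
    // a leading hyphen, or a trailing hyphen.
    private static final Pattern INVALID =
            Pattern.compile("[^a-zA-Z'-]|-.*-|^-|-$");

    static boolean isValid(String name) {
        return !INVALID.matcher(name).find();
    }

    public static void main(String[] args) {
        for (String n : new String[] {"O'Brien-Smith", "Smith", "-Smith", "Smith-", "A-B-C", "Sm1th"}) {
            System.out.println(n + " -> " + isValid(n));
        }
    }
}
```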
This regex represents one or more non-hyphens, followed by a single hyphen, followed by one or more non-hyphens.
^[^\-]+\-[^\-]+$
I'm not sure if the hyphen in the middle needs to be escaped with a backslash... That probably depends on what platform you're using for regex.
Try pattern something like [a-z]-[a-z].
Pattern p = Pattern.compile("[a-z]-[a-z]");
I'm creating a regexp for password validation to be used in a Java application as a configuration parameter.
The regexp is:
^.*(?=.{8,})(?=..*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$
The password policy is:
At least 8 chars
Contains at least one digit
Contains at least one lower alpha char and one upper alpha char
Contains at least one char within a set of special chars (@#%$^ etc.)
Does not contain space, tab, etc.
I’m missing just point 5. I'm not able to have the regexp check for space, tab, carriage return, etc.
Could anyone help me?
Try this:
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\S+$).{8,}$
Explanation:
^ # start-of-string
(?=.*[0-9]) # a digit must occur at least once
(?=.*[a-z]) # a lower case letter must occur at least once
(?=.*[A-Z]) # an upper case letter must occur at least once
(?=.*[@#$%^&+=]) # a special character must occur at least once
(?=\S+$) # no whitespace allowed in the entire string
.{8,} # anything, at least eight places though
$ # end-of-string
It's easy to add, modify or remove individual rules, since every rule is an independent "module".
The (?=.*[xyz]) construct eats the entire string (.*) and then backtracks until [xyz] can match, which is at the last occurrence in the string. It succeeds if [xyz] is found; it fails otherwise.
The alternative would be using a reluctant quantifier: (?=.*?[xyz]). For a password check, this will hardly make any difference; for much longer strings it could be the more efficient variant.
The most efficient variant (but hardest to read and maintain, therefore the most error-prone) would be (?=[^xyz]*[xyz]), of course. For a regex of this length and for this purpose, I would advise against doing it that way, as it has no real benefits.
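To make the three lookahead variants concrete, here is a small sketch that applies each of them to the digit rule only, combined with the minimum length. The class and field names are made up for the example; the sketch only shows that the variants accept and reject the same single-line inputs, it does not measure performance.

```java
import java.util.regex.Pattern;

final class LookaheadVariants {
    // Greedy: .* runs to the end, then backtracks until a digit matches.
    static final Pattern GREEDY = Pattern.compile("^(?=.*[0-9]).{8,}$");
    // Reluctant: .*? grows one char at a time until a digit matches.
    static final Pattern RELUCTANT = Pattern.compile("^(?=.*?[0-9]).{8,}$");
    // Negated class: skips non-digits directly, no backtracking needed.
    static final Pattern NEGATED = Pattern.compile("^(?=[^0-9]*[0-9]).{8,}$");

    static boolean allAgree(String s) {
        boolean g = GREEDY.matcher(s).matches();
        return g == RELUCTANT.matcher(s).matches()
                && g == NEGATED.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abcdefg1", "1abcdefg", "abcdefgh", "a1"}) {
            System.out.println(s + " -> " + GREEDY.matcher(s).matches()
                    + ", variants agree: " + allAgree(s));
        }
    }
}
```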
simple example using regex
public class passwordvalidation {
public static void main(String[] args) {
String passwd = "aaZZa44#";
String pattern = "(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{8,}";
System.out.println(passwd.matches(pattern));
}
}
Explanations:
(?=.*[0-9]) a digit must occur at least once
(?=.*[a-z]) a lower case letter must occur at least once
(?=.*[A-Z]) an upper case letter must occur at least once
(?=.*[@#$%^&+=]) a special character must occur at least once
(?=\\S+$) no whitespace allowed in the entire string
.{8,} at least 8 characters
All the previously given answers use the same (correct) technique to use a separate lookahead for each requirement. But they contain a couple of inefficiencies and a potentially massive bug, depending on the back end that will actually use the password.
I'll start with the regex from the accepted answer:
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\S+$).{8,}$
First of all, since Java supports \A and \z I prefer to use those to make sure the entire string is validated, independently of Pattern.MULTILINE. This doesn't affect performance, but avoids mistakes when regexes are recycled.
\A(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\S+$).{8,}\z
Checking that the password does not contain whitespace and checking its minimum length can be done in a single pass by putting the variable quantifier {8,} on the shorthand \S, which limits the allowed characters:
\A(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])\S{8,}\z
If the provided password does contain a space, all the checks will be done, only to have the final check fail on the space. This can be avoided by replacing all the dots with \S:
\A(?=\S*[0-9])(?=\S*[a-z])(?=\S*[A-Z])(?=\S*[@#$%^&+=])\S{8,}\z
The dot should only be used if you really want to allow any character. Otherwise, use a (negated) character class to limit your regex to only those characters that are really permitted. Though it makes little difference in this case, not using the dot when something else is more appropriate is a very good habit. I see far too many cases of catastrophic backtracking because the developer was too lazy to use something more appropriate than the dot.
Since there's a good chance the initial tests will find an appropriate character in the first half of the password, a lazy quantifier can be more efficient:
\A(?=\S*?[0-9])(?=\S*?[a-z])(?=\S*?[A-Z])(?=\S*?[@#$%^&+=])\S{8,}\z
But now for the really important issue: none of the answers mentions the fact that the original question seems to be written by somebody who thinks in ASCII. But in Java, strings are Unicode. Are non-ASCII characters allowed in passwords? If they are, are only ASCII spaces disallowed, or should all Unicode whitespace be excluded?
By default \s matches only ASCII whitespace, so its inverse \S matches all Unicode characters (whitespace or not) and all non-whitespace ASCII characters. If Unicode characters are allowed but Unicode spaces are not, the UNICODE_CHARACTER_CLASS flag can be specified to make \S exclude Unicode whitespace. If Unicode characters are not allowed, then [\x21-\x7E] can be used instead of \S to match all ASCII characters that are not a space or a control character.
Which brings us to the next potential issue: do we want to allow control characters? The first step in writing a proper regex is to exactly specify what you want to match and what you don't. The only 100% technically correct answer is that the password specification in the question is ambiguous because it does not state whether certain ranges of characters like control characters or non-ASCII characters are permitted or not.
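The \S behaviour described above can be seen in a small sketch, assuming Java 7 or later (where Pattern.UNICODE_CHARACTER_CLASS is available). The class name, field names and sample strings are made up for illustration; U+2003 is the Unicode EM SPACE and U+00E4 is a non-ASCII letter.

```java
import java.util.regex.Pattern;

final class UnicodeWhitespace {
    // Default: \s covers only ASCII whitespace, so \S accepts U+2003.
    static final Pattern DEFAULT_NON_SPACE = Pattern.compile("\\S{8,}");
    // With the flag, \s follows the Unicode White_Space property.
    static final Pattern UNICODE_NON_SPACE =
            Pattern.compile("\\S{8,}", Pattern.UNICODE_CHARACTER_CLASS);
    // ASCII only: printable characters without space or control characters.
    static final Pattern PRINTABLE_ASCII = Pattern.compile("[\\x21-\\x7E]{8,}");

    public static void main(String[] args) {
        String withEmSpace = "pass\u2003word";
        String withUmlaut = "p\u00e4ssword";
        for (String s : new String[] {"password", withEmSpace, withUmlaut}) {
            System.out.println(DEFAULT_NON_SPACE.matcher(s).matches() + " "
                    + UNICODE_NON_SPACE.matcher(s).matches() + " "
                    + PRINTABLE_ASCII.matcher(s).matches());
        }
    }
}
```

Which of the three is appropriate depends on the answers to the questions above about non-ASCII and control characters.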
You should not use overly complex Regex (if you can avoid them) because they are
hard to read (at least for everyone but yourself)
hard to extend
hard to debug
Although there might be a small performance overhead in using many small regular expressions, the points above easily outweigh it.
I would implement like this:
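// Java version; requires: import java.util.regex.Pattern;
static boolean matchesPolicy(String pwd) {
    if (pwd.length() < 8) return false;                               // too short
    if (!Pattern.compile("[0-9]").matcher(pwd).find()) return false;  // no digit
    if (!Pattern.compile("[a-z]").matcher(pwd).find()) return false;  // no lower-case letter
    if (!Pattern.compile("[A-Z]").matcher(pwd).find()) return false;  // no upper-case letter
    if (!Pattern.compile("[%#$^]").matcher(pwd).find()) return false; // no special character
    if (Pattern.compile("\\s").matcher(pwd).find()) return false;     // contains whitespace
    return true;
}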
Thanks for all the answers. This is based on all of them, but extends the set of special characters:
@SuppressWarnings({"regexp", "RegExpUnexpectedAnchor", "RegExpRedundantEscape"})
String PASSWORD_SPECIAL_CHARS = "@#$%^`<>&+=\"!ºª·#~%&'¿¡€,:;*/+-.=_\\[\\]\\(\\)\\|\\_\\?\\\\";
int PASSWORD_MIN_SIZE = 8;
String PASSWORD_REGEXP = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[" + PASSWORD_SPECIAL_CHARS + "])(?=\\S+$).{"+PASSWORD_MIN_SIZE+",}$";
Unit tested:
Password Requirement :
Password should be at least eight (8) characters in length where the system can support it.
Passwords must include characters from at least two (2) of these groupings: alpha, numeric, and special characters.
^.*(?=.{8,})(?=.*\d)(?=.*[a-zA-Z])|(?=.{8,})(?=.*\d)(?=.*[!@#$%^&])|(?=.{8,})(?=.*[a-zA-Z])(?=.*[!@#$%^&]).*$
I tested it and it works
For anyone interested in minimum requirements for each type of character, I would suggest making the following extension over Tomalak's accepted answer:
^(?=(.*[0-9]){%d,})(?=(.*[a-z]){%d,})(?=(.*[A-Z]){%d,})(?=(.*[^0-9a-zA-Z]){%d,})(?=\S+$).{%d,}$
Notice that this is a formatting string and not the final regex pattern. Just substitute %d with the minimum required occurrences for: digits, lowercase, uppercase, non-digit/character, and entire password (respectively). Maximum occurrences are unlikely (unless you want a max of 0, effectively rejecting any such characters) but those could be easily added as well. Notice the extra grouping around each type so that the min/max constraints allow for non-consecutive matches. This worked wonders for a system where we could centrally configure how many of each type of character we required and then have the website as well as two different mobile platforms fetch that information in order to construct the regex pattern based on the above formatting string.
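A minimal sketch of how the formatting string above could be turned into a pattern in Java with String.format. The class name, method name and parameter names are hypothetical, and the minimum counts passed in main are arbitrary examples, not values from the answer.

```java
import java.util.regex.Pattern;

final class PasswordPatternBuilder {
    // The formatting string from above, escaped for a Java string literal.
    static final String TEMPLATE =
            "^(?=(.*[0-9]){%d,})(?=(.*[a-z]){%d,})(?=(.*[A-Z]){%d,})"
            + "(?=(.*[^0-9a-zA-Z]){%d,})(?=\\S+$).{%d,}$";

    // Order of arguments: digits, lowercase, uppercase,
    // non-alphanumeric, total length.
    static Pattern build(int digits, int lower, int upper, int special, int length) {
        return Pattern.compile(
                String.format(TEMPLATE, digits, lower, upper, special, length));
    }

    public static void main(String[] args) {
        Pattern oneOfEach = build(1, 1, 1, 1, 8);
        Pattern twoDigits = build(2, 1, 1, 1, 8);
        for (String pw : new String[] {"aB3$efgh", "aB3$efg4"}) {
            System.out.println(pw + " -> " + oneOfEach.matcher(pw).matches()
                    + " / " + twoDigits.matcher(pw).matches());
        }
    }
}
```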
This one checks for every special character :
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=\S+$).*[A-Za-z0-9].{8,}$
Java Method ready for you, with parameters
Just copy and paste and set your desired parameters.
If you don't want a module, just comment it or add an "if" as done by me for special char
//______________________________________________________________________________
/**
* Validation Password */
//______________________________________________________________________________
private static boolean validation_Password(final String PASSWORD_Arg) {
boolean result = false;
try {
if (PASSWORD_Arg!=null) {
//_________________________
//Parameteres
final String MIN_LENGHT="8";
final String MAX_LENGHT="20";
final boolean SPECIAL_CHAR_NEEDED=true;
//_________________________
//Modules
final String ONE_DIGIT = "(?=.*[0-9])"; //(?=.*[0-9]) a digit must occur at least once
final String LOWER_CASE = "(?=.*[a-z])"; //(?=.*[a-z]) a lower case letter must occur at least once
final String UPPER_CASE = "(?=.*[A-Z])"; //(?=.*[A-Z]) an upper case letter must occur at least once
final String NO_SPACE = "(?=\\S+$)"; //(?=\\S+$) no whitespace allowed in the entire string
//final String MIN_CHAR = ".{" + MIN_LENGHT + ",}"; //.{8,} at least 8 characters
final String MIN_MAX_CHAR = ".{" + MIN_LENGHT + "," + MAX_LENGHT + "}"; //.{8,20} represents a minimum of 8 characters and a maximum of 20 characters
final String SPECIAL_CHAR;
if (SPECIAL_CHAR_NEEDED==true) SPECIAL_CHAR= "(?=.*[@#$%^&+=])"; //(?=.*[@#$%^&+=]) a special character must occur at least once
else SPECIAL_CHAR="";
//_________________________
//Pattern
//String pattern = "(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{8,}";
final String PATTERN = ONE_DIGIT + LOWER_CASE + UPPER_CASE + SPECIAL_CHAR + NO_SPACE + MIN_MAX_CHAR;
//_________________________
result = PASSWORD_Arg.matches(PATTERN);
//_________________________
}
} catch (Exception ex) {
result=false;
}
return result;
}
You can also do it like this.
public boolean isPasswordValid(String password) {
String regExpn =
"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{8,}$";
// No CASE_INSENSITIVE flag here: it would make [a-z] and [A-Z] match any letter,
// so the separate lower-case and upper-case requirements would no longer be enforced.
Pattern pattern = Pattern.compile(regExpn);
Matcher matcher = pattern.matcher(password);
return matcher.matches();
}
Use the Passay library, which provides a powerful API.
I think this can do it also (as a simpler mode):
^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])[^\s]{8,}$
An easy one:
("^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[\\W_])[\\S]{8,10}$")
(?= anything ) is a positive lookahead: it looks forward through the input and makes sure the condition is met. For example, (?=.*[0-9]) ensures that at least one digit appears somewhere in the string; if not, the match fails.
(?! anything ) is the opposite, a negative lookahead: if the condition is met, the match fails.
In short, the pattern has the shape ^(condition)(condition)(condition)(condition)[\S]{8,10}$
String s=pwd;
int n=0;
for(int i=0;i<s.length();i++)
{
if((Character.isDigit(s.charAt(i))))
{
n=5;
break;
}
else
{
}
}
for(int i=0;i<s.length();i++)
{
if((Character.isLetter(s.charAt(i))))
{
n+=5;
break;
}
else
{
}
}
if(n==10)
{
out.print("Password format correct <b>Accepted</b><br>");
}
else
{
out.print("Password must be alphanumeric <b>Declined</b><br>");
}
Explanation:
First take the password as a string and create an integer n set to 0.
Then check each character in a for loop using Character.isDigit(s.charAt(i)). If a digit is found, n is set to 5 and the loop is left.
The next loop checks for letters using Character.isLetter(s.charAt(i)). If one is found, 5 more is added to n.
Finally, check n in an if condition. If n == 10, the string contains both a digit and a letter and is accepted as alphanumeric; otherwise it is not.
Sample code block for strong password:
(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])(?=\\S+$).{6,18}
at least 6 characters
up to 18 characters
at least one digit
at least one lowercase letter
at least one uppercase letter
at least one special (non-alphanumeric) character
RegEx is -
^(?:(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*)[^\s]{8,}$
at least 8 characters {8,}
at least one digit (?=.*\d)
at least one lowercase letter (?=.*[a-z])
at least one uppercase letter (?=.*[A-Z])
at least one special character (?=.*[@#$%^&+=])
No space [^\s]
A more general answer which accepts all the special characters including _ would be slightly different:
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[\W|\_])(?=\S+$).{8,}$
The difference (?=.*[\W|\_]) translates to "at least one of all the special characters including the underscore".