Regex for password matching - java

I have searched the site but have not found exactly what I am looking for.
Password Criteria:
Must be at least 6 characters, 50 max
Must include 1 alpha character
Must include 1 numeric or special character
Here is what I have in java:
public static Pattern p = Pattern.compile(
"((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])|(?=.*[\\d~!##$%^&*\\(\\)_+\\{\\}\\[\\]\\?<>|_]).{6,50})"
);
The problem is that a password of 1234567 matches (it is treated as valid), which it should not.
Any help would be great.

I wouldn't try to use a single regular expression to do that. Regular expressions tend not to perform well when they get long and complicated.
boolean valid(String password){
return password != null &&
password.length() >= 6 &&
password.length() <= 50 &&
password.matches(".*[A-Za-z].*") &&
password.matches(".*[0-9\\~\\!\\#\\#\\$\\%\\^\\&\\*\\(\\)_+\\{\\}\\[\\]\\?<>|_].*");
}

Make sure you use the Matcher.matches() method, which asserts that the whole string matches the pattern.
Your current regex:
"((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])|(?=.*[\\d~!##$%^&*\\(\\)_+\\{\\}\\[\\]\\?<>|_]).{6,50})"
means:
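The string must contain at least one digit (?=.*\\d), one lowercase English letter (?=.*[a-z]), and one uppercase English letter (?=.*[A-Z])
OR | The string must contain at least one character that is a digit or a special character (?=.*[\\d~!##$%^&*\\(\\)_+\\{\\}\\[\\]\\?<>|_]), must be between 6 and 50 characters long, and must not contain any line separator.
Because | has the lowest precedence, .{6,50} belongs only to the second alternative. The first alternative consumes no characters, so under matches() it can never match a whole password, and the second alternative does not require a letter. That is why 1234567 is accepted.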
The correct regex is:
"(?=.*[a-zA-Z])(?=.*[\\d~!##$%^&*()_+{}\\[\\]?<>|]).{6,50}"
This will check:
The string must contain an English letter (either uppercase or lowercase) (?=.*[a-zA-Z]), and a character that is either a digit or a special character (?=.*[\\d~!##$%^&*()_+{}\\[\\]?<>|])
The string must be between 6 and 50 characters long and must not contain any line separator.
Note that I removed the escaping for most characters, except for [], since {}?() lose their special meaning inside a character class.
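A quick way to check the corrected pattern against the password from the question is sketched below. The class and method names (PasswordRegexDemo, isValid) are made up for illustration; the pattern is the one given above, unchanged.

```java
import java.util.regex.Pattern;

class PasswordRegexDemo {
    // Same pattern as above: one letter, one digit-or-special, 6 to 50 characters.
    private static final Pattern PASSWORD = Pattern.compile(
            "(?=.*[a-zA-Z])(?=.*[\\d~!##$%^&*()_+{}\\[\\]?<>|]).{6,50}");

    static boolean isValid(String password) {
        // matches() requires the whole string to match, so no ^ or $ is needed.
        return password != null && PASSWORD.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("1234567")); // false: no letter
        System.out.println(isValid("abc123"));  // true
        System.out.println(isValid("abcdef"));  // false: no digit or special character
        System.out.println(isValid("ab1"));     // false: too short
    }
}
```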

A regular expression can only match languages which can be expressed as a deterministic finite automaton, i.e. which doesn't require memory. Since you have to count special and alpha characters, this does require memory, so you're not going to be able to do this in a DFA. Your rules are simple enough, though, that you could just scan the password, determine its length and ensure that the required characters are present.
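A minimal sketch of such a scan, with no regex at all, might look like the following. The names PasswordScan and SPECIALS are hypothetical, and the list of special characters is an assumption copied from the character class in the question.

```java
class PasswordScan {
    // Assumption: the symbols that count as "special", taken from the question's character class.
    private static final String SPECIALS = "~!#$%^&*()_+{}[]?<>|";

    static boolean isValid(String password) {
        if (password == null || password.length() < 6 || password.length() > 50) {
            return false;
        }
        boolean hasAlpha = false;
        boolean hasDigitOrSpecial = false;
        // Single pass: remember which kinds of characters have been seen.
        for (int i = 0; i < password.length(); i++) {
            char c = password.charAt(i);
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
                hasAlpha = true;
            } else if ((c >= '0' && c <= '9') || SPECIALS.indexOf(c) >= 0) {
                hasDigitOrSpecial = true;
            }
        }
        return hasAlpha && hasDigitOrSpecial;
    }
}
```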

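I'd suggest separating the character checks from the length validation:
boolean checkPassword(String password) {
// One letter plus one digit or special character; the length is checked separately.
return password.length() >= 6 && password.length() <= 50
&& Pattern.compile("[a-zA-Z]").matcher(password).find()
&& Pattern.compile("[\\d~!##$%^&*()_+{}\\[\\]?<>|]").matcher(password).find();
}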

I would suggest splitting into separate regular expressions
$re_numbers = "/[0-9]/";
$re_letters = "/[a-zA-Z]/";
both of them must match and the length is tested separately, too.
The code then looks much cleaner and is easier to understand and change.

This is way too complex for such a simple task:
Validate length using String#length()
password.length() >= 6 && password.length() <= 50
Validate each group using Matcher#find()
Pattern alpha = Pattern.compile("[a-zA-Z]");
boolean hasAlpha = alpha.matcher(password).find();
Pattern digit = Pattern.compile("\\d");
boolean hasDigit = digit.matcher(password).find();
Pattern special = Pattern.compile("[\\~\\!\\#\\#\\$\\%\\^\\&\\*\\(\\)_+\\{\\}\\[\\]\\?<>|_]");
boolean hasSpecial = special.matcher(password).find();
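Put together, the separate checks above could be wrapped in one method, roughly as follows. This is only a sketch: the class name PasswordChecks is hypothetical, the special-character class is written without the redundant escapes, and the final rule (a letter plus either a digit or a special character) is taken from the question's criteria.

```java
import java.util.regex.Pattern;

class PasswordChecks {
    // Compiled once and reused; each pattern looks for a single character of its group.
    private static final Pattern ALPHA = Pattern.compile("[a-zA-Z]");
    private static final Pattern DIGIT = Pattern.compile("\\d");
    private static final Pattern SPECIAL = Pattern.compile("[~!#$%^&*()_+{}\\[\\]?<>|]");

    static boolean isValid(String password) {
        if (password == null || password.length() < 6 || password.length() > 50) {
            return false;
        }
        boolean hasAlpha = ALPHA.matcher(password).find();
        boolean hasDigit = DIGIT.matcher(password).find();
        boolean hasSpecial = SPECIAL.matcher(password).find();
        return hasAlpha && (hasDigit || hasSpecial);
    }
}
```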

Figuring out regex for the mentioned condition

I came across the concept of regex recently and wanted to solve this problem using just a regex inside the matches() and length() methods of the String class. The problem is related to password matching. Here are the three conditions that need to be considered:
A password must have at least eight characters.
A password consists of only letters and digits.
A password must contain at least two digits.
I was able to solve this problem using various other String and Character class methods, but I need to do it with regex only. What I have tried passes most of the test cases, but some of them still fail. Since I am still learning regex, please help me with what I am missing or doing wrong. Below is what I tried:
public class CheckPassword {
public static void main(String[]args){
Scanner sc = new Scanner(System.in);
System.out.println("Enter your password:\n");
String str1 = sc.next();
//String dig2 = "\\d{2}";
//String letter = ".*[A-Z].*";
//String letter1 = ".*[a-z].*";
//if(str1.length() >= 8 && str1.matches(dig2) &&(str1.matches(letter) || str1.matches(letter1)) )
if(str1.length() >= 8 && str1.matches("^(?=.*[A-Z])(?=.*[a-z])(?=.*\\d{2,})(?=.*[0-9])[A-Z0-9a-z]+$"))
System.out.println("Valid Password");
else
System.out.println("Invalid Password");
}
}
EDIT
Okay, so I figured out the first and second conditions; I am only having a problem adding the third one to them, i.e. that the password contains at least 2 digits.
if(str1.length() >= 8 && str1.matches("[a-zA-Z0-9]*"))
//works exclusive of the third criterion
You may actually use a single regex inside matches() to validate all 3 conditions:
A password must have at least eight characters and
A password consists of only letters and digits - use \p{Alnum}{8,} in the consuming part
A password must contain at least two digits - use the (?=(?:[a-zA-Z]*\d){2}) positive lookahead anchored at the start
Combining all three:
.matches("(?=(?:[a-zA-Z]*\\d){2})\\p{Alnum}{8,}")
Since the matches() method anchors the pattern by default (i.e. it requires a full string match), no ^ and $ anchors are necessary.
Details
^ - implicit in matches() - start of string
(?=(?:[a-zA-Z]*\d){2}) - a positive lookahead ((?=...)) that requires the presence of exactly two sequences of:
[a-zA-Z]* - zero or more ASCII letters
\d - an ASCII digit
\p{Alnum}{8,} - 8 or more alphanumeric chars (ASCII only)
$ - implicit in matches() - end of string.
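The combined pattern can be tried out with a small helper like the one below. The class name AlnumPassword and the sample inputs are made up for illustration; the pattern is the one shown above.

```java
class AlnumPassword {
    static boolean isValid(String s) {
        // Lookahead: two digits, each preceded only by letters; body: 8+ ASCII alphanumerics.
        return s.matches("(?=(?:[a-zA-Z]*\\d){2})\\p{Alnum}{8,}");
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcdef12"));  // true
        System.out.println(isValid("abcdefg1"));  // false: only one digit
        System.out.println(isValid("abc12"));     // false: shorter than 8
        System.out.println(isValid("abcd_ef12")); // false: underscore is not allowed
    }
}
```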
Okay Thank you #TDG and M.Aroosi for giving your precious time. I have figured out the solution and this solution satisfies all cases
// answer edited based on OP's working comment.
String dig2 = "^(?=.*?\\d.*\\d)[a-zA-Z0-9]{8,}$";
if(str1.matches(dig2))
{
//body
}

Regex for detecting repeating symbols

I'm looking for a regular expression that will detect repeating symbols in a String. So far I haven't found a solution that fits all my requirements.
Requirements are pretty simple:
detect any repeating symbol in a String;
to be able to set up the repeat count (e.g. more than twice)
Examples of required detection (of symbol 'a', more than 2 times, true if detects, false otherwise)
"Abcdefg" - false
"AbcdaBCD" - false
"abcd_ab_ab" - true (symbol 'a' used three times)
"aabbaabb" - true (symbols 'a' used four times)
Since I'm not a pro in regex and usage of them - code snippet and explanation would be appreciated!
Thanks!
I think that
(.).*\1
would work:
(.) match a single character and capture
.* match any intervening characters
\1 match the captured group again.
(You'd need to compile with the DOTALL flag, or replace . with [\s\S] or similar if the string contains characters not ordinarily matched by .)
and if you want to require that it is found at least 3 times, just change the quantifier of the second two bullets:
(.)(.*\1){2}
etc.
This is going to be pretty inefficient, though, because it's going to have to do the "search for the next matching character" between every character in the string and the end of the string, making it at least quadratic.
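As a rough sketch, the backreference pattern can be built for a configurable count and checked against the examples from the question. The names RepeatedSymbol and hasRepeat are hypothetical, and find() is used because the repeated character may start anywhere in the string.

```java
import java.util.regex.Pattern;

class RepeatedSymbol {
    // True if some character occurs at least `times` times; assumes times >= 2.
    static boolean hasRepeat(String s, int times) {
        // (.) captures a character, then (?:.*\1) must succeed times - 1 more times.
        Pattern p = Pattern.compile("(.)(?:.*\\1){" + (times - 1) + "}", Pattern.DOTALL);
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeat("Abcdefg", 3));    // false
        System.out.println(hasRepeat("AbcdaBCD", 3));   // false
        System.out.println(hasRepeat("abcd_ab_ab", 3)); // true
        System.out.println(hasRepeat("aabbaabb", 3));   // true
    }
}
```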
You might be as well off not using regular expressions, e.g.
char[] cs = str.toCharArray();
Arrays.sort(cs);
int n = numOccurrencesRequired - 1;
for (int i = n; i < cs.length; ++i) {
boolean allSame = true;
for (int j = 1; j <= n && allSame; ++j) {
allSame = cs[i] == cs[i - j];
}
if (allSame) return true;
}
return false;
This sorts all of the same characters together, allowing you just to pass over the string once looking for adjacent equal characters.
Note that this doesn't quite work for any symbol: it will split up multi-char codepoints like 🍕. You can adapt the code above to work with codepoints, rather than chars.
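One possible adaptation to code points, assuming Java 8 or later for String.codePoints(), is sketched below. The class and method names are hypothetical. Because the array is sorted, it is enough to compare the two ends of each window instead of every pair inside it.

```java
class RepeatedCodePoint {
    // True if some code point occurs at least numOccurrencesRequired times.
    static boolean hasRepeat(String str, int numOccurrencesRequired) {
        // Work on code points so that surrogate pairs are treated as one symbol.
        int[] cps = str.codePoints().sorted().toArray();
        int n = numOccurrencesRequired - 1;
        for (int i = n; i < cps.length; ++i) {
            // In a sorted array, equal ends of a window of n + 1 elements
            // mean that every element in the window is equal.
            if (cps[i] == cps[i - n]) return true;
        }
        return false;
    }
}
```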
Try this regex: (.)(?:.*\1)
It matches any character (.) that is followed by anything .* and then by itself \1. If you want to check for a given number of repeats, add {n,} at the end, with n being the number of repeats you want to check for.
Yes, such a regex exists, but only because the set of characters is finite.
regex: .*(a.*a|b.*b|c.*c|...|y.*y|z.*z).*
It makes no sense. Use another approach:
String string = "something";
// One counter per possible char value, so any UTF-16 code unit can be counted.
int[] count = new int[Character.MAX_VALUE + 1];
for (int i = 0; i < string.length(); i++) {
int temp = string.charAt(i);
count[temp]++;
}
Now you have all characters counted and you can use them as you wish.

Java Regex hung on a long string

I am trying to write a regex to validate a string. The string may contain only uppercase and lowercase English letters (a to z, A to Z; ASCII 65 to 90 and 97 to 122), digits 0 to 9 (ASCII 48 to 57), and the characters - _ ~ (ASCII 45, 95, 126), provided that these are not the first or last character. It may also contain the character . (dot, period, full stop; ASCII 46), provided that it is not the first or last character and does not appear two or more times consecutively. I have tried using the following
Pattern.compile("^[^\\W_*]+((\\.?[\\w\\~-]+)*\\.?[^\\W_*])*$");
It works fine for short strings, but not for long ones: I am seeing hung threads and huge CPU spikes. Please help.
Test cases for invalid strings:
"aB78."
"aB78..ab"
"aB78,1"
"aB78 abc"
".Abc12"
Test cases for valid strings:
"abc-def"
"a1b2c~3"
"012_345"
Your regex suffers from catastrophic backtracking, which leads to O(2^n) (i.e. exponential) solution time.
Although following the link will provide a far more thorough explanation, briefly the problem is that when the input doesn't match, the engine backtracks the first * term to try different combinations of the quantities of the terms, but because all groups more or less match the same thing, the number of ways to group grows exponentially with the length of the backtracking, which in the case of non-matching input is the entire input.
The solution is to rewrite the regex so it won't catastrophically backtrack:
don't use groups of groups
use possessive quantifiers eg .*+ (which never backtrack)
fail early on non-match (eg using an anchored negative look ahead)
limit the number of times terms may appear using {n,m} style quantifiers
Or otherwise mitigate the problem
Problem
It is due to catastrophic backtracking. Let me show where it happens, by simplifying the regex to a regex which matches a subset of the original regex:
^[^\W_*]+((\.?[\w\~-]+)*\.?[^\W_*])*$
Since [^\W_*] and [\w\~-] can match [a-z], let us replace them with [a-z]:
^[a-z]+((\.?[a-z]+)*\.?[a-z])*$
Since \.? are optional, let us remove them:
^[a-z]+(([a-z]+)*[a-z])*$
You can see ([a-z]+)*, which is the classic example of a regex that causes catastrophic backtracking, (A*)*, and the fact that the outermost repetition (([a-z]+)*[a-z])* can expand to ([a-z]+)*[a-z]([a-z]+)*[a-z]([a-z]+)*[a-z] further exacerbates the problem (imagine the number of ways to split the input string to match all the expansions that your regex can have). And this is not to mention the [a-z]+ in front, which adds insult to injury, since it is of the form A*A*.
Solution
You can use this regex to validate the string according to your conditions:
^(?=[a-zA-Z0-9])[a-zA-Z0-9_~-]++(\.[a-zA-Z0-9_~-]++)*+(?<=[a-zA-Z0-9])$
As Java string literal:
"^(?=[a-zA-Z0-9])[a-zA-Z0-9_~-]++(\\.[a-zA-Z0-9_~-]++)*+(?<=[a-zA-Z0-9])$"
Breakdown of the regex:
^ # Assert beginning of the string
(?=[a-zA-Z0-9]) # Must start with alphanumeric, no special
[a-zA-Z0-9_~-]++(\.[a-zA-Z0-9_~-]++)*+
(?<=[a-zA-Z0-9]) # Must end with alphanumeric, no special
$ # Assert end of the string
Since . can't appear consecutively, and can't start or end the string, we can consider it a separator between strings of [a-zA-Z0-9_~-]+. So we can write:
[a-zA-Z0-9_~-]++(\.[a-zA-Z0-9_~-]++)*+
All quantifiers are made possessive to reduce stack usage in Oracle's implementation and make the matching faster. Note that it is not appropriate to use them everywhere. Due to the way my regex is written, there is only one way to match a particular string to begin with, even without possessive quantifiers.
Shorthand
Since this is Java and in default mode, you can shorten a-zA-Z0-9_ to \w and [a-zA-Z0-9] to [^\W_] (though the second one is a bit hard for other programmers to read):
^(?=[^\W_])[\w~-]++(\.[\w~-]++)*+(?<=[^\W_])$
As Java string literal:
"^(?=[^\\W_])[\\w~-]++(\\.[\\w~-]++)*+(?<=[^\\W_])$"
If you use the regex with String.matches(), the anchors ^ and $ can be removed.
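To see how the rewritten pattern behaves on the test cases from the question, a small harness along these lines can be used. The class name SegmentValidator is hypothetical; the pattern is the long form from above with the anchors dropped, since Matcher.matches() already requires a full match.

```java
import java.util.regex.Pattern;

class SegmentValidator {
    // Runs of [a-zA-Z0-9_~-] separated by single dots; first and last character alphanumeric.
    private static final Pattern VALID = Pattern.compile(
            "(?=[a-zA-Z0-9])[a-zA-Z0-9_~-]++(\\.[a-zA-Z0-9_~-]++)*+(?<=[a-zA-Z0-9])");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"aB78.", "aB78..ab", "aB78,1", "aB78 abc", ".Abc12",
                            "abc-def", "a1b2c~3", "012_345"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first five samples are the invalid strings from the question and the last three are the valid ones.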
As #MarounMaroun already commented, you don't really have a pattern. It might be better to iterate over the string as in the following method:
public static boolean validate(String string) {
char chars[] = string.toCharArray();
if (!isSpecial(chars[0]) && !isLetterOrDigit(chars[0]))
return false;
if (!isSpecial(chars[chars.length - 1])
&& !isLetterOrDigit(chars[chars.length - 1]))
return false;
for (int i = 1; i < chars.length - 1; ++i)
if (!isPunctiation(chars[i]) && !isLetterOrDigit(chars[i])
&& !isSpecial(chars[i]))
return false;
return true;
}
public static boolean isPunctiation(char c) {
return c == '.' || c == ',';
}
public static boolean isSpecial(char c) {
return c == '-' || c == '_' || c == '~';
}
public static boolean isLetterOrDigit(char c) {
return (Character.isDigit(c) || (Character.isLetter(c) && (Character
.getType(c) == Character.UPPERCASE_LETTER || Character
.getType(c) == Character.LOWERCASE_LETTER)));
}
Test code:
public static void main(String[] args) {
System.out.println(validate("aB78."));
System.out.println(validate("aB78..ab "));
System.out.println(validate("abcdef"));
System.out.println(validate("aB78,1"));
System.out.println(validate("aB78 abc"));
}
Output:
false
false
true
true
false
A solution should try and find negatives rather than try and match a pattern over the entire string.
Pattern bad = Pattern.compile( "[^-\\w.~]|\\.\\.|^\\.|\\.$" );
for( String str: new String[]{ "aB78.", "aB78..ab", "abcdef",
"aB78,1", "aB78 abc" } ){
Matcher mat = bad.matcher( str );
System.out.println( mat.find() );
}
(It is remarkable to see how the initial statement "string...should have only" leads programmers to try and create positive assertions by parsing or matching valid characters over the full length rather than the much simpler search for negatives.)

Regular Expression allows specific special characters in java

I need a regular expression for a string that contains alphanumeric characters, #, underscore (_), and full stop (.), but no blank spaces, and another one for alphanumeric characters that also allows spaces. I tried these regexes:
^[_A-Za-z0-9-\\.\\#]$ and ^[A-Za-z0-9-\\s]$
CODE:
private static final String Username_REGEX ="^[_A-Za-z0-9.#-]$";
public static boolean isUsername(EditText editText, boolean required) {
return isValid(editText, Username_REGEX,Username_MSG, required);
}
public static boolean isValid(EditText editText, String regex, String errMsg, boolean required) {
String text = editText.getText().toString().trim();
editText.setError(null);
if ( required && !hasTextemt(editText) ) return false;
if (required && !Pattern.matches(regex, text)) {
editText.setError(errMsg);
return false;
};
return true;
}
public static boolean hasTextemt(EditText editText) {
String text = editText.getText().toString().trim();
editText.setError(null);
if (text.length() == 0) {
editText.setError(emt);
return false;
}
return true;
}
Is this correct? I did not get proper result. Can anyone guide me?
Move the dash - to the end of the character class:
^[_A-Za-z0-9.#-]+$
and
^[A-Za-z0-9\\s-]+$
Between two characters it means a range.
Edit: You also need a + modifier to match one or more of the characters in the character class.
I am assuming that you are getting this input via an EditText widget. In the layout XML file you can add the following attribute so that the widget accepts only the specified characters:
android:digits="abcdefghijklmnopqrstuvwxyz0123456789,.-#_"
Note that this won't allow any capital letters.
Just add any digits or keys you want your user to be able to enter. If you are not concerned with patterns or the number of occurrences of any character, you don't even need a regex.
Hope it helps
Try
"[\\w#\\.]+" //for alphanumeric, #, .
"[\\w\\s]+" //for alphanumeric, spaces
Add ^ and $ if you need the pattern to match the whole string.
PS: For testing regexp I always use RegexPlanet (not spam :P)
Hope it helps.
You are only missing a quantifier. In your expression ^[_A-Za-z0-9.#-]$, the character class [_A-Za-z0-9.#-] matches exactly one character out of the class. To allow repeated characters, you need to define a quantifier.
* short for {0,} matches 0 or more characters (==> this allows the empty string!)
+ short for {1,} matches 1 or more characters
{n,m} matches minimum n and maximum m characters.
So your regex would look like
^[_A-Za-z0-9.#-]+$
if you require 1 or more characters, or
^[_A-Za-z0-9.#-]{6,20}$
if you want at least 6 characters and at most 20.
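The difference between the three variants can be seen with a few sample inputs. This is only an illustrative sketch; the class and method names are hypothetical and the inputs are made up.

```java
class UsernameQuantifiers {
    // No quantifier: the class matches exactly one character.
    static boolean singleChar(String s) { return s.matches("^[_A-Za-z0-9.#-]$"); }

    // +: one or more characters from the class.
    static boolean oneOrMore(String s) { return s.matches("^[_A-Za-z0-9.#-]+$"); }

    // {6,20}: between 6 and 20 characters from the class.
    static boolean sixToTwenty(String s) { return s.matches("^[_A-Za-z0-9.#-]{6,20}$"); }

    public static void main(String[] args) {
        System.out.println(singleChar("ab"));          // false: two characters
        System.out.println(oneOrMore("user_1.name"));  // true
        System.out.println(oneOrMore("user name"));    // false: space is not in the class
        System.out.println(sixToTwenty("abc"));        // false: shorter than 6
    }
}
```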
Other things:
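You can replace _A-Za-z0-9 by \w, but be aware that what \w matches depends on the regex engine and flags: in standard Java it matches only [a-zA-Z_0-9] unless Pattern.UNICODE_CHARACTER_CLASS is set, while Unicode-aware engines also match letters and digits from other languages.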
A-Za-z is only ASCII, maybe you want to have a look at Unicode properties. With e.g. \p{L} you can match a letter of any language.
You're missing a plus sign (meaning one or more) at the end of the character class, and you can simplify considerably:
^[\\w.#]+$
Characters within a character class lose their special meanings so don't need to be escaped, except for square brackets and a couple of others.
For alphanumeric and spaces only, that is only combinations of letters, numbers and spaces:
^[a-zA-Z0-9 ]+$

Java: check if string ends or starts with a special character [duplicate]

Closed 10 years ago.
Possible Duplicate:
JAVA: check a string if there is a special character in it
I'm trying to create a method to check if a password starts or ends with a special character. There were a few other checks that I have managed to code, but this seems a bit more complicated.
I think I need to use regex to do this efficiently. I have already created a method that checks if there are any special characters, but I can't figure out how to modify it.
Pattern p = Pattern.compile("\\p{Punct}");
Matcher m = p.matcher(password);
boolean a = m.find();
if (!a)
System.out.println("Password must contain at least one special character!");
According to the book I'm reading I need to use ^ and $ in the pattern to check if it starts or ends with a special character. Can I just add both statements to the existing pattern or how should I start solving this?
EDIT:
Alright, I think I got the non-regex method working:
for (int i = 0; i < password.length(); i++) {
if (SPECIAL_CHARACTERS.indexOf(password.charAt(i)) >= 0)
specialCharSum++;
}
Can't you just use charAt to get the character and indexOf to check for whether or not the character is special?
final String SPECIAL_CHARACTERS = "?#"; // And others
if (SPECIAL_CHARACTERS.indexOf(password.charAt(0)) >= 0
|| SPECIAL_CHARACTERS.indexOf(password.charAt(password.length() - 1)) >= 0) {
System.out.println("password begins or ends with a special character");
}
I haven't profiled (profiling is the golden rule for performance), but I would expect iterating through a compile-time constant string to be faster than building and executing a finite-state automaton for a regular expression. Furthermore, Java's regular expressions are more complex than FSAs, so I would expect that Java regular expressions are implemented differently and are thus slower than FSAs.
The simplest approach would be an or with grouping.
Pattern p = Pattern.compile("(^\\p{Punct})|(\\p{Punct}$)");
Matcher m = p.matcher(password);
boolean a = m.find();
if (!a)
System.out.println("Password must contain at least one special character at the beginning or end!");
Use this pattern:
"^\\p{Punct}|\\p{Punct}$"
^\\p{Punct} = "start of string, followed by a punctuation character"
| = "or"
\\p{Punct}$ = "punctuation character, followed by end of string"
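A minimal sketch of using this pattern with find() is shown below. The class and method names are hypothetical; \p{Punct} is Java's predefined class of ASCII punctuation, which may be broader than the set of characters you consider special.

```java
import java.util.regex.Pattern;

class EdgePunctuation {
    // Punctuation at the very start, or punctuation at the very end.
    private static final Pattern EDGE = Pattern.compile("^\\p{Punct}|\\p{Punct}$");

    static boolean startsOrEndsWithPunct(String password) {
        // find() is needed here: the pattern describes only one edge, not the whole string.
        return EDGE.matcher(password).find();
    }

    public static void main(String[] args) {
        System.out.println(startsOrEndsWithPunct("!secret")); // true
        System.out.println(startsOrEndsWithPunct("secret?")); // true
        System.out.println(startsOrEndsWithPunct("sec!ret")); // false
    }
}
```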
