I have a string and a simple pattern (a string with a wildcard). When I use the matches function, I would expect it to return true for my text, but it returns false.
String text = "test_1_2_3";
String pattern = "test_*";
text.matches(pattern); // this returns false
_* matches the literal character _ zero or more times. Instead, you need .*, which matches any character zero or more times:
"test_.*"
Demo
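To make the difference concrete, here is a minimal sketch that runs both patterns against the same input. The class and method names are made up for illustration only.

```java
class WildcardDemo {
    // True only if the whole text matches the regex (String.matches semantics).
    static boolean fullMatch(String text, String regex) {
        return text.matches(regex);
    }

    public static void main(String[] args) {
        String text = "test_1_2_3";
        // _* only repeats the underscore, so "1_2_3" is left unmatched
        System.out.println(fullMatch(text, "test_*"));  // false
        // .* consumes the rest of the input
        System.out.println(fullMatch(text, "test_.*")); // true
    }
}
```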
pattern = "test_*" means "test" and 0 or more "_"
Because your test_* pattern, combined with Matcher#matches, will match a whole input (i.e. from start to end), that matches the following conditions:
starts with test
followed by (and ending with) zero or more instances of _ (greedily quantified here).
Using Matcher#find would return true in this case, since it would match a partial test_.
So, your matches invocation would return true with the given Pattern, with inputs such as:
test_
test__
... and so on.
See API.
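The points above can be checked with a short sketch that compares Matcher#matches and Matcher#find for the original test_* pattern. The helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    static final Pattern PATTERN = Pattern.compile("test_*");

    // Whole-input match, as String.matches does.
    static boolean wholeInputMatches(String input) {
        return PATTERN.matcher(input).matches();
    }

    // First partial match found anywhere in the input, or null if there is none.
    static String firstPartialMatch(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("test_1_2_3")); // false
        System.out.println(firstPartialMatch("test_1_2_3")); // test_
        System.out.println(wholeInputMatches("test__"));     // true
    }
}
```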
Your regexp will match test followed by zero or more '_' characters.
I think you want this:
String text = "test_1_2_3";
String pattern = "test_.*";
My string could be in the form like this:
"A3C10" or "A3B00" or "A3F90".
I want to return true if the string starts with "A3" (its first two characters) and ends with "0" (its last character). Is there a way to write a regex pattern for this String matching?
You can use regex for that:
string.matches("A3.*0");
It returns true, if string begins with "A3" and ends with "0".
Following on from Andronicus' answer, .* will match any sequence, so as long as it begins with A3 and ends with 0 it will return true.
If you want to match the exact pattern of A3XX0 where X is any character, then use the below pattern.
string.matches("A3..0");
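A quick sketch of how the two patterns differ on inputs of different lengths. The method names are only illustrative.

```java
class A3PatternDemo {
    // Starts with A3, ends with 0, anything (or nothing) in between.
    static boolean looseMatch(String s) {
        return s.matches("A3.*0");
    }

    // Exactly five characters: A3, two arbitrary characters, then 0.
    static boolean strictMatch(String s) {
        return s.matches("A3..0");
    }

    public static void main(String[] args) {
        System.out.println(looseMatch("A3C10"));   // true
        System.out.println(strictMatch("A3C10"));  // true
        System.out.println(looseMatch("A3C100"));  // true
        System.out.println(strictMatch("A3C100")); // false: six characters
    }
}
```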
I'm trying to identify strings which contain exactly one integer.
That is exactly one string of contiguous digits e.g. "1234" (no dots, no commas).
So I thought this should do it: (This is with the Java String Escapes included):
(\\d+){1,}
So the "\d+" correctly matches a string of contiguous digits (right?).
I included this expression as a sub-expression within "(" and ")" and then I'm trying to say "only one of these sub-expressions".
Here's the result of ( matcher.find() ) of checking various strings:
(note the regex from now on is 'raw' here - NOT Java String Escaped).
Pattern:(\d+){1,}
Input String                Result
1                           true
XX-1234                     true
do-not-match-no-integers    false
do-not-match-1234-567       true
do-not-match-123-456        true
It seems the '1' in the pattern is applying to the "\d+" string, rather than the number of those contiguous strings.
Because if I change the number from 1 to 4; I can see the result change to the following:
Pattern:(\d+){4,}
Input String                Result
1                           false
XX-1234                     true
do-not-match-no-integers    false
do-not-match-1234-567       true
do-not-match-123-456        false
What am I missing here ?
Out of interest - if I take off the "(" and ")" altogether - I'm getting a different result again
Pattern:\d+{4,}
Input String                Result
1                           true
XX-1234                     true
do-not-match-no-integers    false
do-not-match-1234-567       true
do-not-match-123-456        true
Matcher.find() will try to find a match anywhere inside the String. You should try Matcher.matches() instead to check whether the pattern matches the whole string.
In that case, the pattern you need is \d+
EDIT:
Seems that I misunderstood the question. One way to find if the String has only one integer, using the same pattern is:
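// matcher is a Matcher for the pattern \d+ over the input string
int matchCounter = 0;
// stop early once a second run of digits has been found
while (matchCounter < 2 && matcher.find()) {
    matchCounter++;
}
return matchCounter == 1;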
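For reference, the counting approach can be wrapped in a self-contained method. This is only a sketch; the class and method names are assumptions, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleIntegerCounter {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True if the input contains exactly one run of contiguous digits.
    static boolean hasExactlyOneInteger(String input) {
        Matcher matcher = DIGITS.matcher(input);
        int count = 0;
        // No need to keep searching once two runs have been seen.
        while (count < 2 && matcher.find()) {
            count++;
        }
        return count == 1;
    }

    public static void main(String[] args) {
        System.out.println(hasExactlyOneInteger("XX-1234"));               // true
        System.out.println(hasExactlyOneInteger("do-not-match-1234-567")); // false
    }
}
```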
This is the regex:
^[^\d]*\d+[^\d]*$
That's zero or more non digits, followed by a substring of digits and then zero or more non digits again until the end of the string. Here is the java code (with escaped slashes):
class MainClass {
    public static void main(String[] args) {
        String regex="^[^\\d]*\\d+[^\\d]*$";
        System.out.println("1".matches(regex)); // true
        System.out.println("XX-1234".matches(regex)); // true
        System.out.println("XX-1234-YY".matches(regex)); // true
        System.out.println("do-not-match-no-integers".matches(regex)); // false
        System.out.println("do-not-match-1234-567".matches(regex)); // false
        System.out.println("do-not-match-123-456".matches(regex)); // false
    }
}
You can use the RegEx ^\D*?(\d+)\D*?$
^\D*? makes sure there are no digits between the start of your line and your first group
(\d+) matches your digits
\D*?$ makes sure there are no digits between your first group and the end of your line
Demo.
So, for your Java String, it would be : ^\\D*?(\\d+)\\D*?$
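Since this pattern has a capturing group, it can also hand back the integer itself. A minimal sketch, with a hypothetical extract helper that returns null when the input does not contain exactly one run of digits:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleIntegerExtractor {
    private static final Pattern ONE_INT = Pattern.compile("^\\D*?(\\d+)\\D*?$");

    // Returns the digits if the input holds exactly one run of them, else null.
    static String extract(String input) {
        Matcher m = ONE_INT.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("XX-1234-YY"));            // 1234
        System.out.println(extract("do-not-match-1234-567")); // null
    }
}
```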
I think you will have to make sure your regex considers the entire string, using ^ and $.
To do that, you could match zero or more non-digits, followed by 1 or more digits, and then zero or more non-digits.
The following should do the trick:
^[^\d]*(\d+)[^\d]*$
Here it is on regex101.com: https://regex101.com/r/CG0RiL/2
Edit: As pointed out by Veselin Davidov my regex isn't correct.
If I understand you correctly, you want it to return true only when the entire String matches the pattern, yes?
Then you have to call matcher.matches();
Also, I think your pattern should be just \d+.
If you have problems with regex, I can recommend https://regex101.com/: it explains why a pattern matches something and gives you a quick preview.
I use it every time I have to write a regex.
Input
example("This is tes't")
example('This is the tes\"t')
Output should be
This is tes't
This is the tes"t
Code
String text = "example(\"This is tes't\")";
//String text = "$.i18nMessage('This is the tes\"t\')";
final String quoteRegex = "example.*?(\".*?\")?('.*?')?";
Matcher matcher0 = Pattern.compile(quoteRegex).matcher(text);
while (matcher0.find()) {
    System.out.println(matcher0.group(1));
    System.out.println(matcher0.group(2));
}
I see output as
null
null
However, when I use the regex example.*?(\".*?\") it returns This is tes't, and when I use example.*?('.*?') it returns
This is the tes"t, but when I combine both as example.*?(\".*?\")?('.*?')? it returns null. Why?
The .*?(\".*?\")?('.*?')? subpattern sequence at the end of your regex can match an empty string (.*? matches 0 or more chars, and both groups are optional because of the trailing ?). After matching example, the .*? is skipped at first, and is only expanded once the subsequent subpatterns do not match. However, both optional groups match an empty string before (, thus, you only have example in matcher0.group(0).
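To see this concretely, you can print group 0 next to the two capturing groups. A small sketch (the class and method names are only for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Returns {group 0, group 1, group 2} of the first match, or null if nothing matches.
    static String[] groups(String text) {
        Matcher m = Pattern.compile("example.*?(\".*?\")?('.*?')?").matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("example(\"This is tes't\")");
        System.out.println(g[0]); // example
        System.out.println(g[1]); // null
        System.out.println(g[2]); // null
    }
}
```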
Use either an alternation that makes group 1 obligatory (demo):
Pattern.compile("example.*?(\".*?\"|'.*?')")
Or a variant with a tempered greedy token (demo) that allows to get rid of the alternation:
Pattern.compile("example.*?(([\"'])(?:(?!\\2).)*\\2)")
Or, better, support escaped sequences (another demo):
Pattern.compile("example.*?(\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"|'[^'\\\\]*(?:\\\\.[^'\\\\]*)*')")
In all 3 examples, you only need to access Group 1. If there can only be ( between example and " or ', you should replace .*? with \( since it will make matching safer. Although, it is never too safe to use a regex to match string literals (at least, with one regex).
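As a rough sketch of how Group 1 could be used with the alternation variant: the helper below (a hypothetical name) returns the quoted argument with its surrounding quotes stripped, or null when there is no match. It does not handle escaped quotes inside the literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedArgumentExtractor {
    private static final Pattern QUOTED = Pattern.compile("example.*?(\".*?\"|'.*?')");

    // Returns the first quoted argument without its quotes, or null if none is found.
    static String extract(String text) {
        Matcher m = QUOTED.matcher(text);
        if (!m.find()) {
            return null;
        }
        String quoted = m.group(1);
        // Group 1 includes the opening and closing quote characters.
        return quoted.substring(1, quoted.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(extract("example(\"This is tes't\")"));     // This is tes't
        System.out.println(extract("example('This is the tes\"t')")); // This is the tes"t
    }
}
```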
After trying other variations, I use this regular expression in Java to validate a password:
PatternCompiler compiler = new Perl5Compiler();
PatternMatcher matcher = new Perl5Matcher();
pattern = compiler.compile("^(?=.*?[a-zA-Z])(?![\\\\\\\\_\-])(?=.*?[0-9])([A-Za-z0-9-/-~][^\\\\\\\\_\-]*)$");
But it still doesn't match my test cases as expected:
Apr#2017 match
$$Apr#2017 no match, but it should match
!!Apr#2017 no match, but it should match
!#ap#2017 no match, but it should match
-Apr#2017 it should not match
_Apr#2017 it should not match
\Apr#2017 it should not match
Apart from the three special characters -, _ and \, any character should be allowed at the start of the string.
Rules:
It should accept all special characters any number of times except above three symbols.
It must and should contain one number and Capital letter at any place in the string.
You have two rules, why not create more than one regular expression?
It should accept all special characters any number of times except above three symbols.
For this one, make sure it does not match [-\\_] (note that the - is the first character in the character class; otherwise it would be interpreted as a range).
It must and should contain one number and Capital letter at any place in the string.
For this one, make sure it matches [A-Z] and [0-9]
To make it easy to modify and extend, do some abstraction:
class PasswordRule
{
    private Pattern pattern;
    // If true, string must match, if false string must not match
    private boolean shouldMatch;
    PasswordRule(String patternString, boolean shouldMatch)
    {
        this.shouldMatch = shouldMatch;
        this.pattern = compiler.compile(patternString);
    }
    boolean match(String passwordString)
    {
        return pattern.matches(passwordString) == shouldMatch;
    }
}
I don't know or care if I have the API to Perl5 matching correct in the above, but you should get the idea. Then your rules go in an array
PasswordRule rules[] =
{
    new PasswordRule("[-\\\\_]", false),
    new PasswordRule("[A-Z]", true),
    new PasswordRule("[0-9]", true)
};
boolean passwordIsOk(String password)
{
    for (PasswordRule rule : rules)
    {
        if (!rule.match(password))
        {
            return false;
        }
    }
    return true;
}
Using the above, your rules are far more flexible and modifiable than one monstrous regular expression.
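For anyone who wants something that compiles as-is, here is a sketch of the same rule-list idea using java.util.regex instead of the Perl5 classes from the question. That substitution is my assumption, as are the class names. Each rule is checked with find(), so a rule means "contains a match" rather than "the whole password matches".

```java
import java.util.regex.Pattern;

class PasswordRulesSketch {
    static final class Rule {
        private final Pattern pattern;
        // If true, the password must contain a match; if false, it must not.
        private final boolean shouldMatch;

        Rule(String regex, boolean shouldMatch) {
            this.pattern = Pattern.compile(regex);
            this.shouldMatch = shouldMatch;
        }

        boolean accepts(String password) {
            return pattern.matcher(password).find() == shouldMatch;
        }
    }

    private static final Rule[] RULES = {
        new Rule("[-\\\\_]", false), // none of -, \ or _ anywhere
        new Rule("[A-Z]", true),     // at least one capital letter
        new Rule("[0-9]", true)      // at least one digit
    };

    static boolean passwordIsOk(String password) {
        for (Rule rule : RULES) {
            if (!rule.accepts(password)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(passwordIsOk("$$Apr#2017")); // true
        System.out.println(passwordIsOk("_Apr#2017"));  // false
    }
}
```

Note that this follows the rules as written in this answer: the forbidden characters are rejected anywhere in the password, and a capital letter is required.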
Here's an alternative solution - reverse the condition. This regex
^(?:[^0-9]*|[^A-Z]*|[_\\-].*)$
matches non conforming passwords. This makes it much simpler to understand.
It matches either
a string free from digits
a string free from capital letters
a string starting with one of _, \ or -
See it illustrated here at regex101.
There are some unclear issues remaining in your question though, so it may have to be adjusted. (The restriction in starting character I mentioned as a comment)
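In Java, the inverted check might look like the sketch below. The regex is the one above written as a Java string literal (so the backslash is doubled once more); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class NonConformingPasswordCheck {
    // Matches passwords that break at least one rule.
    private static final Pattern NON_CONFORMING =
            Pattern.compile("^(?:[^0-9]*|[^A-Z]*|[_\\\\-].*)$");

    // A password conforms when the "bad password" pattern does NOT match it.
    static boolean isConforming(String password) {
        return !NON_CONFORMING.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isConforming("$$Apr#2017")); // true
        System.out.println(isConforming("-Apr#2017"));  // false: starts with -
        System.out.println(isConforming("apr#2017"));   // false: no capital letter
    }
}
```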
You seem to need
"^(?=[^a-zA-Z]*[a-zA-Z])(?=[^0-9]*[0-9])[^\\\\_-]*$"
See the regex demo
^ - start of string
(?=[^a-zA-Z]*[a-zA-Z]) - a positive lookahead that requires at least 1 ASCII letter ([a-zA-Z]) to appear after 0+ chars other than letters ([^a-zA-Z]*)
(?=[^0-9]*[0-9])- at least 1 ASCII digit (same principle of contrast as above is used here)
[^\\\\_-]* - 0+ chars other than \ (inside a Java string literal, \ should be doubled to denote 1 literal backslash, and to match a single backslash with a regex, we need double literal backslash), _, -
$ - end of string (\\z might be better though as it matches at the very end of the string).
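A minimal sketch that runs this pattern with java.util.regex against the test cases from the question (the class and method names are only illustrative). Note that this version accepts any ASCII letter, not only capitals, and rejects \, _ and - anywhere in the string.

```java
import java.util.regex.Pattern;

class LookaheadPasswordCheck {
    private static final Pattern VALID =
            Pattern.compile("^(?=[^a-zA-Z]*[a-zA-Z])(?=[^0-9]*[0-9])[^\\\\_-]*$");

    static boolean isValid(String password) {
        return VALID.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Apr#2017"));   // true
        System.out.println(isValid("!#ap#2017"));  // true
        System.out.println(isValid("-Apr#2017"));  // false
        System.out.println(isValid("\\Apr#2017")); // false
    }
}
```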
I have the following (Java) code:
public class TestBlah {
    private static final String PATTERN = ".*\\$\\{[.a-zA-Z0-9]+\\}.*";
    public static void main(String[] s) throws IOException {
        String st = "foo ${bar}\n";
        System.out.println(st.matches(PATTERN));
        System.out.println(Pattern.compile(PATTERN).matcher(st).find());
        System.exit(0);
    }
}
Running this code, the former System.out.println outputs false, while the latter outputs true
Am I not understanding something here?
This is because the . will not match the new line character. Thus, your String, which contains a new line, will not match a pattern that ends with .*. So, when you call matches(), it returns false, because the new line doesn't match.
The second one returns true because it finds a match inside the input string. It doesn't necessarily match the whole string.
From the Pattern javadocs:
. Any character (may or may not match line terminators)
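If the goal is for . to match line terminators too, one option is the Pattern.DOTALL flag (or the equivalent embedded flag (?s) at the start of the regex). A small sketch using the pattern from the question; the class and method names are made up:

```java
import java.util.regex.Pattern;

class DotAllDemo {
    static final String PATTERN = ".*\\$\\{[.a-zA-Z0-9]+\\}.*";

    // Default mode: . does not match line terminators.
    static boolean matchesDefault(String s) {
        return Pattern.compile(PATTERN).matcher(s).matches();
    }

    // DOTALL mode: . matches any character, including \n.
    static boolean matchesDotAll(String s) {
        return Pattern.compile(PATTERN, Pattern.DOTALL).matcher(s).matches();
    }

    public static void main(String[] args) {
        String st = "foo ${bar}\n";
        System.out.println(matchesDefault(st));             // false
        System.out.println(matchesDotAll(st));              // true
        System.out.println(st.matches("(?s)" + PATTERN));   // true
    }
}
```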
String.matches(..) behaves like Matcher.matches(..). From the documentation of Matcher
find(): Attempts to find the next subsequence of
the input sequence that matches the pattern.
matches(): Attempts to match the entire input sequence
against the pattern.
So you could think of matches() as if it surrounded your regexp with ^ and $ to make sure the beginning of the string matches the beginning of your regular expression and the end of the string matches the end of the regular expression.
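That mental model can be sketched with find() and explicit anchors. One caveat: in Java, $ can also match just before a trailing line terminator, so the sketch below uses \A and \z for the strict version and keeps a ^...$ variant for comparison. The helper names are hypothetical.

```java
import java.util.regex.Pattern;

class AnchoredFindDemo {
    // Emulates matches() with find() by wrapping the regex in \A(?:...)\z.
    static boolean fullMatchViaFind(String regex, String input) {
        return Pattern.compile("\\A(?:" + regex + ")\\z").matcher(input).find();
    }

    // Same idea with ^ and $, which tolerates one trailing line terminator.
    static boolean caretDollarFind(String regex, String input) {
        return Pattern.compile("^(?:" + regex + ")$").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println("abc".matches("b"));              // false
        System.out.println(fullMatchViaFind("b", "abc"));    // false
        System.out.println(fullMatchViaFind("a.c", "abc"));  // true
        System.out.println("abc\n".matches("abc"));          // false
        System.out.println(caretDollarFind("abc", "abc\n")); // true
    }
}
```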
There is a difference between matching a pattern and finding the pattern in a String
String.matches() :
Tells whether or not this string matches the given regular expression.
Your whole string must match the pattern.
Matcher.matches() :
Attempts to match the entire input sequence against the pattern.
Again your whole string must match.
Matcher.find() :
Attempts to find the next subsequence of the input sequence that matches the pattern.
Here you only need a "partial match".
As @Justin said:
Your matches() can't work as the . won't match new line characters (\n, \r and \r\n).
Resources :
Javadoc - String.matches()
Javadoc - Matcher.matches()
Javadoc - Matcher.find()