I am trying to write a regular expression that matches a string starting with the letter "G", whose second character is a digit (0-9), and whose remainder can contain anything and be of any length.
I'm stuck with the following code:
String[] array = { "DA4545", "G121", "G8756942", "N45", "4578", "#45565" };
String regExp = "^[G]\\d[0-9]";
for(int i = 0; i < array.length; i++)
{
if(Pattern.matches(regExp, array[i]))
{
System.out.println(array[i] + " - Successful");
}
}
output:
G12 - Successful
Why does it not match the third entry, "G8756942"?
G - the letter G
[0-9] - a digit
.* - any sequence of characters
So the expression
G[0-9].*
will match a letter G followed by a digit followed by any sequence of characters.
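When you write \d, it already means [0-9],
so \d[0-9] means exactly two digits.
Better to use:
^G\\d*
which matches any string that starts with G followed by zero or more digits. Note that with matches() only digits are allowed after the G, so this also accepts a bare "G" and rejects input such as "G1abc".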
"^[G]\\d[0-9]"
This regex matches "G" followed by a digit (\\d), then another digit ([0-9]).
Use one of these:
"^G\\d"
"^G[0-9]"
Also note that you don't need a character class since it only contains one letter, so it's redundant.
Try this regex; the .* matches any characters after the digit:
^G\\d.*
http://regex101.com/r/uE4tX1/1
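Why does it not match the third entry, "G8756942"?
Because your pattern matches a string that starts with G followed by exactly two digits: \d and [0-9] each match one digit. Solution:
^[G]\d
This regex would be fine with find(); with matches(), which tests the whole input, it also needs a trailing .* to cover the rest of the string.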
"G\\d.*"
Because the matches method tries to match the whole input, you need to add .* at the end of your pattern; you also don't need to include anchors.
String[] array = { "DA4545", "G121", "G8756942", "N45", "4578", "#45565" };
String regExp = "G\\d.*";
for(int i = 0; i < array.length; i++)
{
if(Pattern.matches(regExp, array[i]))
{
System.out.println(array[i] + " - Successful");
}
}
Output:
G121 - Successful
G8756942 - Successful
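The difference between whole-input matching and partial matching, which the answers above rely on, can be sketched as follows. The class and method names (MatchesVsFind, matchesWhole, findsAnywhere) are illustrative and not from the original posts; only Pattern.matches and Matcher.find are standard API.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // Pattern.matches succeeds only if the regex covers the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Matcher.find succeeds if the regex matches some part of the input.
    static boolean findsAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "G8756942";
        System.out.println(matchesWhole("^G\\d", input));   // false: only "G8" is covered
        System.out.println(findsAnywhere("^G\\d", input));  // true: "G8" is found at the start
        System.out.println(matchesWhole("G\\d.*", input));  // true: .* covers the rest
    }
}
```

With find(), the ^ anchor matters because it pins the match to the start of the string; with matches(), it is redundant.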
I have following regex:
\+?[0-9\.,()\-\s]+$
which allows:
optional + at the beginning
then numbers, dots, commas, round brackets, dashes and white spaces.
In addition, I need to make sure that the number of digits plus the + symbol (if present) is between 9 and 15, so no special characters other than + are counted.
This last condition is what I'm having trouble with.
valid inputs:
+358 (9) 1234567
+3 5 8.9,1-2(3)4..5,6.7 (25 characters but only 12 characters that counts (numbers and plus symbol))
invalid input:
+3 5 8.9,1-2(3)4..5,6.777777777 (33 characters and only 20 characters that counts (numbers and plus symbol) is too many)
It is important to use regex if possible because it's used in javax.validation.constraints.Pattern annotation as:
@Pattern(regexp = REGEX)
private String number;
where my REGEX is what I'm looking for here.
If this cannot be done with a regex, I will need to rewrite my entity validation implementation. So is it possible to add such a condition to the regex, or do I need a function to validate this pattern?
You may use
^(?=(?:[^0-9+]*[0-9+]){9,15}[^0-9+]*$)\+?[0-9.,()\s-]+$
See the regex demo
Details
^ - start of string
(?=(?:[^0-9+]*[0-9+]){9,15}[^0-9+]*$) - a positive lookahead whose pattern must match for the regex to find a match:
(?:[^0-9+]*[0-9+]){9,15} - 9 to 15 repetitions of
[^0-9+]* - any 0+ chars other than digits and + symbol
[0-9+] - a digit or +
[^0-9+]* - 0+ chars other than digits and +
$ - end of string
\+? - an optional + symbol
[0-9.,()\s-]+ - 1 or more digits, ., ,, (, ), whitespace and - chars
$ - end of string.
In Java, when used with matches(), the ^ and $ anchors may be omitted:
s.matches("(?=(?:[^0-9+]*[0-9+]){9,15}[^0-9+]*$)\\+?[0-9.,()\\s-]+")
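To check the lookahead against the sample inputs from the question, a small harness along these lines may help. The class name PhoneNumberCheck and the method isValid are hypothetical; the regex is the one given above.

```java
import java.util.regex.Pattern;

class PhoneNumberCheck {
    // The lookahead counts digits and '+' (9 to 15 of them in total);
    // the main pattern then restricts which characters may appear at all.
    static final Pattern PHONE = Pattern.compile(
            "(?=(?:[^0-9+]*[0-9+]){9,15}[^0-9+]*$)\\+?[0-9.,()\\s-]+");

    static boolean isValid(String s) {
        return PHONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("+358 (9) 1234567"));                // true
        System.out.println(isValid("+3 5 8.9,1-2(3)4..5,6.7"));         // true
        System.out.println(isValid("+3 5 8.9,1-2(3)4..5,6.777777777")); // false
    }
}
```

Precompiling the pattern is a design choice for repeated validation; in the @Pattern annotation the same regex string would be used directly.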
Not using regex, you could simply loop and count the numbers and +s:
int count = 0;
for (int i = 0; i < str.length(); i++) {
if (Character.isDigit(str.charAt(i)) || str.charAt(i) == '+') {
count++;
}
}
Since you're using Java, I wouldn't rely solely on a regex here:
String input = "+123,456.789";
int count = input.replaceAll("[^0-9+]", "").length();
if (input.matches("^\\+?[0-9.,()\\-\\s]+$") && count >= 9 && count <= 15) {
System.out.println("PASS");
}
else {
System.out.println("FAIL");
}
This approach lets us use your original regex as-is. The length requirement on the digits (and the optional plus) is handled with Java string calls.
String t always consists of two distinct alternating characters. For example, if string t's two distinct characters are x and y, then t could be xyxyx or yxyxy but not xxyy or xyyx.
But a.matches() always returns false and output becomes 0. Help me understand what's wrong here.
public static int check(String a) {
char on = a.charAt(0);
char to = a.charAt(1);
if(on != to) {
if(a.matches("["+on+"("+to+""+on+")*]|["+to+"("+on+""+to+")*]")) {
return a.length();
}
}
return 0;
}
Use regex (.)(.)(?:\1\2)*\1?.
(.) Match any character, and capture it as group 1
(.) Match any character, and capture it as group 2
\1 Match the same characters as was captured in group 1
\2 Match the same characters as was captured in group 2
(?:\1\2)* Match 0 or more pairs of group 1+2
\1? Optionally match a dangling group 1
Input must be at least two characters long. Empty string and one-character string will not match.
As java code, that would be:
if (a.matches("(.)(.)(?:\\1\\2)*\\1?")) {
See regex101.com for working examples [1].
[1] Note that regex101 requires the use of ^ and $, which are implied by the matches() method. It also requires the g and m flags to showcase multiple examples at the same time.
UPDATE
As pointed out by Austin Anderson:
fails on yyyyyyyyy or xxxxxx
To prevent that, we can add a zero-width negative lookahead, to ensure input doesn't start with two of the same character:
(?!(.)\1)(.)(.)(?:\2\3)*\2?
See regex101.com.
Or you can use Austin Anderson's simpler version:
(.)(?!\1)(.)(?:\1\2)*\1?
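As a minimal sketch, the lookahead version can be dropped into the question's check method like this. The class name AlternatingCheck and the main method are illustrative additions, not part of the original answer.

```java
class AlternatingCheck {
    // Returns the length of a if it consists of two distinct alternating
    // characters, otherwise 0. matches() anchors the pattern to the whole input.
    static int check(String a) {
        return a.matches("(.)(?!\\1)(.)(?:\\1\\2)*\\1?") ? a.length() : 0;
    }

    public static void main(String[] args) {
        System.out.println(check("xyxyx"));  // 5
        System.out.println(check("xyyx"));   // 0
        System.out.println(check("xxxxxx")); // 0
    }
}
```

Because the pattern needs two captured characters, empty and one-character inputs return 0 without any explicit length check.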
Actually your regex is almost correct, but the problem is that you have enclosed it in two character classes, and you also need to match an optional trailing second character at the end.
You just need to build the regex like this:
public static int check(String a) {
if (a.length() < 2)
return 0;
char on = a.charAt(0);
char to = a.charAt(1);
if(on != to) {
String re = on+"("+to+on+")*"+to+"?|"+to+"("+on+to+")*"+on+"?";
System.out.println("re: " + re);
if(a.matches(re)) {
return a.length();
}
}
return 0;
}
Code Demo
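One caveat with building the regex by concatenation: if either character is a regex metacharacter such as "." or "(", the resulting pattern no longer means what was intended. A hedged variant that quotes both characters with Pattern.quote might look like the sketch below. The class name AlternatingQuoted is hypothetical, and the second alternative is dropped on the assumption that the input always starts with its own first character, which makes that alternative unreachable.

```java
import java.util.regex.Pattern;

class AlternatingQuoted {
    static int check(String a) {
        if (a.length() < 2) {
            return 0;
        }
        char on = a.charAt(0);
        char to = a.charAt(1);
        if (on == to) {
            return 0;
        }
        // Quote both characters so metacharacters are treated literally.
        String x = Pattern.quote(String.valueOf(on));
        String y = Pattern.quote(String.valueOf(to));
        // on, then any number of (to on) pairs, then an optional trailing to.
        String re = x + "(?:" + y + x + ")*" + y + "?";
        return a.matches(re) ? a.length() : 0;
    }

    public static void main(String[] args) {
        System.out.println(check("(*(*(")); // 5
        System.out.println(check(".aba"));  // 0
    }
}
```

Without quoting, the input ".aba" would be accepted, because the leading "." in the generated pattern matches any character.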
I would like to mask the last 4 digits of the identity number (hkid)
A123456(7) -> A123***(*)
I can do this by below:
hkid.replaceAll("\\d{3}\\(\\d\\)", "***(*)")
However, can a regular expression match just the last 4 digits, so that each one is replaced by "*"?
hkid.replaceAll(regex, "*")
Please help, thanks.
Jessie
Personally, I wouldn't do it with regular expressions:
char[] cs = hkid.toCharArray();
for (int i = cs.length - 1, d = 0; i >= 0 && d < 4; --i) {
if (Character.isDigit(cs[i])) {
cs[i] = '*';
++d;
}
}
String masked = new String(cs);
This goes from the end of the string, looking for digit characters, which it replaces with a *. Once it's found 4 (or reaches the start of the string), it stops iterating, and builds a new string.
While I agree that a non-regex solution is probably the simplest and fastest, here's a regex to catch the last 4 digits regardless of whether there is a grouping or not: \d(?=(?:\D*\d){0,3}\D*$)
This expression is meant to match any digit that is followed by 0 to 3 digits before hitting the end of the input.
A short breakdown of the expression:
\d matches a single digit
\D matches a single non-digit
(?=...) is a positive look-ahead that contributes to the match but isn't consumed
(?:...){0,3} is a non-capturing group with a quantity of 0 to 3 occurrences given.
$ matches the end of the input
So you could read the expression as follows: "match a single digit if it is followed by a sequence of 0 to 3 times any number of non-digits which are followed by a single digit and that sequence is followed by any number of non-digits and the end of the input" (sounds complicated, no?).
Some results when using input.replaceAll( "\\d(?=(?:\\D*\\d){0,3}\\D*$)", "*" ):
input = "A1234567" -> output = "A123****"
input = "A123456(7)" -> output = "A123***(*)"
input = "A12345(67)" -> output = "A123**(**)"
input = "A1(234567)" -> output = "A1(23****)"
input = "A1234B567" -> output = "A123*B***"
As you can see in the last example the expression will match digits only. If you want to match letters as well either replace \d and \D with \w and \W (note that \w matches underscores as well) or use custom character classes, e.g. [02468] and [^02468] to match even digits only.
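For reuse, the replacement can be wrapped in a small helper. This is only a sketch; the class name HkidMask and the method maskLastFourDigits are made up for illustration, and the inputs are the ones listed above.

```java
class HkidMask {
    // Replaces each of the last four digits with '*', leaving letters,
    // parentheses and earlier digits untouched.
    static String maskLastFourDigits(String input) {
        return input.replaceAll("\\d(?=(?:\\D*\\d){0,3}\\D*$)", "*");
    }

    public static void main(String[] args) {
        System.out.println(maskLastFourDigits("A123456(7)")); // A123***(*)
        System.out.println(maskLastFourDigits("A1234B567"));  // A123*B***
    }
}
```

If the input contains fewer than four digits, every digit is masked, since each one is followed by at most three more.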
I'm looking for a regular expression that will detect repeating symbols in a String. So far I haven't found a solution that fits all my requirements.
Requirements are pretty simple:
detect any repeating symbol in a String;
to be able to set the repeat count (e.g. more than twice)
Examples of required detection (of symbol 'a', more than 2 times, true if detects, false otherwise)
"Abcdefg" - false
"AbcdaBCD" - false
"abcd_ab_ab" - true (symbol 'a' used three times)
"aabbaabb" - true (symbols 'a' used four times)
Since I'm not a pro in regex and usage of them - code snippet and explanation would be appreciated!
Thanks!
I think that
(.).*\1
would work:
(.) match a single character and capture
.* match any intervening characters
\1 match the captured group again.
(You'd need to compile with the DOTALL flag, or replace . with [\s\S] or similar if the string contains characters not ordinarily matched by .)
and if you want to require that it is found at least 3 times, just change the quantifier of the second two bullets:
(.)(.*\1){2}
etc.
This is going to be pretty inefficient, though, because it's going to have to do the "search for the next matching character" between every character in the string and the end of the string, making it at least quadratic.
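A sketch of how the quantified pattern could be built for a configurable count is shown here. The names RepeatedSymbol and hasSymbolRepeated are hypothetical, the group is made non-capturing for tidiness, and find() is used because the repeated character may sit anywhere in the string.

```java
import java.util.regex.Pattern;

class RepeatedSymbol {
    // True if some character occurs at least `times` times in s (times >= 2).
    static boolean hasSymbolRepeated(String s, int times) {
        if (times < 2) {
            throw new IllegalArgumentException("times must be at least 2");
        }
        // One captured character, followed (times - 1) times by
        // "anything, then the same character again".
        String regex = "(.)(?:.*\\1){" + (times - 1) + "}";
        // DOTALL lets . match line terminators as well.
        return Pattern.compile(regex, Pattern.DOTALL).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSymbolRepeated("AbcdaBCD", 3));   // false
        System.out.println(hasSymbolRepeated("abcd_ab_ab", 3)); // true
    }
}
```

The comparison is case-sensitive, which matches the question's examples ("AbcdaBCD" counts as no repeat).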
You might be as well off not using regular expressions, e.g.
char[] cs = str.toCharArray();
Arrays.sort(cs);
int n = numOccurrencesRequired - 1;
for (int i = n; i < cs.length; ++i) {
boolean allSame = true;
for (int j = 1; j <= n && allSame; ++j) {
allSame = cs[i] == cs[i - j];
}
if (allSame) return true;
}
return false;
This sorts all of the same characters together, allowing you just to pass over the string once looking for adjacent equal characters.
Note that this doesn't quite work for any symbol: it will split up multi-char codepoints like 🍕. You can adapt the code above to work with codepoints, rather than chars.
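One possible code point adaptation is sketched below. Note that it swaps the sort-and-scan technique for a HashMap counter, which is a different approach chosen here for brevity; the names CodePointRepeats and hasRepeat are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

class CodePointRepeats {
    // True if some code point occurs at least `required` times in s.
    static boolean hasRepeat(String s, int required) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            // merge returns the updated count for this code point.
            if (counts.merge(cp, 1, Integer::sum) >= required) {
                return true;
            }
            // Step over one or two chars, depending on the code point.
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        // U+1F355 and U+1F354 share the same high surrogate, so a
        // char-based count would wrongly report a repeat here.
        System.out.println(hasRepeat("\uD83C\uDF55\uD83C\uDF54", 2)); // false
        System.out.println(hasRepeat("\uD83C\uDF55a\uD83C\uDF55", 2)); // true
    }
}
```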
Try this regex: (.)(?:.*\1)
It matches any character (.) that is followed by anything (.*) and then by itself (\1). To check for more repeats, add {n,} at the end, with n being the number of repeats you want to check for.
Yes, such a regex exists, but only because the set of characters is finite.
regex: .*(a.*a|b.*b|c.*c|...|y.*y|z.*z).*
Writing it out makes no sense, though. Use another approach:
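String string = "something";
// Assumes every character value is below 256 (ASCII / Latin-1); larger values would overflow the array.
int[] count = new int[256];
for (int i = 0; i < string.length(); i++) {
    // A char widens to int implicitly, so it can index the array directly.
    count[string.charAt(i)]++;
}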
Now you have all characters counted and you can use them as you wish.
I have an ArrayList<String> which I iterate through to find the correct index given a String. Basically, given a String, the program should search through the list and find the index where the whole word matches. For example:
ArrayList<String> foo = new ArrayList<String>();
foo.add("AAAB_11232016.txt");
foo.add("BBB_12252016.txt");
foo.add("AAA_09212017.txt");
So if I give the String AAA, I should get back index 2 (the last one). So I can't use the contains() method as that would give me back index 0.
I tried with this code:
String str = "AAA";
String pattern = "\\b" + str + "\\b";
Pattern p = Pattern.compile(pattern);
for(int i = 0; i < foo.size(); i++) {
// Check each entry of list to find the correct value
Matcher match = p.matcher(foo.get(i));
if(match.find() == true) {
return i;
}
}
Unfortunately, this code never reaches the if statement inside the loop. I'm not sure what I'm doing wrong.
Note: This should also work if I searched for AAA_0921, the full name AAA_09212017.txt, or any part of the String that is unique to it.
Since the underscore is itself a word character, a word boundary does not match between a letter or digit and an underscore, so you need
String pattern = "(?<=_|\\b)" + str + "(?=_|\\b)";
Here, (?<=_|\b) positive lookbehind requires a word boundary or an underscore to appear before the str, and the (?=_|\b) positive lookahead requires an underscore or a word boundary to appear right after the str.
See this regex demo.
If your word may have special chars inside, you might want to use a more straight-forward word boundary:
"(?<![^\\W_])" + Pattern.quote(str) + "(?![^\\W_])"
Here, the negative lookbehind (?<![^\\W_]) fails the match if there is a word character other than an underscore immediately before str. ([^...] is a negated character class that matches any character other than the characters and ranges defined inside it, so [^\\W_] matches any character that is neither a non-word character \W nor a _.) Likewise, the negative lookahead (?![^\\W_]) fails the match if there is a word character other than an underscore immediately after str.
Note that the second example has a quoted search string, so that even AA.A_str.txt could be matched well with AA.A.
See another regex demo
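Put together with the loop from the question, the quoted variant could be sketched as follows. The class name WholeWordIndex and the method indexOfWord are hypothetical; the file names are the ones from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WholeWordIndex {
    // Returns the index of the first entry containing word as a whole
    // token (delimited by non-alphanumerics, underscores or the string
    // ends), or -1 if there is none.
    static int indexOfWord(List<String> names, String word) {
        Pattern p = Pattern.compile(
                "(?<![^\\W_])" + Pattern.quote(word) + "(?![^\\W_])");
        for (int i = 0; i < names.size(); i++) {
            if (p.matcher(names.get(i)).find()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> foo = Arrays.asList(
                "AAAB_11232016.txt", "BBB_12252016.txt", "AAA_09212017.txt");
        System.out.println(indexOfWord(foo, "AAA"));  // 2
        System.out.println(indexOfWord(foo, "AAAB")); // 0
    }
}
```

A search term that ends in the middle of a run of letters or digits, such as a partial date, is rejected by the trailing lookahead, so partial-token searches would need a looser right-hand condition.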