String with N digits - java

I need some help with a regex in Java.
I have a string, and I want String.matches to return true if the string contains exactly N digits.
For example (N = 12):
+012345678900 - true
0123-4567-0000 - true
but:
+0123456789 - false
0123-4567-000000 - false
I tried (.*[0-9].*){N} and ^(.*[0-9].*){N}$, but neither worked correctly.

You may try this:
^(?:\\D*\\d){12}\\D*$
The matches method does not need anchors, so
(?:\\D*\\d){12}\\D*
would be enough.
\\D matches any character that is not a digit. So (?:\\D*\\d){12} matches exactly 12 digits, each optionally preceded by any number of non-digit characters. The final \\D* matches zero or more trailing non-digit characters, so the string as a whole contains exactly 12 digits.
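As a rough sketch of how this pattern could be generalized to any N, the helper below builds the regex from a digit count. The class name DigitCount and the method exactlyNDigits are made up for illustration; the sample strings are the ones from the question.

```java
import java.util.regex.Pattern;

class DigitCount {
    // Hypothetical helper: builds a pattern that accepts strings containing
    // exactly n digits, with any non-digit characters around and between them.
    static Pattern exactlyNDigits(int n) {
        return Pattern.compile("(?:\\D*\\d){" + n + "}\\D*");
    }

    public static void main(String[] args) {
        Pattern twelve = exactlyNDigits(12);
        String[] samples = {
            "+012345678900", "0123-4567-0000", "+0123456789", "0123-4567-000000"
        };
        for (String s : samples) {
            // matches() tests the whole string, so no ^ and $ anchors are needed
            System.out.println(s + " -> " + twelve.matcher(s).matches());
        }
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every String.matches call.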

Related

Given string filter out unique element from string using regex

I have this string and I want to extract the number that comes after the long number and the space, so in this case I want to extract 2 and 0.32. The regex below only extracts decimal numbers, but I want to extract both decimals and integers. Is there any way?
String s = "ABB123,ABPP,ADFG0/AA/BHJ.S,392483492389 2,BBBB,YUIO,BUYGH/AA/BHJ.S,3232489880 0.32"
regex = .AA/BHJ.S,\d+ (\d+.?\d+)
https://regex101.com/r/ZqHDQ8/1
The problem is that \d+.?\d+ matches at least two digits. \d+ matches one or more digits, then .? matches any optional char other than line break char, and then again \d+ matches (requires) at least one digit (it matches one or more).
Also, note that all literal dots must be escaped.
You can use
.AA/BHJ\.S,\d+\s+(\d+(?:\.\d+)?)
See the regex demo.
Details:
. - any one char
AA/BHJ\.S, - the literal string AA/BHJ.S,
\d+ - one or more digits
\s+ - one or more whitespaces
(\d+(?:\.\d+)?) - Group 1: one or more digits, and then an optional sequence of a dot and one or more digits.
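The pattern described above could be applied in Java roughly as follows. This is only a sketch: the class name TrailingNumbers and the method extract are invented for the example, and the input is the string from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingNumbers {
    // Same regex as above, with backslashes doubled for the Java string literal.
    private static final Pattern P =
        Pattern.compile(".AA/BHJ\\.S,\\d+\\s+(\\d+(?:\\.\\d+)?)");

    // Collects Group 1 of every match found in the input.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "ABB123,ABPP,ADFG0/AA/BHJ.S,392483492389 2,BBBB,YUIO,"
                 + "BUYGH/AA/BHJ.S,3232489880 0.32";
        System.out.println(extract(s)); // prints [2, 0.32]
    }
}
```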
You could look for anything following /AA/BHJ with a reluctant quantifier, then use a capturing group to look for either digits or one or more digits followed by a decimal separator and other digits.
/AA/BHJ.*?\s+(\d+\.\d+|\d+)
Here is a link to test the regex:
https://regex101.com/r/l5nMrD/1

Having difficulty understanding Java regex interpretation [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
Can someone help me with the following Java regex expression? I've done some research but I'm having a hard time putting everything together.
The regex:
"^-?\\d+$"
My understanding of what each symbol does:
^ = matches the beginning of the line
- = indicates a range
? = does not occur or occurs once
\\d = matches the digits
+ = matches one or more of the previous thing.
$ = matches end of the line
Is the regex saying it only wants matches that start or end with digits? But where do - and ? come in?
- only indicates a range if it's within a character class (i.e. square brackets []). Otherwise, it's a normal character like any other. With that in mind, this regex matches the following examples:
"-2"
"3"
"-700"
"436"
That is, a positive or negative integer: at least one digit, optionally preceded by a minus sign.
A regex is composed of parts. The correct way to read yours is:
^ start of input
-? optional minus character
\\d+ one or more digits
$ end of input
This regex matches any positive or negative integer, like 0, -15, 558, -19663, ...
For details, check this good post: Reference - What does this regex mean?
"^-?\\d+$" is not a regex, it's a Java string literal.
Once the compiler has parsed the string literal, the string value is ^-?\d+$, which is a regex matching like this:
^ Matches beginning of input
- Matches a minus sign
? Makes previous match (minus sign) optional
\d Matches a digit (0-9)
+ Makes previous match (digit) match repeatedly (1 or more times)
$ Matches end of input
All-in-all, the regex matches a positive or negative integer number of unlimited length.
Note: A - only denotes a range when inside a [] character class, e.g. [4-7] is the range of characters between '4' and '7', while [3-] and [-3] are not ranges since the start/end value is missing, so they both just match a 3 or - character.
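A small sketch that exercises the points above: the difference between the Java string literal and the regex it holds, the strings the regex accepts, and the behaviour of - inside a character class. The class name SignedInteger and the method isInteger are made up for the example.

```java
class SignedInteger {
    // Java string literal; the regex engine sees ^-?\d+$
    static final String REGEX = "^-?\\d+$";

    static boolean isInteger(String s) {
        return s.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);               // prints ^-?\d+$
        System.out.println(isInteger("-2"));     // true
        System.out.println(isInteger("436"));    // true
        System.out.println(isInteger("2-"));     // false: minus only allowed in front
        System.out.println(isInteger("1.5"));    // false: the dot is not a digit
        System.out.println("5".matches("[4-7]")); // true: - denotes a range here
        System.out.println("-".matches("[3-]"));  // true: - is a literal here
    }
}
```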

Java Regex ? (expr){num} confusion?

I'm trying to identify strings which contain exactly one integer.
That is exactly one string of contiguous digits e.g. "1234" (no dots, no commas).
So I thought this should do it: (This is with the Java String Escapes included):
(\\d+){1,}
So "\d+" correctly matches a string of contiguous digits (right?).
I wrapped this expression as a sub-expression in "(" and ")" and then tried to say "only one of these sub-expressions".
Here's the result of matcher.find() for various strings
(note that from here on the regex is shown raw, NOT as a Java string escape):
Pattern: (\d+){1,}
Input String               Result
1                          true
XX-1234                    true
do-not-match-no-integers   false
do-not-match-1234-567      true
do-not-match-123-456       true
It seems the '1' in the pattern applies to the "\d+" string, rather than to the number of those contiguous strings.
Because if I change the number from 1 to 4, I can see the result change to the following:
Pattern: (\d+){4,}
Input String               Result
1                          false
XX-1234                    true
do-not-match-no-integers   false
do-not-match-1234-567      true
do-not-match-123-456       false
What am I missing here ?
Out of interest - if I take off the "(" and ")" altogether - I'm getting a different result again
Pattern: \d+{4,}
Input String               Result
1                          true
XX-1234                    true
do-not-match-no-integers   false
do-not-match-1234-567      true
do-not-match-123-456       true
Matcher.find() tries to find a match anywhere inside the string. Use Matcher.matches() instead to check whether the pattern matches the entire string.
In that case, the pattern you need is \d+
EDIT:
It seems I misunderstood the question. One way to check whether the string contains exactly one integer, using the same pattern, is to count the matches:
// matcher is the Matcher created from the pattern \d+ for the input string
int matchCounter = 0;
// stop after the second match: two matches already rule the string out
while (matchCounter < 2 && matcher.find()) {
    matchCounter++;
}
return matchCounter == 1;
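The counting idea can be written as a complete method along these lines. This is a sketch under the assumption that "integer" means a maximal run of digits; the class name IntegerCounter and the method hasExactlyOneInteger are invented for the example. The loop condition checks the counter first and combines it with find() using &&, so it stops as soon as a second match is found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntegerCounter {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns true if the input contains exactly one run of contiguous digits.
    static boolean hasExactlyOneInteger(String input) {
        Matcher matcher = DIGITS.matcher(input);
        int matchCounter = 0;
        // Count matches, but give up once a second one is found.
        while (matchCounter < 2 && matcher.find()) {
            matchCounter++;
        }
        return matchCounter == 1;
    }

    public static void main(String[] args) {
        System.out.println(hasExactlyOneInteger("XX-1234"));               // true
        System.out.println(hasExactlyOneInteger("do-not-match-1234-567")); // false
    }
}
```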
This is the regex:
^[^\d]*\d+[^\d]*$
That's zero or more non-digits, followed by a run of digits, and then zero or more non-digits again until the end of the string. Here is the Java code (with escaped backslashes):
class MainClass {
    public static void main(String[] args) {
        String regex = "^[^\\d]*\\d+[^\\d]*$";
        System.out.println("1".matches(regex)); // true
        System.out.println("XX-1234".matches(regex)); // true
        System.out.println("XX-1234-YY".matches(regex)); // true
        System.out.println("do-not-match-no-integers".matches(regex)); // false
        System.out.println("do-not-match-1234-567".matches(regex)); // false
        System.out.println("do-not-match-123-456".matches(regex)); // false
    }
}
You can use the regex ^\D*?(\d+)\D*?$
^\D*? makes sure there are no digits between the start of the line and the first group
(\d+) matches the digits
\D*?$ makes sure there are no digits between the first group and the end of the line
Demo.
So, as a Java string literal, it would be: ^\\D*?(\\d+)\\D*?$
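Because this version has a capturing group, it can also return the integer it found. A minimal sketch, assuming you want the digits back as a string; the class name SingleInteger and the method extract are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleInteger {
    private static final Pattern ONE_INTEGER =
        Pattern.compile("^\\D*?(\\d+)\\D*?$");

    // Returns the only run of digits in the input, or empty if the input
    // contains no digits or more than one run of digits.
    static Optional<String> extract(String input) {
        Matcher m = ONE_INTEGER.matcher(input);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("XX-1234-YY"));           // Optional[1234]
        System.out.println(extract("do-not-match-123-456")); // Optional.empty
    }
}
```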
I think you will have to make sure your regex considers the entire string, using ^ and $.
To do that, you could match zero or more non-digits, followed by 1 or more digits, and then zero or more non-digits.
The following should do the trick:
^[^\d]*(\d+)[^\d]*$
Here it is on regex101.com: https://regex101.com/r/CG0RiL/2
Edit: As pointed out by Veselin Davidov my regex isn't correct.
If I understand you correctly, you want it to return true only when the entire string matches the pattern, yes?
Then you have to call matcher.matches().
Also, I think your pattern should be just \d+.
If you have trouble with regexes, I can recommend https://regex101.com/. It explains why a pattern matches something and gives you a quick preview.
I use it every time I have to write a regex.

Java-flavoured Regex : Match whole string if group is n chars

I'm trying to create a Regex for a String validator. My String must be exactly 8 characters long, and begin with a letter (lowercase or uppercase) or a number. It can only contain letters (lowercase and uppercase), numbers or whitespaces right after that first character. If a whitespace is found, there can only be whitespaces after it.
For now, I have the match group for the second part : [a-zA-Z0-9]{1,}\s*
I can't find a way to specify that this group is matched only if it has exactly 8 characters. I tried ^([a-zA-Z0-9]{1,}\s*){8}$ but this is not the expected result.
Here are some test cases (with trailing whitespaces).
Valid :
9013
20130
89B
A5000000
Invalid :
9013
20130
90 90
123456789
There probably is a smart regex way to do it but you could also first check the length of the string:
input.length() == 8 && input.matches("[a-zA-Z0-9]+\\s*")
This is also probably more efficient than a complex regex.
You can use this lookahead based regex:
^[a-zA-Z0-9](?!.* [a-zA-Z0-9])[a-zA-Z0-9 ]{7}$
RegEx Demo
^[a-zA-Z0-9] matches an alpha-num char at start
(?!.* [a-zA-Z0-9]) is a negative lookahead to make sure that there is no instance of a space followed by an alpha-num char.
[a-zA-Z0-9 ]{7}$ matches 7 chars containing alpha-num char or space.
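The two approaches suggested in the answers above can be compared side by side. This is a sketch only: the class and method names are invented, and because the whitespace in the question's test cases is not visible, the padded inputs are assumptions built with String.format("%-8s", ...), which right-pads a value with spaces to 8 characters.

```java
class EightCharValidator {
    // Approach 1: check the length first, then the overall shape.
    static boolean byLengthCheck(String input) {
        return input.length() == 8 && input.matches("[a-zA-Z0-9]+\\s*");
    }

    // Approach 2: a single regex with a negative lookahead.
    static boolean byLookahead(String input) {
        return input.matches("^[a-zA-Z0-9](?!.* [a-zA-Z0-9])[a-zA-Z0-9 ]{7}$");
    }

    public static void main(String[] args) {
        String[] samples = {
            String.format("%-8s", "9013"),  // assumed: padded to 8 chars
            "A5000000",
            "9013",                          // too short
            String.format("%-8s", "90 90"), // space followed by more digits
            "123456789"                      // too long
        };
        for (String s : samples) {
            System.out.println("[" + s + "] " + byLengthCheck(s) + " " + byLookahead(s));
        }
    }
}
```

One difference worth noting: \s in the first approach also accepts tabs and other whitespace, while the second approach only accepts the space character.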

Why does "3.5".matches("[0-9]+") return false?

I use the method String.matches(String regex) to check whether a string matches a regular expression.
As I understand it, the regular expression "[0-9]+" means a string that contains at least one digit between 0 and 9.
But when I debug "3.5".matches("[0-9]+"), it returns false.
So what is wrong?
matches determines if the regex matches the whole string. It won't return true if the string contains a match.
To test if the string contains a match to a given regex, use Pattern.compile(regex).matcher(string).find().
(Your regex, [0-9]+, will match any string that contains only digits from 0 to 9, and at least one digit. It doesn't magically match against any real number. If you want something matching any real number, look at e.g. the Javadoc for Double.valueOf(String), which specifies a regex used in validating doubles. That regex allows hexadecimal input, NaNs, and infinities, but it should give you a better idea of what's required.)
Alternately, edit the regex so it directly matches any string containing one or more digits, e.g. .*[0-9]+.* would do the job.
If you want to match decimal numbers, your regex needs to be \d*\.?\d+. If you want negatives as well, then \-?\d*\.?\d+.
The . character is not in 0-9, and matches tests the entire string.
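The difference between matches and find, and the decimal pattern suggested above, can be checked with a short sketch like this one. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // matches(): the regex must cover the entire input.
    static boolean wholeStringIsDigits(String s) {
        return s.matches("[0-9]+");
    }

    // find(): the regex may match any substring of the input.
    static boolean containsDigits(String s) {
        return Pattern.compile("[0-9]+").matcher(s).find();
    }

    // Decimal pattern from the answer above, with an optional minus sign.
    static boolean isDecimal(String s) {
        return s.matches("-?\\d*\\.?\\d+");
    }

    public static void main(String[] args) {
        System.out.println(wholeStringIsDigits("3.5")); // false
        System.out.println(containsDigits("3.5"));      // true
        System.out.println(isDecimal("3.5"));           // true
        System.out.println(isDecimal("-0.25"));         // true
        System.out.println(isDecimal("3."));            // false: a digit must come last
    }
}
```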
