regex for optional characters

regex for optional characters - java

I am using the following regex:
^([W|w][P|p]|[0-9]){8}$
The above regex also accepts wp1234567 (wp + 7 digits). Expected: WP + 6 digits, wp + 6 digits, or exactly 8 digits.
For example:
WP123456
wp126456
64535353

Note that [W|w] matches W, w and |, since | inside a character class loses its special meaning as an alternation operator. Also, by putting the group (...) around [W|w][P|p]|[0-9] and quantifying it with {8}, you match 8 occurrences of the whole alternation, where each occurrence is either a WP pair or a single digit.
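Both points can be checked quickly. The following is only a minimal sketch; the class name CharClassPipeDemo and the helper originalAccepts are made up for illustration, and the pattern is the one from the question, unchanged:

```java
import java.util.regex.Pattern;

class CharClassPipeDemo {
    // The original pattern from the question, kept as-is.
    static final Pattern ORIGINAL = Pattern.compile("^([W|w][P|p]|[0-9]){8}$");

    static boolean originalAccepts(String s) {
        return ORIGINAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Inside a character class, | is an ordinary character.
        System.out.println("|".matches("[W|w]"));         // true
        // "wp" is one repetition of the group and each digit is one more: 1 + 7 = 8.
        System.out.println(originalAccepts("wp1234567")); // true
        // "WP" plus 6 digits is only 7 repetitions, so the wanted input is rejected.
        System.out.println(originalAccepts("WP123456"));  // false
    }
}
```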
You should set the correct value in the limited quantifier and remove grouping and use alternation to allow either wp+6 digits or just 8 digits:
^(?:[Ww][Pp][0-9]{6}|[0-9]{8})$
The regex matches:
^ - start of string (not necessary if you check the whole string with String#matches())
(?:[Ww][Pp][0-9]{6}|[0-9]{8}) - 2 alternatives:
[Ww][Pp][0-9]{6} - W or w followed by P or p followed by 6 digits
| - or...
[0-9]{8} - exactly 8 digits
$ - end of string
Other scenarios (just in case):
If you need to match strings consisting of 7 or 8 digits, you need to replace {8} limited quantifier with {7,8}:
^(?:[Ww][Pp][0-9]{6}|[0-9]{7,8})$
And in case you do not want to match Wp123456 or wP123456, use one more alternation in the beginning:
^(?:(?:WP|wp)[0-9]{6}|[0-9]{8})$
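As a rough sketch of how the first pattern above could be used in Java (the class name WpCodeValidator and the method isValid are hypothetical, not part of the original answer):

```java
import java.util.regex.Pattern;

class WpCodeValidator {
    // Either W/w + P/p + 6 digits, or exactly 8 digits.
    static final Pattern CODE = Pattern.compile("^(?:[Ww][Pp][0-9]{6}|[0-9]{8})$");

    static boolean isValid(String s) {
        return s != null && CODE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"WP123456", "wp126456", "64535353", "wp1234567", "1234567"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```

The first three samples are accepted and the last two are rejected. Mixed-case prefixes such as Wp123456 are also accepted by this variant; the last pattern above, with (?:WP|wp), excludes them.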

Related

Java Regex Troubles

I have a string that needs to be extracted using regex. Preferably only a single regex is used, since it runs in a loop with 9 pre-existing regexes (i.e., so I can just add it to the ArrayList of available regexes).
The pattern of strings will always be
Between {4,8} A-Z0-9. Followed by either,
[A-Z]{1} or [A-Z0-9]{2} or, another [A-Z0-9]{4,8}
For example:
“A1B1C1 ABCD E FGHI JK X0Y0Z0”
I’d want this to return four matches.
A1B1C1 & ABCD E & FGHI JK & X0Y0Z0
I've been trying to match the first part of {4,8} characters, followed by a non-greedy match for {1,2}. For example:
[A-Z0-9]{4,8}(\\s{1}[A-Z0-9]{1,2})*? && [A-Z0-9]{4,8}(\\s{1}[A-Z]{1}|\\s{1}[A-Z0-9]{2})*?
But this never returns more than the first {4,8} characters.

You could use an optional part with a word boundary and an alternation to match either [A-Z0-9]{2} or [A-Z]
\b[A-Z0-9]{4,8}(?:\h+(?:[A-Z0-9]{2}|[A-Z]))?\b
\b Word boundary
[A-Z0-9]{4,8} Match 4 - 8 times A-Z0-9
(?: Non capture group
\h+ Match 1+ horizontal whitespace chars
(?:[A-Z0-9]{2}|[A-Z]) Match either 2 x A-Z0-9 or 1 x A-Z
)? Close non capture group and make it optional
\b Word boundary
In Java
String regex = "\\b[A-Z0-9]{4,8}(?:\\h+(?:[A-Z0-9]{2}|[A-Z]))?\\b";
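A minimal sketch of collecting all matches with this pattern follows. The class name TokenExtractor and the method extract are illustrative only, and \h needs Java 8 or later:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenExtractor {
    static final Pattern TOKEN =
            Pattern.compile("\\b[A-Z0-9]{4,8}(?:\\h+(?:[A-Z0-9]{2}|[A-Z]))?\\b");

    // Returns every non-overlapping match, in order of appearance.
    static List<String> extract(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints [A1B1C1, ABCD E, FGHI JK, X0Y0Z0]
        System.out.println(extract("A1B1C1 ABCD E FGHI JK X0Y0Z0"));
    }
}
```

The trailing \b is what stops A1B1C1 from swallowing the first one or two letters of ABCD: after "AB" or "A" there is no word boundary, so the optional part is dropped for that token.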

Java - Regex - Allow 0-9, periods, hypen

I can't build the right regex.
Valid:
1.1.1
1.1-1
1-1.1
1-1-1
1-1
1.1
Invalid:
1..1
1.
1--1
1-
So far I got
^[0-9]+[0-9.-][0-9]+$
Thanks for your help.

The ^[0-9]+[0-9.-][0-9]+$ pattern matches only strings of this shape: 1 or more digits ([0-9]+), then a single digit, . or - ([0-9.-]), then 1 or more digits ([0-9]+). It therefore allows at most one . or - in the whole string, so valid inputs with two separators, such as 1.1.1 or 1-1-1, are rejected.
You may use
^[0-9]+(?:[.-][0-9]+)*$
If you use it in the .matches() method, the ^ and $ anchors can be omitted.
Details:
^ - start of string
[0-9]+ - 1 or more digits (the + quantifier matches 1 or more occurrences; if you only need a single digit, remove the +)
(?:[.-][0-9]+)* - zero or more consecutive sequences of
[.-] - a . or -
[0-9]+ - 1 or more digits (the same quantifier note as above applies)
$ - end of string.
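The pattern can be checked against the lists from the question with a small sketch like the one below (the class name SeparatedDigitsValidator and the method isValid are assumptions made for illustration):

```java
import java.util.regex.Pattern;

class SeparatedDigitsValidator {
    // Digits, then zero or more groups of (one separator followed by digits).
    static final Pattern VALID = Pattern.compile("^[0-9]+(?:[.-][0-9]+)*$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] expectedValid = {"1.1.1", "1.1-1", "1-1.1", "1-1-1", "1-1", "1.1"};
        String[] expectedInvalid = {"1..1", "1.", "1--1", "1-"};
        for (String s : expectedValid) {
            System.out.println(s + " -> " + isValid(s));   // true for each
        }
        for (String s : expectedInvalid) {
            System.out.println(s + " -> " + isValid(s));   // false for each
        }
    }
}
```

Note that a lone run of digits such as 1 or 42 is also accepted, because the separator group may occur zero times.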

This here should do:
^[0-9]([.-][0-9])*$
One digit, followed by zero or more occurrences of (dot/minus digit)

Slight variation on other answers.
You did not indicate the case of a lone digit without a period or hyphen:
1. (invalid)
1- (invalid)
1 (I have assumed this case is invalid)
Also this regex only allows single digits (e.g. 2.2.2, not 22.22.22)
^\d([.-]\d)+$

Both
^[0-9]([.-][0-9])*$
and
^[0-9]+(?:[.-][0-9]+)*$
work. Thanks.

Java RegEx combine patterns in any form

I'm trying to match some legal document links. I've gone far enough, but I think I'm missing something. This is my work so far:
(\d( )?)?(([[a-zA-Z]\.])+?) ([0-9]+?)\b:([0-9]+?)?\b
I have a base construction which I can match:
? = optional
number/space?/string/space/number/:/number
But now I want to optionally match any combination of the following:
-/number
,/space/number
,/space/number/-/number
This is my best match:
(\d( )?)?(([[a-zA-Z]\.])+?) ([0-9]+?)\b:([0-9]+?)(, [0-9]+?)?(-[0-9]+?)?(, ([0-9]+?)-([0-9]+?)?)?\b
I can match this:
8 Law 84:145, 252-320
But not this:
8 Law 84:145, 252-320, 458, 517-665

You may use
(\d+)\s*([a-zA-Z]+)\s+(\d+):(\d+)((?:-\d+|,\s\d+(?:-\d+)?)*)
The main part I added is ((?:-\d+|,\s\d+(?:-\d+)?)*) that matches and captures into a group 0 or more sequences of:
-\d+ - a hyphen and 1+ digits
| - or
,\s\d+(?:-\d+)? - comma, whitespace, 1+ digits, and then an optional sequence of - and 1+ digits.
Do not forget to double backslashes in the Java string literal inside the code.
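With the backslashes doubled, the pattern could be used roughly as follows. This is a sketch only: the class name LegalReferenceParser and the method parse are hypothetical, and find() is used on the assumption that the reference may sit inside longer text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LegalReferenceParser {
    static final Pattern REFERENCE = Pattern.compile(
            "(\\d+)\\s*([a-zA-Z]+)\\s+(\\d+):(\\d+)((?:-\\d+|,\\s\\d+(?:-\\d+)?)*)");

    // Returns capturing groups 1..5 of the first match, or null if nothing matches.
    static String[] parse(String text) {
        Matcher m = REFERENCE.matcher(text);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] parts = parse("8 Law 84:145, 252-320, 458, 517-665");
        // Group 5 holds the whole optional tail: ", 252-320, 458, 517-665"
        System.out.println(String.join(" | ", parts));
    }
}
```

Because the tail is captured as one group, splitting it into individual numbers and ranges would need a second pass over group 5.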

Regular Expression that matches number with max 2 decimal places

I'm writing a simple code in java/android.
I want to create regex that matches:
0
123
123,1
123,44
and cut off everything after the second digit after the comma.
My first idea is to do something like that:
^\d+(?(?=\,{1}$)|\,\d{1,2})
^ - from begin
\d+ match all digits
?=\,{1}$ and if you get comma at the end
do nothing
else grab two more digits after comma
but it doesn't match numbers without comma; and I don't understand what is wrong with the regex.

You may use
^(\d+(?:,\d{1,2})?).*
and replace with $1.
Details:
^ - start of string
(\d+(?:,\d{1,2})?) - Capturing group 1 matching:
\d+ - one or more digits
(?:,\d{1,2})? - an optional sequence of:
, - a comma
\d{1,2} - 1 or 2 digits
.* - the rest of the line that is matched and not captured, and thus will be removed.
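In Java the replacement could look like the sketch below. The class name DecimalTrimmer and the method trim are invented for illustration, and single-line input is assumed, since . does not match line terminators by default.

```java
class DecimalTrimmer {
    // Keeps the digits plus at most two digits after the comma; drops the rest.
    static String trim(String input) {
        return input.replaceFirst("^(\\d+(?:,\\d{1,2})?).*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(trim("0"));        // 0
        System.out.println(trim("123,1"));    // 123,1
        System.out.println(trim("123,456"));  // 123,45
        System.out.println(trim("123,"));     // 123
    }
}
```

Input that does not start with a digit is returned unchanged, because replaceFirst only replaces when the pattern matches.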

basic regex : [0-9]+[, ]*[0-9]+
In case you want to specify min max length use:
[0-9]{1,3}[, ]*[0-9]{0,2}

Here:
,{1}
says: exactly ONE ","
Try:
,{0,1}
for example.

Can you fix this Java Regex to match currency such as -10 USD, 12.35 AUD ... (Java)?

I need to validate the Currency String as follows:
1. The Currency Unit must be in Uppercase and must contain 3 characters from A to Z
2. The number can contain negative (-) or positive (+) sign.
3. The number can contain a decimal fraction, but if it does, the fraction must have exactly 2 decimal digits.
4. There is no space in the number part
So see this example:
10 USD ------> match
+10 USD ------> match
-10 USD ------> match
10.23 AUD ------> match
-12.11 FRC ------> match
- 11.11 USD ------> NOT match because there is space between negative sign and the number
10  AUD ------> NOT match because there are 2 spaces between the number and the currency unit
135.1 AUD ------> NOT match because there is only 1 Decimal in the fraction
126.33 YE ------> NOT match because the currency unit must contain 3 Uppercase characters
So here is what I tried but failed
if(text != null && text.matches("^[+-]\\d+[\\.\\d{2}] [A-Z]{3}$")){
return true;
}
The "^\\d+ [A-Z]{3}$" pattern only matches a number without any sign or decimal part.
So can you fix this Java regex to match currency strings that meet the above requirements?
Other questions on the internet do not match my requirements.

It seems you don't know about the ? quantifier, which means that the element it applies to can appear zero times or once, making it optional.
So to say that the string can contain an optional - or + at the start, just add [-+]?.
To say that it can contain an optional decimal part in the form .XX, where X is a digit, just add (\\.\\d{2})?
So try with "^[-+]?\\d+(\\.\\d{2})? [A-Z]{3}$"
BTW, if you are using yourString.matches(regex), you don't have to add ^ or $ to the regex. This method returns true only if the entire string matches the regex, so these metacharacters are not necessary.
BTW2, normally you should escape - in a character class [...] because it denotes a range of characters, as in [A-Z]. In this case, however, - cannot form a range because it is at the start of the character class, so there is no "first" range character and you don't have to escape it. The same goes if - is the last character, as in [..-]: there it also can't denote a range, so it is a simple literal.

Try with:
text.matches("[+-]?\\d+(\\.\\d\\d)? [A-Z]{3}")
Note that since you use .matches(), the regex is automatically anchored (blame the Java API designers for that: .matches() is woefully misnamed).
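The difference between whole-string matching and searching can be seen in a short sketch (the class name MatchesVsFindDemo and its two helper methods are illustrative, not from the answer):

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    // Same pattern as above, deliberately without ^ and $.
    static final Pattern CURRENCY = Pattern.compile("[+-]?\\d+(\\.\\d\\d)? [A-Z]{3}");

    // True only if the entire input matches the pattern.
    static boolean wholeString(String s) {
        return CURRENCY.matcher(s).matches();
    }

    // True if any substring of the input matches the pattern.
    static boolean anywhere(String s) {
        return CURRENCY.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeString("10 USD"));    // true
        System.out.println(wholeString("135.1 AUD")); // false
        // find() succeeds here because the substring "1 AUD" matches on its own.
        System.out.println(anywhere("135.1 AUD"));    // true
    }
}
```

So for validation with find(), the ^ and $ anchors would be needed; with matches() they are redundant.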

you could start your regex with
^(\\+|\\-)?
Which means that it will accept either one + sign, one - sign or nothing at all before the digit. But that's only one of your problems.
Now the decimal point:
"3. The number can contain the decimal fraction, but if the number contain the decimal fraction then the fraction must be 2 Decimal only."
So after the digits \\d+, the next part should be in ( )? to indicate that it is optional (meaning once or not at all). So either there is exactly one dot followed by two digits, or nothing:
(\\.\\d{2})?
Here you can find a reference for regex and test them. Just have a look at what else you could use to identify the 3 Letters for the currency. E.g. the \s could help you to identify a whitespace

This will match all your cases:
^[-+]?\d+(\.\d{2})?\s[A-Z]{3}$
To use it in Java you have to escape the \:
text.matches("^[-+]?\\d+(\\.\\d{2})?\\s[A-Z]{3}$")
Your regex wasn't far from the goal, but it contains several mistakes.
The most important one is: [] denotes a character class, while () is a capturing group. So when you specify a character class like [\\.\\d{2}], it matches a single character that is either a ., a digit, {, or } (the 2 is already covered by \d), while you want to match the sequence \.\d{2}.
The other answers already taught you the ? quantifier, so I won't repeat this.
On a sidenote: regular-expressions.info is a great source to learn these things!
Explanation of the regex used above:
^ #start of the string/line
[-+]? #optionally a - or a + (but not both; only one character)
\d+ #one or more digits
( #start of optional capturing group
\.\d{2} #the character . followed by exactly two digits (the whole group is optional)
)? #end of optional capturing group
\s #a whitespace
[A-Z]{3} #three characters in the range from A-Z (no lowercase)
$ #end of the string/line
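Put together, the check against the examples from the question might look like this sketch (the class name CurrencyValidator and the method isValid are hypothetical; the pattern is the one shown above):

```java
import java.util.regex.Pattern;

class CurrencyValidator {
    // Optional sign, digits, optional ".dd", one whitespace, three uppercase letters.
    static final Pattern CURRENCY = Pattern.compile("^[-+]?\\d+(\\.\\d{2})?\\s[A-Z]{3}$");

    static boolean isValid(String text) {
        return text != null && CURRENCY.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "10 USD", "+10 USD", "-10 USD", "10.23 AUD", "-12.11 FRC", // expected to match
            "- 11.11 USD", "10  AUD", "135.1 AUD", "126.33 YE"         // expected not to match
        };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + isValid(sample));
        }
    }
}
```

One design note: \s accepts any single whitespace character, including a tab. If only a plain space should be allowed, a literal space in the pattern, as in the earlier answers, is the stricter choice.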

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.
