String validation in java using regex


I have to validate a set of strings and then process them. The acceptable formats are:
1/2
12/1/3
1/23/333/4
The code used for validation is:
if (str.matches("(\\d+\\/|\\d+){2,4}")) {
    // do some stuff
} else {
    // do other stuff
}
But it matches any integer, with or without slashes. I want to exclude the ones without slashes. How can I match only the valid patterns?

It looks like you want to find a number (a series of one or more digits, \d+) followed by one or more /number groups. If that is the case, you can write your regex as
\\d+(/\\d+)+
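A minimal sketch of how this pattern could be used, assuming a hypothetical class SlashNumbers and method isValid (neither name is from the question). Note that this pattern puts no upper limit on the number of /number groups.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are assumptions for illustration.
class SlashNumbers {
    // A number followed by one or more "/number" groups.
    private static final Pattern VALID = Pattern.compile("\\d+(/\\d+)+");

    static boolean isValid(String s) {
        // matches() succeeds only if the whole string fits the pattern.
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1/2", "12/1/3", "1/23/333/4", "123", "1/", "/1"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first three samples are the accepted formats from the question; the last three are a plain integer and two malformed inputs, which should be rejected.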

You can try
(\d+/){1,3}\d+
That is: digits followed by /, repeated one to three times, followed by a final run of digits.
Sample code:
System.out.println("1/23/333/4".matches("(\\d+/){1,3}\\d+")); // true
System.out.println("1/2".matches("(\\d+/){1,3}\\d+")); // true
System.out.println("12/1/3".matches("(\\d+/){1,3}\\d+")); // true
Pattern explanation:
( group and capture to \1 (between 1 and 3 times):
\d+ digits (0-9) (1 or more times)
/ '/'
){1,3} end of \1
\d+ digits (0-9) (1 or more times )

\\b\\d+(/\\d+){1,3}\\b
\b is a word boundary. This matches every token that has one to three slashes, with each slash surrounded by digits and the whole token delimited by word boundaries. The quantifier is written {1,3}, without a space inside the braces.
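Word boundaries matter mainly when searching inside a longer text with find() rather than validating a whole string with matches(). Here is a small sketch under that assumption; the class SlashTokens and the method findTokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; collects every slash-separated number token in a text.
class SlashTokens {
    private static final Pattern TOKEN = Pattern.compile("\\b\\d+(/\\d+){1,3}\\b");

    static List<String> findTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {          // find() scans for the next occurrence
            tokens.add(m.group());  // group() is the text of the whole match
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The lone "7" has no slash, so it is not reported.
        System.out.println(findTokens("ids: 1/2, 12/1/3 and 7"));
    }
}
```

One caveat with find(): a token with more than three slashes, such as 1/2/3/4/5, is not rejected as a whole; the search reports its first four numbers as a match. For strict validation of a single value, matches() is the safer choice.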

Related

How to create regex expression for 3 links at once

I created a regex in Java that handles 2 links at once:
https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/test0218.pdf
https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/TestTes-09-05-2018.pdf
Regex:
String REGEX_LINK = "https:..downloads.test.test.testagain.tes.test-test.test.";
Pattern pattern = Pattern.compile(REGEX_LINK + ".[\\w*/]*.((\\d{2}-\\d{2}-)?\\d{4}).pdf");
But now I have to create one regex for 3 links at once, and I don't know how to do that. I need help with this:
https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/test0218.pdf
https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/TestTes-09-05-2018.pdf
https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/01-01-18_Testt_Testing_ASB_Test_Final.pdf
I need a single regex that extracts "0218" from the first link, "09-05-2018" from the second link, and "01-01-18" from the third link.
Does anyone have an idea how to do this?

You could match two digits, an optional hyphen, and two more digits, and then optionally a hyphen followed by either 4 or 2 digits.
Note that the pattern by itself does not verify a valid date.
(?<!\d)(\d{2}-?\d{2}(?:-(?:\d{4}|\d{2}))?)\S*\.pdf\b
Explanation
(?<!\d) Negative lookbehind, assert not a digit to the left
( Capture group 1
\d{2}-?\d{2} Match 2 digits, optional hyphen and 2 digits
(?:-(?:\d{4}|\d{2}))? Optionally match - and either 4 or 2 digits
) Close group 1
\S* Match zero or more non-whitespace chars
\.pdf\b Match a dot and pdf followed by a word boundary
Or, if no other digits may follow before the end of the string:
(?<!\d)(\d{2}-?\d{2}(?:-(?:\d{4}|\d{2}))?)[^\d\s]*\.pdf\b
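As a rough sketch, the first pattern above can be applied in Java with Matcher.find() and capture group 1. The class PdfDateExtractor and the method extract are hypothetical names; the URLs are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; pulls the date-like part out of a PDF link.
class PdfDateExtractor {
    private static final Pattern DATE = Pattern.compile(
            "(?<!\\d)(\\d{2}-?\\d{2}(?:-(?:\\d{4}|\\d{2}))?)\\S*\\.pdf\\b");

    // Returns the captured digits, or null if the link does not match.
    static String extract(String url) {
        Matcher m = DATE.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String base = "https://downloads.test.test.testagain.tes/test-test/test/"
                + "te25st24w/te43s5t25x/0twt42ts/";
        System.out.println(extract(base + "test0218.pdf"));
        System.out.println(extract(base + "TestTes-09-05-2018.pdf"));
        System.out.println(extract(base + "01-01-18_Testt_Testing_ASB_Test_Final.pdf"));
    }
}
```

The digits in the directory names (te25st24w and so on) are skipped because none of them forms two digits, an optional hyphen, and two more digits.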

A period must not appear consecutively in a String Java

I have code that checks whether the user input matches a regular expression pattern. The pattern is shown below; the problem is how to check whether the character . appears consecutively.
[a-z|A-Z|0-9|[.]{1}]+#[a-z|A-Z|0-9]+
I've tried this pattern so far.
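import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The original snippet omitted the class and method headers; EmailCheck is a placeholder name.
public class EmailCheck {
    public static void main(String[] args) {
        System.out.print("Enter your Email: ");
        String userInput = new Scanner(System.in).nextLine();
        Pattern pat = Pattern.compile("[a-z|A-Z|0-9|[.]{1}]+#[a-z|A-Z|0-9]+");
        Matcher mat = pat.matcher(userInput);
        if (mat.matches()) {
            System.out.print("Valid");
        } else {
            System.out.print("Invalid");
        }
    }
}
If the input is een..123#asd123
I expect the output to be Invalid, but if the input is een.123#asd123 the output should be Valid.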

A character class matches any one of the listed characters. A pipe | inside a character class does not mean OR; it is a literal, so the class could also match a |.
If you don't want to match consecutive dots, you could use a character class that does not contain a dot, and then use a quantifier to repeat a group that starts with a dot.
^[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*#[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*$
That will match
^ Start of string
[a-zA-Z0-9]+ Match 1+ times any character that is listed in the character class
(?:\.[a-zA-Z0-9]+)* Repeat 0+ times a group which starts with a dot and matches 1+ times what is listed in the character class to prevent consecutive dots.
# Match # char
[a-zA-Z0-9]+ Match again 1+ chars
(?:\.[a-zA-Z0-9]+)* Match again repeating group
$ End of string
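A short sketch of this pattern in Java, assuming a hypothetical class DotCheck with a method isValid. Because matches() already requires the whole string to match, the ^ and $ anchors are left out here.

```java
import java.util.regex.Pattern;

// Hypothetical helper; rejects consecutive, leading, and trailing dots.
class DotCheck {
    private static final Pattern VALID = Pattern.compile(
            "[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*#[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*");

    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("een.123#asd123"));   // single dot between runs
        System.out.println(isValid("een..123#asd123"));  // two dots in a row
    }
}
```

Since every dot must be followed by at least one letter or digit, a dot can never be directly followed by another dot, by #, or by the end of the string.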

If you don't want consecutive periods, use [a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*
Explanation
[a-zA-Z0-9]+ Match one or more letters or digits
(?: Start non-capturing group
\. Match exactly one period
[a-zA-Z0-9]+ Match one or more letters or digits
)* End optional repeating group
With this pattern, the value cannot start or end with a period, and cannot have consecutive periods.
Alternatively, use a negative lookahead: (?!.*[.]{2})[a-zA-Z0-9.]+#[a-zA-Z0-9]+
Explanation
(?!.*[.]{2}) Fail match if 2 consecutive periods are found
[a-zA-Z0-9.]+#[a-zA-Z0-9]+ Match normally
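The two approaches in this answer are not equivalent, which the following sketch tries to show. It assumes that the first fragment is completed with #[a-zA-Z0-9]+ (the answer gives only the part before the #), and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the grouping and lookahead variants.
class PeriodChecks {
    // Grouping: periods may only appear between runs of letters or digits.
    // The "#[a-zA-Z0-9]+" tail is an assumption added to complete the fragment.
    private static final Pattern GROUPED =
            Pattern.compile("[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*#[a-zA-Z0-9]+");

    // Lookahead: only rules out two periods in a row anywhere in the input.
    private static final Pattern LOOKAHEAD =
            Pattern.compile("(?!.*[.]{2})[a-zA-Z0-9.]+#[a-zA-Z0-9]+");

    static boolean grouped(String s) {
        return GROUPED.matcher(s).matches();
    }

    static boolean lookahead(String s) {
        return LOOKAHEAD.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"een.123#asd123", "een..123#asd123", ".een#asd", "een.#asd"}) {
            System.out.println(s + " grouped=" + grouped(s) + " lookahead=" + lookahead(s));
        }
    }
}
```

Both variants reject een..123#asd123, but the lookahead variant still accepts a leading or trailing period (.een#asd, een.#asd), whereas the grouping variant does not.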

Java Regex validation for list of numbers

I'm new to Java regex and am looking for advice on the following number check:
The number must have at least 10 digits, and the user is not allowed to enter any of the following:
"0000000000","1111111111","2222222222","3333333333","4444444444",
"5555555555","6666666666","7777777777","8888888888","9999999999",
"1234567890","00000000000","11111111111","22222222222","33333333333",
"44444444444","55555555555","66666666666","77777777777","88888888888",
"99999999999"
Currently my regex pattern looks like this:
^(?=\\d{8,11}$)(?:(.)\\1*)$
This handles all the numbers in the list except 1234567890. Any advice is appreciated. Thank you.

Use this:
^(?!(\d)\1+\b|1234567890)\d{10,}$
To validate in Java with matches, we don't need the anchors:
if (subjectString.matches("(?!(\\d)\\1+\\b|1234567890)\\d{10,}")) {
    // It matched!
} else {
    // nah, it didn't match...
}
Explanation
The negative lookahead (?!(\d)\1+\b|1234567890) asserts that what follows is not...
(\d)\1+\b one digit (captured to Group 1), followed by repetitions of what was matched by Group 1, then a word boundary
OR |
1234567890
\d{10,} matches ten or more digits
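A quick sketch for trying this pattern on a few inputs; the class NumberCheck and the method isValid are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper wrapping the pattern from this answer.
class NumberCheck {
    private static final Pattern VALID =
            Pattern.compile("(?!(\\d)\\1+\\b|1234567890)\\d{10,}");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "0000000000", "99999999999", "1234567890",
            "1234567891", "123456789", "12345678901"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

One thing to be aware of: the 1234567890 alternative inside the lookahead is not anchored at the end, so any longer number that merely starts with 1234567890 (for example 12345678901) is rejected as well. If that is not wanted, a word boundary after that alternative, as used for the repeated-digit alternative, is one possible adjustment.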

Number, must be >= 10 digits, user is not allowed to input as follows.
You can use the String.matches() method to check for a match.
Try the regex below, which matches the disallowed 10-digit inputs you listed (so a match means the input is not allowed). Add more alternatives as needed.
1234567890|(\d)\1{9}
Pattern explanation:
1234567890 '1234567890'
| OR
( group and capture to \1:
\d digits (0-9)
) end of \1
\1{9} what was matched by capture \1 (9 times)
Sample code:
String regex ="1234567890|(\\d)\\1{9}";
System.out.println("0000000000".matches(regex)); // true
System.out.println("1234567890".matches(regex)); // true
System.out.println("1111111111".matches(regex)); // true
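Because this regex matches the forbidden values, a validator has to combine it with a separate length check and negate it. The following is only a sketch of that idea: the class NumberBlacklist and the method isAllowed are hypothetical, and the quantifier {9,} (instead of {9}) is an assumption added so that the 11-digit repeats from the question are covered too.

```java
import java.util.regex.Pattern;

// Hypothetical validator built around a "forbidden values" pattern.
class NumberBlacklist {
    // Assumption: {9,} rather than {9}, to also catch 11-digit repeats.
    private static final Pattern FORBIDDEN =
            Pattern.compile("1234567890|(\\d)\\1{9,}");
    private static final Pattern TEN_OR_MORE_DIGITS =
            Pattern.compile("\\d{10,}");

    static boolean isAllowed(String s) {
        return TEN_OR_MORE_DIGITS.matcher(s).matches()
                && !FORBIDDEN.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1111111111", "11111111111", "1234567890", "1234567891", "123"}) {
            System.out.println(s + " -> " + isAllowed(s));
        }
    }
}
```

Keeping the length rule and the blacklist in two separate patterns makes it easy to add further forbidden values as extra alternatives.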

Can you fix this Java Regex to match currency such as -10 USD, 12.35 AUD ... (Java)?

I need to validate a currency string as follows:
1. The Currency Unit must be in Uppercase and must contain 3 characters from A to Z
2. The number can contain negative (-) or positive (+) sign.
3. The number can contain the decimal fraction, but if the number contain
the decimal fraction then the fraction must be 2 Decimal only.
4. There is no space in the number part
So see this example:
10 USD ------> match
+10 USD ------> match
-10 USD ------> match
10.23 AUD ------> match
-12.11 FRC ------> match
- 11.11 USD ------> NOT match because there is a space between the negative sign and the number
10  AUD ------> NOT match because there are 2 spaces between the number and the currency unit
135.1 AUD ------> NOT match because there is only 1 decimal in the fraction
126.33 YE ------> NOT match because the currency unit must contain 3 uppercase characters
So here is what I tried, but it failed:
if (text != null && text.matches("^[+-]\\d+[\\.\\d{2}] [A-Z]{3}$")) {
    return true;
}
The pattern "^\\d+ [A-Z]{3}$" only matches a number without any sign or decimal part.
So can you fix this Java regex to match currency strings that meet the above requirements?
Other questions on the internet do not match my requirements.

You may not be aware of the ? quantifier, which means that the element it applies to can appear zero times or once, making it optional.
So to say that the string can contain an optional - or + at the start, just add [-+]?.
To say that it can contain an optional decimal part of the form .XX, where X is a digit, just add (\\.\\d{2})?
So try "^[-+]?\\d+(\\.\\d{2})? [A-Z]{3}$"
BTW: if you are using yourString.matches(regex), you don't have to add ^ or $ to the regex. This method returns true only if the entire string matches the regex, so these metacharacters are not necessary.
BTW2: Normally you should escape - in a character class [...] because it denotes a range of characters, as in [A-Z]. Here, however, - cannot denote a range because it is at the start of the character class, so there is no first character of a range, and you don't have to escape it. The same goes when - is the last character, as in [..-]: it cannot denote a range there either, so it is a plain literal.

Try with:
text.matches("[+-]?\\d+(\\.\\d\\d)? [A-Z]{3}")
Note that since you use .matches(), the regex is automatically anchored (blame the Java API designers for that: .matches() is woefully misnamed).
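To illustrate the implicit anchoring, here is a small sketch that contrasts matches() with find() using the same pattern; the class MatchesVsFind and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo of whole-string matching versus searching.
class MatchesVsFind {
    private static final Pattern CURRENCY =
            Pattern.compile("[+-]?\\d+(\\.\\d\\d)? [A-Z]{3}");

    // True only if the entire input is a currency string.
    static boolean wholeString(String s) {
        return CURRENCY.matcher(s).matches();
    }

    // True if a currency string occurs anywhere inside the input.
    static boolean anywhere(String s) {
        return CURRENCY.matcher(s).find();
    }

    public static void main(String[] args) {
        String text = "total: 10 USD!";
        System.out.println(wholeString(text)); // extra text around the amount
        System.out.println(anywhere(text));    // the amount is found inside
    }
}
```

For validation, as in the question, the whole-string behaviour of matches() is what is wanted.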

You could start your regex with
^(\\+|\\-)?
which means that it accepts either one + sign, one - sign, or nothing at all before the digits. But that's only one of your problems.
Now the decimal point:
"3. The number can contain the decimal fraction, but if the number contain
the decimal fraction then the fraction must be 2 Decimal only."
So after the digits \\d+, the next part should be wrapped in ( )? to indicate that it is optional (meaning one time or never). So there is either exactly one dot followed by two digits, or nothing:
(\\.\\d{2})?
A regex reference and an online tester are useful for checking these. Have a look at what else you could use around the 3 letters of the currency; for example, \s can identify a whitespace character.

This will match all your cases:
^[-+]?\d+(\.\d{2})?\s[A-Z]{3}$
To use it in Java you have to escape the \:
text.matches("^[-+]?\\d+(\\.\\d{2})?\\s[A-Z]{3}$")
Your regex wasn't far from the goal, but it contains several mistakes.
The most important one is: [] denotes a character class, while () is a capturing group. So a character class like [\\.\\d{2}] matches a single character that is a ., a digit, {, or }, whereas you want to match the sequence \.\d{2}.
The other answers already covered the ? quantifier, so I won't repeat it.
On a side note: regular-expressions.info is a great source for learning these things!
Explanation of the regex used above:
^ #start of the string/line
[-+]? #optionally a - or a + (but not both; only one character)
\d+ #one or more digits
( #start of optional capturing group
\.\d{2} #the character . followed by exactly two digits (the group as a whole is optional)
)? #end of optional capturing group
\s #a whitespace character
[A-Z]{3} #three characters in the range from A-Z (no lowercase)
$ #end of the string/line
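The examples from the question can be run against this regex with a short sketch like the one below; the class CurrencyCheck and the method isValid are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper for checking the question's examples.
class CurrencyCheck {
    private static final Pattern CURRENCY =
            Pattern.compile("^[-+]?\\d+(\\.\\d{2})?\\s[A-Z]{3}$");

    static boolean isValid(String text) {
        return text != null && CURRENCY.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "10 USD", "+10 USD", "-10 USD", "10.23 AUD", "-12.11 FRC",
            "- 11.11 USD", "10  AUD", "135.1 AUD", "126.33 YE"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

The first five samples should be accepted and the last four rejected. Note that \s also accepts a tab or other whitespace character between the number and the unit; a literal space in the pattern is stricter if only a space should be allowed.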

How to build a regular expression for a string?

How can I write this as a regular expression?
"blocka#123#456"
I have used the # symbol to separate the parameters in the data.
The parameters are the block name, the start X coordinate, and the start Y coordinate.
This is the data embedded in my QR code, so when I scan the QR code I want to check whether it's the right one. For that I need a regular expression for the above syntax.
My method body:
public void Store_QR(String qr) {
    if (qr.matches(regular Expression here)) {
        CurrentLocation = qr;
    } else {
        // Break the operation
    }
}

The information you specified does not justify using a regular expression at all.
Try to phrase it in a more general way.
If you really need to scan for "blocka#123#456", then use qr.contains("blocka#123#456");

It depends on what you want to match.
Here are some regex suggestions:
^blocka#[0-9]{3}#[0-9]{3}$
^blocka#[0-9]+#[0-9]+$
^blocka(#[0-9]{3}){2}$
^blocka(#[0-9]+){2}$
^blocka(#[0-9]{3})+$
^blocka(#[0-9]+)+$
Otherwise, just use contains() or similar.

myregexp.com is nice to do some testing.
Official Java Regex Tutorial is quite ok to learn and includes most things one needs to know.
The Pattern documentation also includes fancy predefined character classes that are missing in above tutorial.
Your example does not specify anything that has to be regular. Regular expressions only make sense if there are rules for validating the input.
If it has to be exactly "blocka#123#456", then "blocka#123#456" or "^blocka#123#456$" will work as the regex. Putting ^ and $ around the regex means that it must span the input from beginning to end. This is sometimes required and usually a good idea.
If blocka is dynamic, replace it with [a-z]+ to match any sequence of lowercase letters a through z of length at least 1. block[a-z] would match blocka, blockb, etc.
[a-z]{6} would match any sequence of exactly 6 lowercase letters. [a-zA-Z] also includes uppercase letters, and \p{L} matches any letter, including non-ASCII Unicode letters (e.g. Blüc本).
# matches #. Any character without a special regex meaning matches itself; the special characters are \ ^ $ . | ? * + ( ) [ ] { }. [^#] matches every character except #.
Regarding the numbers: [0-9]+ or \d+ is a generic pattern for a run of digits. [0-9]{1,4} would match anything consisting of 1 to 4 digits, like 007, 5, or 9999. (?:0|[1-9][0-9]{0,3}), for example, only matches numbers between 0 and 9999 and does not allow leading zeros; it reads: either a single 0, or one digit 1-9 followed by 0 to 3 digits 0-9. (?:STUFF) is a non-capturing group, which does not affect the groups you can extract via Matcher#group(1..?) and is useful for logical grouping with |.
[0-9] is so common that there is a predefined class for it: \d (digit). It is written \\d inside a Java String literal since you have to escape the \.
So some of your options are
".*" which matches absolutely everything
"^[^#]+(?:#[^#]+)+$" which matches anything separated by # like "hello #world!1# -12.f #本#foo#bar"
"^blocka(#\\d+)+$" which matches blocka followed by at least one group of numbers separated by # e.g. blocka#1#12#0007#949432149#3
"^blocka#(?:[0-9]|[1-9][0-9]|[1-3][0-9]{2})#[4-9][0-9]{2}$" which will match only if it finds blocka# followed by numbers 0 - 399, followed by a # and finally numbers 400-999
"^blocka#123#456$" which matches only exactly that string.
All of these are regular expressions that match the example you gave.
But it's probably as simple as
public void Store_QR(String qr) {
    if (qr.matches("^blocka#\\d+#\\d+$")) {
        CurrentLocation = qr;
    } else {
        // Break the operation
    }
}
or
private static final Pattern QR_PATTERN = Pattern.compile("^blocka#(\\d+)#(\\d+)$");

public void Store_QR(String qr) {
    Matcher matcher = QR_PATTERN.matcher(qr);
    if (matcher.matches()) {
        // group(1) and group(2) hold the two numbers
        int number1 = Integer.parseInt(matcher.group(1));
        int number2 = Integer.parseInt(matcher.group(2));
        CurrentLocation = qr;
    } else {
        // Break the operation
    }
}
Regarding "BlockName#start_X#start_Y, any block name starting with the string "block" and followed by two integers":
I guess a good regex for that would be "^block\\w+#\\d+#\\d+$": starting with "block", then any combination of a-z, A-Z, 0-9 and _ (that's the \w), followed by #, digits, #, digits.
It would match block_#0#0, blockZ#9#9, and block_a_Unicorn666#0000#1234, but not block#1#2, because there is no name at all, and not blockName#123#abc, because it has letters instead of a number. It would also not match Block_a#123#456 because of the uppercase B.
If the name part (\\w+) is too liberal (___ and _123 would be legal names), use e.g. "^block_?[a-zA-Z]+#\\d+#\\d+$", which does not allow digits; the name may start with a single optional _, and letters must follow it. It would allow _a, a, and _ABc, but not _, _a_b, or _a9. If you want to allow digits in names, [a-zA-Z0-9] would be the character class to use.
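The difference between the liberal and the stricter name pattern can be checked with a sketch like this one; the class BlockNames and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the two block-name patterns from this answer.
class BlockNames {
    private static final Pattern LIBERAL =
            Pattern.compile("^block\\w+#\\d+#\\d+$");
    private static final Pattern STRICT =
            Pattern.compile("^block_?[a-zA-Z]+#\\d+#\\d+$");

    static boolean liberal(String qr) {
        return LIBERAL.matcher(qr).matches();
    }

    static boolean strict(String qr) {
        return STRICT.matcher(qr).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "blocka#123#456", "block_#0#0", "block#1#2",
            "block_a_b#1#2", "block_a9#1#2", "Block_a#123#456"
        };
        for (String s : samples) {
            System.out.println(s + " liberal=" + liberal(s) + " strict=" + strict(s));
        }
    }
}
```

Which of the two is appropriate depends on how block names are actually generated, which the question does not specify.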

I suggest:
[a-z]+#\d+#\d+
And if you want to capture the 3 parts:
([a-z]+)#(\d+)#(\d+)
Matcher.group(1), group(2), and group(3) then return the parts.
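A minimal sketch of capturing the three parts, assuming a hypothetical class QrParts whose parse method returns null for input that does not match:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "name#x#y" QR payloads.
class QrParts {
    private static final Pattern QR = Pattern.compile("([a-z]+)#(\\d+)#(\\d+)");

    // Returns {name, x, y} as strings, or null if the input does not match.
    static String[] parse(String qr) {
        Matcher m = QR.matcher(qr);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("blocka#123#456")));
        System.out.println(Arrays.toString(parse("Blocka#1#2"))); // uppercase B is not in [a-z]
    }
}
```

The numeric parts come back as strings; Integer.parseInt can convert them once the match has succeeded.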
