regex to match a recurring pattern - java

I am trying to write a regex for Java that will match the following string:
number,number,number (it could be this simple, or it could have a variable number of numbers, but each number has to have a comma after it; there will not be any whitespace)
Here was my attempt:
[[0-9],[0-9]]+
but it seems to match anything with a number in it

You could try something along the lines of ([0-9]+,)*[0-9]+
This will match:
Only one number, e.g.: 7
Two numbers, e.g.: 7,52
Three numbers, e.g.: 7,52,999
etc.
This will not match:
Things with spaces, e.g.: 7, 52
A list ending with a comma, e.g.: 7,52,
Many other things out of the scope of this problem.
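A minimal sketch of how this pattern might be used in Java, assuming the whole string must be a comma-separated list of numbers. The class and method names (NumberListCheck, isNumberList) are made up for illustration. Matcher.matches() only succeeds when the pattern covers the entire input, so no explicit anchors are needed here.

```java
import java.util.regex.Pattern;

class NumberListCheck {
    // Zero or more "digits + comma" groups, then a final run of digits.
    private static final Pattern NUMBER_LIST = Pattern.compile("([0-9]+,)*[0-9]+");

    // Hypothetical helper: true only if the entire input is a number list.
    static boolean isNumberList(String input) {
        return NUMBER_LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumberList("7"));        // true
        System.out.println(isNumberList("7,52,999")); // true
        System.out.println(isNumberList("7, 52"));    // false (contains a space)
        System.out.println(isNumberList("7,52,"));    // false (trailing comma)
    }
}
```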

I think this would work:
\d+,(\d+,)+
Note that, as you want, this only matches numbers that are each followed by a comma, and it needs at least two of them (e.g. 7,52,).

I guess you are starting with a String. Why don't you just use String.split(",") ?
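A rough sketch of the split approach, assuming the goal is to get the numbers out as ints. The names SplitNumbers and parse are hypothetical. Note that split(",") does no validation on its own: it silently drops trailing empty strings, so "7,52," yields the same two pieces as "7,52", and Integer.parseInt throws NumberFormatException on any piece that is not a number.

```java
import java.util.Arrays;

class SplitNumbers {
    // Hypothetical helper: split on commas and parse each piece as an int.
    // Throws NumberFormatException if a piece is not a valid integer.
    static int[] parse(String input) {
        return Arrays.stream(input.split(","))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("7,52,999"))); // [7, 52, 999]
        System.out.println("7,52,".split(",").length);          // 2
    }
}
```

If the format itself has to be checked, one option is to validate with a regex first and split afterwards.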

^ means the start of a string and $ means the end. If you don't use those, you could match something in the middle (b matches inside "abc").
The + works on the element before it. b is an element, [0-9] is an element, and so are groups (things wrapped in parentheses).
So, the regex you want matches:
The start of the string ^
a number [0-9]
any number of commas followed by numbers (,[0-9])+
the end of the string $
or, ^[0-9](,[0-9])+$
As written, [0-9] matches a single digit; use [0-9]+ in both places to allow multi-digit numbers, and (,[0-9]+)* instead of + if a lone number should also match.
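To illustrate the effect of the anchors described above, here is a small sketch using Matcher.find(), which looks for a match anywhere in the input. The class and method names are invented for the example.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean foundAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(foundAnywhere("b", "abc"));                  // true
        System.out.println(foundAnywhere("^b$", "abc"));                // false
        System.out.println(foundAnywhere("[0-9](,[0-9])+", "x1,2y"));   // true
        System.out.println(foundAnywhere("^[0-9](,[0-9])+$", "x1,2y")); // false
        System.out.println(foundAnywhere("^[0-9](,[0-9])+$", "1,2,3")); // true
    }
}
```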

Try the regex [\d,]* (written as the Java string "[\\d,]*"), for example:
Pattern p4 = Pattern.compile("[\\d,]*");
Matcher m4 = p4.matcher("12,1212,1212ad,v");
System.out.println(m4.find()); //prints true
System.out.println(m4.group());//prints 12,1212,1212
If you want to require at least one comma and two numbers, e.g. 12,1212, you can use the regex (\d+,)+\d+, written as the Java string "(\\d+,)+\\d+". This regex matches a region consisting of one or more groups of digits each followed by a comma, and then a final group of at least one digit.

Related

minimum number in a string should be 1 regex validation?

I have a String which I need to validate. It should contain only numbers separated by spaces (or just a single number), and each number must be at least 1. For example:
3 1 2
1 p 3
6 3 2
0 3 2
The first and third are valid strings and all others are not.
I came up with the regex below, but I am not sure how to check that the minimum number in the string is always 1.
str.matches("(\\d|\\s)+")
Just replace \\d with [1-9].
\\d is just a shorthand for the class [0-9].
This is a better regex though: ([1-9]\\s)*[1-9]$, as it takes care of double digit issues and won't allow space at the end.
Not everything can or should be solved with regular expressions.
You could use a simple expression like
str.matches("((\\d+)\\s)+")
or something similar to simply check that your input line contains only groups of digits, each followed by a whitespace character (note that, as written, this also requires whitespace after the last group).
If that matches, you split along the spaces and for each group of digits you turn it into a number and validate against the valid range.
I have a gut feeling that regular expressions are actually not sufficient for the kind of validation you need.
If the string should contain only numbers separated by a space (or just a single number), the minimum number is 1, and numbers can also be 10 or larger, you might use:
^[1-9]\\d*(?: [1-9]\\d*)*$
Note that if you want to match a space only, instead of using \s which matches more you could just add a space in the pattern.
Explanation
^ Assert the start of the string
[1-9]\\d* Match a number from 1 up
(?: [1-9]\\d*)* Repeat a number from 1 up with a prepended space
$ Assert end of the string
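As a quick check of the pattern explained above, the following sketch runs it against the strings from the question. The class and method names are placeholders.

```java
class PositiveNumbers {
    // Each number starts with 1-9 (so no 0 and no leading zeros) and
    // numbers are separated by exactly one space.
    static boolean isValid(String s) {
        return s.matches("^[1-9]\\d*(?: [1-9]\\d*)*$");
    }

    public static void main(String[] args) {
        System.out.println(isValid("3 1 2")); // true
        System.out.println(isValid("1 p 3")); // false
        System.out.println(isValid("6 3 2")); // true
        System.out.println(isValid("0 3 2")); // false
        System.out.println(isValid("10 20")); // true
    }
}
```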
Regex is part of the solution. But I don't think that regex alone can solve your problem.
This is my proposed solution:
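import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static boolean isValid(String str) {
    // Groups of digits separated by single spaces
    Pattern pattern = Pattern.compile("\\d+(?: \\d+)*");
    Matcher matcher = pattern.matcher(str);
    // The whole input must match, and the smallest number must be at least 1
    return matcher.matches() && Arrays.stream(matcher.group().split(" "))
            .mapToInt(Integer::parseInt)
            .min().getAsInt() >= 1;
}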
Pay attention to the matching method: matcher.matches() checks the match against the entire input. Don't use matcher.find(), because it will not reject invalid input such as "1 p 2".

What is the regex for finding if a piece of text is in a particular format

What is the regex for finding if a piece of text is in a particular format?
Format should follow:
AAAA-123 or AAAA123 (with or without the dash)
Where the first 4 characters are letters in the range A-M and the following 3 characters are numbers with a max of 299.
Example:
ABCD-299 would match
and
ABZR-301 would not match
[A-M]{4}-?[0-2][0-9]{2}
Basically:
[A-M]{4} = 4 of any letters A-M
-? = an optional dash
[0-2] = a single 0,1, or 2
[0-9]{2} = two of any digit
Limiting the first digit to 0-2 effectively limits your number to 299, and allows for 000-299.
I'm not sure if you are searching for this within a string or checking that a string equals exactly this, and that context might change how you use the above. For example, if you are testing a whole string you'll want to wrap it with ^ and $:
^[A-M]{4}-?[0-2][0-9]{2}$
^ means beginning of string
[] defines a character class, a set of characters that may match. In this case uppercase A all the way to uppercase M (a hyphen is a special character within [] and denotes a range). Note that the range follows ASCII order (http://www.asciitable.com/), so if you wrote A-z it would include the non-alphanumeric characters that sit between Z and a.
{} defines a count, in this case exactly 4. You can also define a range: {1,3} means 1 to 3 and {5,} means at least 5. For "at most 7", some flavors accept {,7}; {0,7} is the portable form (Java rejects {,7}).
? means that the character before it may or may not be there, in this case the hyphen.
$ means end of string
The ^ and $ are necessary, I think. Otherwise the regex will match part of AAAAAAAA-2342347474.
anyways, read up on regex. they can be powerful and fun. http://regexr.com/
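Since the other questions on this page use Java, here is a small Java sketch of the anchored pattern, plus the unanchored variant to show why the anchors matter. The class and method names (PartCodeCheck, isPartCode, containsPartCode) are invented for illustration.

```java
import java.util.regex.Pattern;

class PartCodeCheck {
    private static final Pattern ANCHORED = Pattern.compile("^[A-M]{4}-?[0-2][0-9]{2}$");
    private static final Pattern UNANCHORED = Pattern.compile("[A-M]{4}-?[0-2][0-9]{2}");

    // True only if the whole string has the AAAA-123 / AAAA123 shape.
    static boolean isPartCode(String s) {
        return ANCHORED.matcher(s).matches();
    }

    // True if the shape occurs anywhere inside the string.
    static boolean containsPartCode(String s) {
        return UNANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isPartCode("ABCD-299"));                  // true
        System.out.println(isPartCode("ABCD299"));                   // true
        System.out.println(isPartCode("ABZR-301"));                  // false
        System.out.println(isPartCode("AAAAAAAA-2342347474"));       // false
        System.out.println(containsPartCode("AAAAAAAA-2342347474")); // true
    }
}
```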

Regex for multiple instances of character

In Java, using a regular expression, how would I check a string to see if it had a correct amount of instances of a character.
For example take the string hello.world.hello:world:. How could this string be checked to see if it contained two instances of a . or two instances of a :?
I have tried
Pattern p = Pattern.compile("[:]{2}");
Matcher m = p.matcher("hello.world.hello:world:");
m.find();
but that failed.
Edit
First I would like to say thank you for all the answers. I noticed a lot of the answers said something along the lines of "This means: zero or more non-colons, followed by a single colon, followed by zero or more non-colons - matched exactly twice". So if you were checking for 3 : in a string such as Hello::World: how would you do it?
Well, using matches you could use:
"([^:]*:[^:]*){2}"
This means: "zero or more non-colons, followed by a single colon, followed by zero or more non-colons - matched exactly twice".
Using find is not as good, as there may be additional : and it will just ignore them.
You can use this regex based on two lookaheads assertions:
^(?=(?:[^.]*\.){2}[^.]*$)(?=(?:[^:]*:){2}[^:]*$)
(?=(?:[^.]*\.){2}[^.]*$) makes sure there are exactly 2 DOTS and (?=(?:[^:]*:){2}[^:]*$) asserts that there are exactly 2 colons in input string.
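A sketch of how the lookahead pattern above might be applied in Java. Both lookaheads are zero-width, so the pattern consumes no characters; the example therefore uses find() rather than matches(). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class DotColonCheck {
    // Two zero-width lookaheads, both evaluated from the start of the string:
    // the first requires exactly two dots, the second exactly two colons.
    private static final Pattern TWO_DOTS_TWO_COLONS = Pattern.compile(
            "^(?=(?:[^.]*\\.){2}[^.]*$)(?=(?:[^:]*:){2}[^:]*$)");

    static boolean hasTwoDotsAndTwoColons(String input) {
        return TWO_DOTS_TWO_COLONS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(hasTwoDotsAndTwoColons("hello.world.hello:world:")); // true
        System.out.println(hasTwoDotsAndTwoColons("hello.world:hello"));        // false
        System.out.println(hasTwoDotsAndTwoColons("a.b.c:d:e:"));               // false
    }
}
```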
You can determine whether the string has exactly the given number of a certain character, say ':', by attempting to match it against a pattern of this form:
^(?:[^:]*[:]){2}[^:]*$
That says exactly two non-capturing groups consisting of any number (including zero) of characters other than ':' followed by one colon, with the second group followed by any number of additional characters other than ':'.
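For the follow-up about three colons, the same pattern works with the count changed. A minimal sketch that builds the pattern for any count, assuming the character of interest is fixed to ':' (the helper name hasExactlyColons is made up):

```java
import java.util.regex.Pattern;

class ColonCount {
    // Hypothetical helper: true if the input contains exactly n colons.
    // Builds ^(?:[^:]*:){n}[^:]*$ for the requested n.
    static boolean hasExactlyColons(String input, int n) {
        Pattern p = Pattern.compile("^(?:[^:]*:){" + n + "}[^:]*$");
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasExactlyColons("hello.world.hello:world:", 2)); // true
        System.out.println(hasExactlyColons("Hello::World:", 3));            // true
        System.out.println(hasExactlyColons("Hello::World:", 2));            // false
    }
}
```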

Match regex but only replace the first section - Java

I'm trying to take a phone number which can be in the format either +44 or +4 followed by any number of digits or hyphens, and replace the +44 or +4 with +44 or +4 followed by a space.
I believe I need a lookaround to match the full number but only replace the initial prefix. What I'm trying at the moment is
^[+]\d[0-9](?:([0-9]+))?
which matches the number (without hyphens). However, I thought the lookahead would only match the number and not capture the extra digits, but it seems to capture the whole thing.
Can anyone point me in the right direction as to what I've done wrong?
EDIT:
To be clearer my Java code is
Pattern pattern = Pattern.compile("^[+]\\d[0-9](?:([0-9]+))?");
if(pattern.matcher("+441234567890").matches())
String num = pattern.matcher(title).replaceFirst("$0 $1");
Thanks.
If you want to match the whole number but replace only part of it, you should not use a positive lookahead, just grouping, as in:
(^\+\d\d)([\d-]+)?
prefix will be in group 1, and the rest of number in group 2, so to add a space between these parts, just use something like group1 + space + group2.
In your example it should look like this:
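Pattern pattern = Pattern.compile("(^\\+\\d\\d)([\\d-]+)?");
String num = title; // title holds the phone number, e.g. "+441234567890"
if (pattern.matcher(title).matches()) {
    // $1 is the prefix, $2 is the rest of the number
    num = pattern.matcher(title).replaceFirst("$1 $2");
}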
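However, this regex will always capture two digits in the prefix. If you want to match +44 or +4 you should use:
(^\+(44|4))([\d-]+)?
Here the inner (44|4) becomes group 2, so the rest of the number moves to group 3 and the replacement should be "$1 $3". If you have more possible prefixes, you need to extend this regex as well.
Your regex didn't work as you expected because (?:([0-9]+))? is a non-capturing group, not a lookahead, so the digits it matches are still consumed as part of the whole match. $0 therefore returns the entire number, and $1 returns the digits captured by the inner ([0-9]+), so "$0 $1" appends those digits after the full number instead of inserting a space after the prefix.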
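A self-contained sketch of the grouping approach, under the assumption that only the prefixes +44 and +4 need handling. It uses a non-capturing (?:44|4) so that the rest of the number stays in group 2; the class and method names are invented. Because 44 is tried before 4, any number starting with +44 is treated as having the +44 prefix.

```java
import java.util.regex.Pattern;

class PhonePrefix {
    // Group 1: "+44" or "+4"; group 2: the remaining digits and hyphens.
    private static final Pattern PHONE = Pattern.compile("^(\\+(?:44|4))([\\d-]*)$");

    // Hypothetical helper: inserts a space after the prefix.
    // Input that does not match the pattern is returned unchanged.
    static String addSpace(String number) {
        return PHONE.matcher(number).replaceFirst("$1 $2");
    }

    public static void main(String[] args) {
        System.out.println(addSpace("+441234567890")); // +44 1234567890
        System.out.println(addSpace("+4123-456"));     // +4 123-456
        System.out.println(addSpace("0123"));          // 0123
    }
}
```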

validate string in java

I have a string with data separated by commas like this:
$d4kjvdf,78953626,10.0,103007,0,132103.8945F,
I tried the following regex but it doesn't match the strings I want:
[a-zA-Z0-9]+\\,[a-zA-Z0-9]+\\,[a-zA-Z0-9]+\\,[a-zA-Z0-9]+\\,[a-zA-Z0-9]+\\,[a-zA-Z0-9]+\\,
The $ at the beginning of your data string is not matching the regex. Change the first character class to [$a-zA-Z0-9]. And a couple of the comma separated values contain a literal dot. [$.a-zA-Z0-9] would cover both cases. Also, it's probably a good idea to anchor the regex at the start and end by adding ^ and $ to the beginning and end of the regex respectively. How about this for the full regex:
^[$.a-zA-Z0-9]+\\,[$.a-zA-Z0-9]+\\,[$.a-zA-Z0-9]+\\,[$.a-zA-Z0-9]+\\,[$.a-zA-Z0-9]+\\,[$.a-zA-Z0-9]+\\,$
Update:
You said number of commas is your primary matching criteria. If there should be 6 commas, this would work:
^([^,]+,){6}$
That means: match at least 1 character that is anything but a comma, followed by a comma. And perform the aforementioned match 6 times consecutively. Note: your data must end with a trailing comma as is consistent with your sample data.
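A short sketch of this six-field check against the sample data, assuming every field must be non-empty and the line must end with a comma. The names SixFields and hasSixFields are placeholders.

```java
class SixFields {
    // Six repetitions of "one or more non-comma characters, then a comma".
    static boolean hasSixFields(String line) {
        return line.matches("^([^,]+,){6}$");
    }

    public static void main(String[] args) {
        // Sample line from the question: six fields, trailing comma
        System.out.println(hasSixFields("$d4kjvdf,78953626,10.0,103007,0,132103.8945F,")); // true
        // Same line without the trailing comma
        System.out.println(hasSixFields("$d4kjvdf,78953626,10.0,103007,0,132103.8945F"));  // false
        // Empty second field
        System.out.println(hasSixFields("a,,c,d,e,f,"));                                   // false
    }
}
```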
Well, your regular expression is certainly garbled: there are clearly characters (like $ and .) that your expression won't match, and you don't need to \\ escape commas. Let's first describe the requirements. You seem to be saying a valid string is defined as:
A string consisting of 6 commas, with one or more characters before each one
We can represent that with the following pattern:
(?:[^,]+,){6}
This says: match one or more non-commas followed by a comma ([^,]+,), six times ({6}). The (?:...) notation is a non-capturing group, which lets us match the whole sub-expression six times; without it, the {6} would only apply to the preceding character.
Alternately, we could use normal, capturing groups to let us select each individual section of the matching string:
([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),?
Now we can not only match the string, but extract its contents at the same time, e.g.:
String str = "$d4kjvdf,78953626,10.0,103007,0,132103.8945F,";
Pattern regex = Pattern.compile(
"([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),?");
Matcher m = regex.matcher(str);
if(m.matches()) {
for (int i = 1; i <= m.groupCount(); i++) {
System.out.println(m.group(i));
}
}
This prints:
$d4kjvdf
78953626
10.0
103007
0
132103.8945F
