Pattern string matching between 0-1 and 4 - java

My string could be in the form like this:
"A3C10" or "A3B00" or "A3F90".
I want to return true if the string starts with "A3" (its first two characters) and ends with "0" (its last character). Is there a way to write a regex pattern for this?

You can use regex for that:
string.matches("A3.*0");
It returns true if the string begins with "A3" and ends with "0".

Following on from Andronicus' answer, .* will match any sequence, so as long as it begins with A3 and ends with 0 it will return true.
If you want to match the exact pattern of A3XX0 where X is any character, then use the below pattern.
string.matches("A3..0");
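As a rough illustration of the difference between the two patterns above, here is a small sketch. The class and method names are made up for this example; only String.matches and the two patterns come from the answers.

```java
class A3PatternDemo {
    // Any length: starts with "A3" and ends with "0".
    static boolean startsA3EndsZero(String s) {
        return s.matches("A3.*0");
    }

    // Exactly five characters: "A3", two arbitrary characters, then "0".
    static boolean isA3xx0(String s) {
        return s.matches("A3..0");
    }

    public static void main(String[] args) {
        String[] samples = {"A3C10", "A3B00", "A3F90", "A30", "A3C11", "A3C1230"};
        for (String s : samples) {
            System.out.println(s + " -> " + startsA3EndsZero(s) + ", " + isA3xx0(s));
        }
    }
}
```

Inputs such as "A30" or "A3C1230" pass the first check but not the second, which is the case to watch for if the length is supposed to be fixed.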

Related

Java Regex ? (expr){num} confusion?

I'm trying to identify strings which contain exactly one integer.
That is exactly one string of contiguous digits e.g. "1234" (no dots, no commas).
So I thought this should do it: (This is with the Java String Escapes included):
(\\d+){1,}
So the "\d+" correctly matches a string of contiguous digits. (right?)
I included this expression as a sub-expression within "(" and ")" and then I'm trying to say "only one of these sub-expressions".
Here's the result of ( matcher.find() ) of checking various strings:
(note the regex from now on is 'raw' here - NOT Java String Escaped).
Pattern:(\d+){1,}
Input String Result
1 true
XX-1234 true
do-not-match-no-integers false
do-not-match-1234-567 true
do-not-match-123-456 true
It seems the '1' in the pattern is applying to the "\d+" string, rather than the number of those contiguous strings.
Because if I change the number from 1 to 4; I can see the result change to the following:
Pattern:(\d+){4,}
Input String Result
1 false
XX-1234 true
do-not-match-no-integers false
do-not-match-1234-567 true
do-not-match-123-456 false
What am I missing here ?
Out of interest - if I take off the "(" and ")" altogether - I'm getting a different result again
Pattern:\d+{4,}
Input String Result
1 true
XX-1234 true
do-not-match-no-integers false
do-not-match-1234-567 true
do-not-match-123-456 true
Matcher.find() will try to find a match anywhere inside the String. You should try Matcher.matches() instead to see if the pattern matches the entire string.
In that case, the pattern you need is \d+
EDIT:
Seems that I misunderstood the question. One way to find if the String has only one integer, using the same pattern is:
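// matcher is a Matcher created from the pattern \d+ for the input string
int matchCounter = 0;
// Count runs of digits, stopping as soon as a second run is found.
while (matchCounter < 2 && matcher.find()) {
matchCounter++;
}
return matchCounter == 1;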
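A self-contained sketch of this counting idea might look like the following. The class name, method name and the DIGITS constant are assumptions made for illustration; the test strings are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleIntegerCheck {
    // One or more contiguous digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns true if the input contains exactly one run of contiguous digits.
    static boolean hasExactlyOneInteger(String input) {
        Matcher matcher = DIGITS.matcher(input);
        int matchCounter = 0;
        // Each find() call advances to the next run; stop once a second run is seen.
        while (matchCounter < 2 && matcher.find()) {
            matchCounter++;
        }
        return matchCounter == 1;
    }

    public static void main(String[] args) {
        String[] samples = {"1", "XX-1234", "do-not-match-no-integers", "do-not-match-1234-567"};
        for (String s : samples) {
            System.out.println(s + " -> " + hasExactlyOneInteger(s));
        }
    }
}
```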
This is the regex:
^[^\d]*\d+[^\d]*$
That's zero or more non digits, followed by a substring of digits and then zero or more non digits again until the end of the string. Here is the java code (with escaped slashes):
class MainClass {
public static void main(String[] args) {
String regex="^[^\\d]*\\d+[^\\d]*$";
System.out.println("1".matches(regex)); // true
System.out.println("XX-1234".matches(regex)); // true
System.out.println("XX-1234-YY".matches(regex)); // true
System.out.println("do-not-match-no-integers".matches(regex)); // false
System.out.println("do-not-match-1234-567".matches(regex)); // false
System.out.println("do-not-match-123-456".matches(regex)); // false
}
}
You can use the RegEx ^\D*?(\d+)\D*?$
^\D*? makes sure there are no digits between the start of your line and your first group
(\d+) matches your digits
\D*?$ makes sure there are no digits between your first group and the end of your line
So, for your Java String, it would be : ^\\D*?(\\d+)\\D*?$
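If the digits themselves are needed, the capturing group in this pattern can be read back after a successful match. The sketch below is one possible way to do that; the class and method names are hypothetical, and returning null for "no match" is an arbitrary choice for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleIntegerExtractor {
    // Non-digits, one run of digits (captured), non-digits, anchored at both ends.
    private static final Pattern ONE_INTEGER = Pattern.compile("^\\D*?(\\d+)\\D*?$");

    // Returns the single run of digits, or null if the input has none or more than one.
    static String extract(String input) {
        Matcher matcher = ONE_INTEGER.matcher(input);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("XX-1234-YY"));
        System.out.println(extract("do-not-match-1234-567"));
    }
}
```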
I think you will have to make sure your regex considers the entire string, using ^ and $.
To do that, you could match zero or more non-digits, followed by 1 or more digits, and then zero or more non-digits.
The following should do the trick:
^[^\d]*(\d+)[^\d]*$
Here it is on regex101.com: https://regex101.com/r/CG0RiL/2
Edit: As pointed out by Veselin Davidov my regex isn't correct.
If I understand you right, you want it to return true only when the entire String matches the pattern, yes?
Then you have to call matcher.matches();
Also, I think your pattern should be just \d+.
If you have problems with regex, I can recommend https://regex101.com/; it explains why a pattern matches something and gives you a quick preview.
I use it every time I have to write a regex.

Extract substring from end till first alphabet in java

I have a string of format: A-2-Q4567
More examples: AB-456-T12, A24-5-M12345, etc.
I want to extract the last numerical values out of these strings, which are: 4567, 12, 12345 respectively (which is the numerical value of the substring from the end till first non-numeric character is encountered)
I can split the string, get the last element of the resulting array, and then do a parseInt after removing the non-numerical characters from it.
But is there a more elegant way of doing this?
You can use this regex: (\d+$). It returns the last sequence of digits in the string.
EDIT - some explanation:
The \d means any digit.
The + means one or more of the previous symbols. Since the previous symbol is a digit, then \d+ means "one or more digits".
The $ means the end of the string, so \d+$ is the last sequence of digits in the string.
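Applied to the examples from the question, the \d+$ pattern could be used roughly as follows. The class and method names are illustrative only, and throwing an exception when there are no trailing digits is an assumption about the desired behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingNumberDemo {
    // One or more digits immediately before the end of the input.
    private static final Pattern TRAILING_DIGITS = Pattern.compile("\\d+$");

    // Parses the trailing run of digits; Integer.parseInt will fail on runs too long for an int.
    static int lastNumber(String input) {
        Matcher matcher = TRAILING_DIGITS.matcher(input);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No trailing digits in: " + input);
        }
        return Integer.parseInt(matcher.group());
    }

    public static void main(String[] args) {
        for (String s : new String[] {"A-2-Q4567", "AB-456-T12", "A24-5-M12345"}) {
            System.out.println(s + " -> " + lastNumber(s));
        }
    }
}
```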
You can do this:
String getLastNumeric(String input)
{
String str="";
char c;
for(int i=input.length()-1;i>=0 && Character.isDigit(c=input.charAt(i));i--)
str=c+str;
return str;
}
The regex solutions might be more elegant, but performance-wise I think the above is best, because a regex match can be more expensive than a simple for loop with a simple condition to evaluate.
Of course, the regex is more flexible: what if your requirements change and a dash "-" must now precede the numbers? With a regex it should just be a matter of changing one expression.
I put the regex version here, but remember: if you're sure your requirements won't change, I think the above solution is lighter on the CPU:
Matcher matcher= Pattern.compile("(\\d+$)").matcher(input);
if(matcher.find())
return matcher.group();
return "";

String split method returning first element as empty using regex

I'm trying to get the digits from the expression [1..1], using Java's split method. I'm using the regex ^\\[|\\.{2}|\\]$ inside split. But the split method returns a String array whose first value is empty, with "1" at index 1 and index 2. Could anyone please tell me what I'm doing wrong in this regex, so that I only get the digits in the String array returned by split?
You should use matching. Change your expression to:
`^\[(.*?)\.\.(.*)\]$`
And get your results from the two captured groups.
As for why split acts this way, it's simple: you asked it to split on the [ character, but there's still an "empty string" between the start of the string and the first [ character.
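To make the behaviour concrete, the sketch below runs both approaches on the input from the question. The class and method names are invented for this example, and returning an empty array when the input does not match is an assumption.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeSplitDemo {
    // The original approach: "[" at index 0 is a delimiter, so an empty string precedes it.
    static String[] splitOnDelimiters(String input) {
        return input.split("^\\[|\\.{2}|\\]$");
    }

    // The suggested approach: match the whole string and read the two captured groups.
    static String[] captureBounds(String input) {
        Matcher matcher = Pattern.compile("^\\[(.*?)\\.\\.(.*)\\]$").matcher(input);
        return matcher.matches()
                ? new String[] {matcher.group(1), matcher.group(2)}
                : new String[0];
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDelimiters("[1..1]")));
        System.out.println(Arrays.toString(captureBounds("[1..1]")));
    }
}
```

The trailing "]" does not produce an extra element because split drops trailing empty strings by default; no such rule applies to the leading one.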
Your regex matches [ and .. and ]. Thus it will split at these occurrences.
You should not use split, but instead match each number in your string using a regex.
You've set it up such that [, ] and .. are delimiters. Split will return an empty first element because the first character in your string [1..1] is a delimiter. I would strip the delimiters from the front and end of your string first.
So, something like
input.replaceFirst("^\\[", "").split("\\.{2}|\\]$");
Or, use regex and regex groups (such as the other answers in this question) more directly rather than through split.
Why not use a regex to capture the numbers? This will be more effective and less error-prone. In that case the regex looks like:
^\[(\d+)\.{2}(\d+)\]$
And you can capture them with:
Pattern pattern = Pattern.compile("^\\[(\\d+)\\.{2}(\\d+)\\]$");
Matcher matcher = pattern.matcher(text);
if(matcher.find()) { //we've found a match
int range_from = Integer.parseInt(matcher.group(1));
int range_to = Integer.parseInt(matcher.group(2));
}
with range_from and range_to the integers you can now work with.
The advantage is that the pattern will fail on strings that make little sense, like ..3[4, etc.

Why won't this string regex match?

I have a string and a simple pattern (a string with a wildcard). When I use the matches function, I would expect it to return true for my text, but it returns false.
String text = "test_1_2_3";
String pattern = "test_*";
text.matches(pattern);//this returns false
_* matches the character _ literally, zero or more times. Instead you need .*, which matches any character zero or more times:
"test_.*"
pattern = "test_*" means "test" and 0 or more "_"
Because your test_* pattern, combined with Matcher#matches, must match the whole input (i.e. from start to end), it only matches input that:
starts with test
is followed by (and ends with) zero or more instances of _ (greedily quantified here).
Using Matcher#find would return true in this case, since it would match a partial test_.
So, your matches invocation would return true with the given Pattern, with inputs such as:
test_
test__
... and so on.
Your regexp will match test followed by zero or more '_' character.
I think you want this:
String text = "test_1_2_3";
String pattern = "test_.*";
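The behaviour described in these answers can be checked with a short sketch like this one. The class and helper names are hypothetical; String.matches and Matcher.find are the standard library calls discussed above.

```java
import java.util.regex.Pattern;

class WildcardVsRegexDemo {
    // Whole-string match, as String.matches does.
    static boolean matchesWhole(String text, String regex) {
        return text.matches(regex);
    }

    // Partial match anywhere in the text, as Matcher.find does.
    static boolean containsMatch(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "test_1_2_3";
        System.out.println(matchesWhole(text, "test_*"));   // "_*" only allows underscores after "test"
        System.out.println(matchesWhole(text, "test_.*"));  // ".*" allows any characters after "test_"
        System.out.println(matchesWhole("test__", "test_*"));
        System.out.println(containsMatch(text, "test_*"));  // finds the prefix "test_"
    }
}
```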

Why does "3.5".matches("[0-9]+") return false?

I use the method String.matches(String regex) to find if a string matches the regex expression
From my point of view the regular expression regex="[0-9]+" means a String that contains at least one figure between 0 and 9
But when I debug "3.5".matches("[0-9]+") it returns false.
So what is wrong ?
matches determines if the regex matches the whole string. It won't return true if the string contains a match.
To test if the string contains a match to a given regex, use Pattern.compile(regex).matcher(string).find().
(Your regex, [0-9]+, will match any string that contains only digits from 0 to 9, and at least one digit. It doesn't magically match against any real number. If you want something matching any real number, look at e.g. the Javadoc for Double.valueOf(String), which specifies a regex used in validating doubles. That regex allows hexadecimal input, NaNs, and infinities, but it should give you a better idea of what's required.)
Alternately, edit the regex so it directly matches any string containing one or more digits, e.g. .*[0-9]+.* would do the job.
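A minimal sketch contrasting the options mentioned in this answer, with made-up class and method names:

```java
import java.util.regex.Pattern;

class DigitContainsDemo {
    // True only if the whole string consists of one or more digits.
    static boolean isAllDigits(String s) {
        return s.matches("[0-9]+");
    }

    // True if a run of digits occurs anywhere in the string (uses find).
    static boolean containsDigitsViaFind(String s) {
        return Pattern.compile("[0-9]+").matcher(s).find();
    }

    // Same check expressed as a whole-string pattern for use with matches.
    static boolean containsDigitsViaMatches(String s) {
        return s.matches(".*[0-9]+.*");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("3.5"));
        System.out.println(containsDigitsViaFind("3.5"));
        System.out.println(containsDigitsViaMatches("3.5"));
    }
}
```

One caveat: by default . does not match line terminators, so the matches-based variant can behave differently from find on multi-line input.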
If you want to match decimal numbers, your regex needs to be \d*\.?\d+. If you want negatives as well, then \-?\d*\.?\d+.
The . character is not in 0-9, and matches tests the entire string.
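For illustration, here is how the signed decimal pattern from this answer behaves on a few inputs. The class and method names are assumptions; note that the pattern requires at least one digit after an optional dot, so "3." is rejected.

```java
class DecimalPatternDemo {
    // Optional minus sign, optional integer part, optional dot, then at least one digit.
    private static final String DECIMAL = "\\-?\\d*\\.?\\d+";

    static boolean isDecimal(String s) {
        return s.matches(DECIMAL);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3.5", "35", ".5", "-3.5", "3.", "1.2.3", "abc"}) {
            System.out.println(s + " -> " + isDecimal(s));
        }
    }
}
```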
