Regular expression 30 numbers + space + hyphen + space + 1 number - java

I have this code to find this pattern, 201409250200131738007947036000 - 1, inside a text:
final String patternStr = "(\\d{30} - \\d{1})";
final Pattern p = Pattern.compile(patternStr);
final Matcher m = p.matcher(page);
if (m.matches()) {
System.out.println("SUCCESS");
}
But for some strange reason it doesn't work in Java. Can somebody help me find the error, please?

The reason is that the matches method checks whether the entire given string matches the regex.
For example, if your string is 123456123412345612341234561234 - 8, it will match; if it is my number 123456123412345612341234561234 - 8 is inside other text, it won't.
Use the find method to accomplish your task:
if (m.find()) {
System.out.println("SUCCESS");
}
It will search inside the given string instead of attempting to match the entire string.

From the documentation for Matcher, matches:
Attempts to match the entire region against the pattern.
As opposed to find which:
Attempts to find the next subsequence of the input sequence that matches the pattern.
So use matches to match an entire String against a pattern, use find to locate a pattern inside a String.
Try:
final String patternStr = "\\d{30}+\\s-\\s\\d";
final Pattern p = Pattern.compile(patternStr);
final Matcher m = p.matcher(page);
while (m.find()) {
System.out.printf("FOUND A MATCH: %s%n", m.group());
}
I edited your pattern slightly to make it more robust. This will print each match that it finds.
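To make the difference concrete, here is a minimal sketch that runs the question's pattern through matches(), lookingAt() and find(). The class name MatchModes, its helper methods and the sample strings are made up for illustration; only the Pattern and Matcher calls come from the JDK.

```java
import java.util.regex.Pattern;

class MatchModes {
    // Same shape as the pattern in the question: 30 digits, " - ", one digit.
    static final Pattern CODE = Pattern.compile("\\d{30} - \\d");

    // True only if the whole input is the code and nothing else.
    static boolean isExactlyCode(String input) {
        return CODE.matcher(input).matches();
    }

    // True if the input starts with the code (trailing text is allowed).
    static boolean startsWithCode(String input) {
        return CODE.matcher(input).lookingAt();
    }

    // True if the code occurs anywhere in the input.
    static boolean containsCode(String input) {
        return CODE.matcher(input).find();
    }

    public static void main(String[] args) {
        String bare = "201409250200131738007947036000 - 1";
        String embedded = "ref: " + bare + " (paid)";
        System.out.println(isExactlyCode(bare));      // true
        System.out.println(isExactlyCode(embedded));  // false
        System.out.println(startsWithCode(embedded)); // false
        System.out.println(containsCode(embedded));   // true
    }
}
```

So matches() is the right call for validating a whole field, and find() for scanning a page of text.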

Related

Find ALL matches of a regex pattern in Java - even overlapping ones [duplicate]

I have a String of the form:
1,2,3,4,5,6,7,8,...
I am trying to find all substrings in this string that contain exactly 4 digits. For this I have the regex [0-9],[0-9],[0-9],[0-9]. Unfortunately when I try to match the regex against my String, I never obtain all the substrings, only a part of all the possible substrings. For instance, in the example above I would only get:
1,2,3,4
5,6,7,8
although I expect to get:
1,2,3,4
2,3,4,5
3,4,5,6
...
How would I go about finding all matches corresponding to my regex?
For info, I am using Pattern and Matcher to find the matches:
Pattern pattern = Pattern.compile("[0-9],[0-9],[0-9],[0-9]");
Matcher matcher = pattern.matcher(myString);
List<String> matches = new ArrayList<String>();
while (matcher.find())
{
matches.add(matcher.group());
}
By default, successive calls to Matcher.find() start at the end of the previous match.
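To search from a specific location, pass a start index to find(int): one character past the start of the previous match.
In your case, probably something like the following (only after a first successful find(), because start() throws IllegalStateException before any match):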
while (matcher.find(matcher.start()+1))
This works fine:
Pattern p = Pattern.compile("[0-9],[0-9],[0-9],[0-9]");
public void test(String[] args) throws Exception {
String test = "0,1,2,3,4,5,6,7,8,9";
Matcher m = p.matcher(test);
if(m.find()) {
do {
System.out.println(m.group());
} while(m.find(m.start()+1));
}
}
printing
0,1,2,3
1,2,3,4
...
If you are looking for a pure regex based solution then you may use this lookahead based regex for overlapping matches:
(?=((?:[0-9],){3}[0-9]))
Note that your matches are available in captured group #1
Code:
final String regex = "(?=((?:[0-9],){3}[0-9]))";
final String string = "0,1,2,3,4,5,6,7,8,9";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
output:
0,1,2,3
1,2,3,4
2,3,4,5
3,4,5,6
4,5,6,7
5,6,7,8
6,7,8,9
Some sample code without regex (a regex does not seem necessary to me here, and I would also expect it to be slower in this case). As written, it only works as long as each item is exactly one character long.
String s = "a,b,c,d,e,f,g,h";
// Each window "x,x,x,x" is 7 characters long; step by 2 to skip one item and its comma.
for (int i = 0; i + 7 <= s.length(); i += 2) {
    System.out.println(s.substring(i, i + 7));
}
Output for this string:
a,b,c,d
b,c,d,e
c,d,e,f
d,e,f,g
e,f,g,h
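If the items can be longer than one character, a variant that splits on the comma first avoids the fixed offsets. This is only a sketch: the class SlidingWindows and the method windows are names I made up, and it assumes the items themselves never contain a comma.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SlidingWindows {
    // Returns every run of `size` consecutive comma-separated items.
    static List<String> windows(String csv, int size) {
        String[] items = csv.split(",");
        List<String> result = new ArrayList<>();
        // Stop as soon as fewer than `size` items remain.
        for (int i = 0; i + size <= items.length; i++) {
            result.add(String.join(",", Arrays.copyOfRange(items, i, i + size)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(windows("10,2,300,4,5", 4)); // [10,2,300,4, 2,300,4,5]
    }
}
```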
As @OldCurmudgeon pointed out, find() by default starts looking from the end of the previous match. To position it right after the first matched element, make the first matched element a capturing group and use its end index:
Pattern pattern = Pattern.compile("(\\d,)\\d,\\d,\\d");
Matcher matcher = pattern.matcher("1,2,3,4,5,6,7,8,9");
List<String> matches = new ArrayList<>();
int start = 0;
while (matcher.find(start)) {
start = matcher.end(1);
matches.add(matcher.group());
}
System.out.println(matches);
results in
[1,2,3,4, 2,3,4,5, 3,4,5,6, 4,5,6,7, 5,6,7,8, 6,7,8,9]
This approach also works if each matched element is longer than one digit.
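As a sketch of the multi-digit case, the same loop with \d+ in place of \d could look like this. The class and method names (OverlappingMatches, findAll) and the sample input are illustrative, not from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlappingMatches {
    // Group 1 captures the first number plus its comma, so end(1) is
    // the index at which the second number of the match begins.
    static final Pattern FOUR = Pattern.compile("(\\d+,)\\d+,\\d+,\\d+");

    static List<String> findAll(String input) {
        Matcher matcher = FOUR.matcher(input);
        List<String> matches = new ArrayList<>();
        int start = 0;
        // find(int) resets the matcher and searches from the given index.
        while (matcher.find(start)) {
            matches.add(matcher.group());
            start = matcher.end(1);
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [10,20,30,40, 20,30,40,50, 30,40,50,60]
        System.out.println(findAll("10,20,30,40,50,60"));
    }
}
```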

Text wrongly matches a substring of the words in a group

I want to check whether the text starts with what or who and is a question, so I wrote the following code:
private static void startWithQOrIf(String commentstr){
String urlPattern = "(|who|what).*\\?.*$";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.find()) {
System.out.println("yes");
}
}
Everything works well, but, for example, when I try:
whooooooooo is the follower?
it matches as well, but it should not, because I am looking for who, not whooooooooo.
Any ideas?
You can ensure a whole word using a word boundary, \b:
(|who|what)\\b.*\\?.*$
           ^^^
If the words in the alternation group are supposed to appear at the start of the string, you can just use matches and remove the $ anchor:
String urlPattern = "(|who|what)\\b.*\\?.*";
Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.matches()) { // <- here, matches() is used
    System.out.println("yes");
}
Note that (|who|what) matches either an empty string, or who, or what. With the empty alternative, a string such as whooooooooo is the follower? still matches, because the group can match nothing at the start of the string. If you do not plan to allow an empty string, use just (who|what).
You must use word boundaries.
String urlPattern = "\\b(who|what)\\b.*\\?.*$";
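Here is a small self-contained check of the word-boundary version without the empty alternative, using matches() so the word has to be at the very start. QuestionCheck and isWhQuestion are hypothetical names chosen for this sketch.

```java
import java.util.regex.Pattern;

class QuestionCheck {
    // who/what as a whole word at the start, then a '?' somewhere later.
    private static final Pattern WH_QUESTION =
            Pattern.compile("(who|what)\\b.*\\?.*", Pattern.CASE_INSENSITIVE);

    static boolean isWhQuestion(String comment) {
        // matches() anchors the pattern at both ends of the input.
        return WH_QUESTION.matcher(comment).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWhQuestion("Who is the follower?"));         // true
        System.out.println(isWhQuestion("whooooooooo is the follower?")); // false
        System.out.println(isWhQuestion("Tell me who is it?"));           // false
        System.out.println(isWhQuestion("What a day"));                   // false
    }
}
```

One caveat: without Pattern.DOTALL, the dot does not match line terminators, so a comment that spans several lines would not pass this check.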

Java: Need to extract a number from a string

I have a string containing a number. Something like "Incident #492 - The Title Description".
I need to extract the number from this string.
I tried:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(theString);
String substring = m.group();
but I am getting an error:
java.lang.IllegalStateException: No match found
What am I doing wrong?
What is the correct expression?
I'm sorry for such a simple question, but I searched a lot and still haven't found how to do this (maybe because it's too late here...)
You are getting this exception because you need to call find() on the matcher before accessing groups:
Matcher m = p.matcher(theString);
while (m.find()) {
String substring =m.group();
System.out.println(substring);
}
There are two things wrong here:
The pattern you're using is not ideal for your scenario: used with matches(), \d+ only succeeds if the whole string consists of digits. Also, since it doesn't contain a capturing group, a call to group() is equivalent to calling group(0), which returns the entire match.
You need to be certain that the matcher has a match before you go calling a group.
Let's start with the regex. Right now it is just \d+.
With matches(), that will only ever match a string that consists entirely of digits. What you care about is specifically the number in that string, so you want an expression that:
Doesn't care about what's in front of it
Doesn't care about what's after it
Only matches on one occurrence of numbers, and captures it in a group
For that, you'd use this expression:
.*?(\\d+).*
The last part is to ensure that the matcher can find a match, and that it gets the correct group. That's accomplished by this:
if (m.matches()) {
String substring = m.group(1);
System.out.println(substring);
}
All together now:
Pattern p = Pattern.compile(".*?(\\d+).*");
final String theString = "Incident #492 - The Title Description";
Matcher m = p.matcher(theString);
if (m.matches()) {
String substring = m.group(1);
System.out.println(substring);
}
You need to invoke one of the Matcher methods, like find, matches or lookingAt to actually run the match.
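Putting the find() variant into a small helper might look like the sketch below. NumberExtractor and firstNumber are names invented for this example, and it assumes the number fits into an int (Integer.parseInt throws NumberFormatException otherwise).

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberExtractor {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the text, if there is one.
    static OptionalInt firstNumber(String text) {
        Matcher m = DIGITS.matcher(text);
        // group() is only legal after find() has returned true.
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group()));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // prints OptionalInt[492]
        System.out.println(firstNumber("Incident #492 - The Title Description"));
        // prints OptionalInt.empty
        System.out.println(firstNumber("no digits here"));
    }
}
```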

Pattern matching with string containing dots

Pattern is:
private static Pattern r = Pattern.compile("(.*\\..*\\..*)\\..*");
String is:
sentVersion = "1.1.38.24.7";
I do:
Matcher m = r.matcher(sentVersion);
if (m.find()) {
guessedClientVersion = m.group(1);
}
I expect 1.1.38, but the pattern match fails. If I change it to
Pattern.compile("(.*\\..*\\..*)\\.*"); // note that I removed the "." before the last *
then 1.1.38.XXX fails.
My goal is to find (x.x.x) in any incoming string.
Where am I wrong?
The problem is probably due to the greediness of your regex. Try this negation-based regex pattern:
private static Pattern r = Pattern.compile("([^.]*\\.[^.]*\\.[^.]*)\\..*");
Online Demo: http://regex101.com/r/sJ5rD4
Make your .* matches reluctant with ?:
Pattern r = Pattern.compile("(.*?\\..*?\\..*?)\\..*");
Otherwise each greedy .* consumes as much of the String as it can.
See here: http://regex101.com/r/lM2lD5
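To compare the three patterns side by side, a quick sketch like the following can help. VersionPrefix and firstGroup are made-up names for this example. On the sample input, the greedy pattern does find a match with find(), but its group 1 runs one component too far.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionPrefix {
    // Returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String version = "1.1.38.24.7";
        // Greedy: each .* grabs as much as it can, so the group overshoots.
        System.out.println(firstGroup("(.*\\..*\\..*)\\..*", version));          // 1.1.38.24
        // Reluctant: each .*? stops at the first dot that lets the rest match.
        System.out.println(firstGroup("(.*?\\..*?\\..*?)\\..*", version));       // 1.1.38
        // Negated class: [^.]* can never run across a dot.
        System.out.println(firstGroup("([^.]*\\.[^.]*\\.[^.]*)\\..*", version)); // 1.1.38
    }
}
```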

How do I build a regex to match these `long` values?

How do I build a regular expression for a long data type in Java? I currently have a regex for 3 double values as my pattern:
String pattern = "(max=[0-9]+\\.?[0-9]*) *(total=[0-9]+\\.?[0-9]*) *(free=[0-9]+\\.?[0-9]*)";
I am constructing the pattern using the line:
Pattern a = Pattern.compile("control.avgo:", Pattern.CASE_INSENSITIVE);
I want to match the numbers following the equals signs in the example text below, from the file control.avgo.
max=259522560, total=39325696, free=17979640
What do I need to do to correct my code to match them?
Could it be that you actually need
Pattern a = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
instead of
Pattern a = Pattern.compile("control.avgo:", Pattern.CASE_INSENSITIVE);
because your current code uses "control.avgo:" as the regex, and not the pattern you have defined.
You need to address several errors, including:
Your pattern specifies real numbers, but your question asks for long integers.
Your pattern omits the commas in the string being searched.
The first argument to Pattern.compile() is the regular expression, not the string being searched.
This will work:
String sPattern = "max=([0-9]+), total=([0-9]+), free=([0-9]+)";
Pattern pattern = Pattern.compile( sPattern, Pattern.CASE_INSENSITIVE );
String source = "control.avgo: max=259522560, total=39325696, free=17979640";
Matcher matcher = pattern.matcher( source );
if ( matcher.find()) {
System.out.println("max=" + matcher.group(1));
System.out.println("total=" + matcher.group(2));
System.out.println("free=" + matcher.group(3));
}
If you want to convert the numbers you find to a numeric type, use Long.valueOf( String ).
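As a sketch of that conversion step, the captured groups can be parsed straight into long values. MemoryStats and parse are hypothetical names. Long.parseLong is used here because it returns a primitive long, while Long.valueOf returns a boxed Long; either works.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MemoryStats {
    private static final Pattern STATS =
            Pattern.compile("max=([0-9]+), total=([0-9]+), free=([0-9]+)");

    // Returns {max, total, free}, or null when the line has no stats.
    static long[] parse(String line) {
        Matcher m = STATS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        long[] stats = parse("control.avgo: max=259522560, total=39325696, free=17979640");
        // Used memory = total - free.
        System.out.println(stats[1] - stats[2]); // 21346056
    }
}
```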
In case you only need to find any number preceded by "="...
String test = "3.control.avgo: max=259522560, total=39325696, free=17979640";
// looks for the "=" sign preceding any numerical sequence of any length
Pattern pattern = Pattern.compile("(?<=\\=)\\d+");
Matcher matcher = pattern.matcher(test);
// keeps on searching until cannot find anymore
while (matcher.find()) {
// prints out whatever found
System.out.println(matcher.group());
}
Output:
259522560
39325696
17979640
