Regex returns the whole string up to the group instead of just the captured part - java

This is a follow-up to the question that I asked here.
The given regex, (?:[^\/]*\/){4}([A-Za-z]{3}[0-9]{3}), is correct. However, when I use it in Java, the matcher returns the string up to and including the matching group rather than just the group itself.
String defaultRegex = "(?:[^\\/]*\\/){4}([A-Za-z]{3}[0-9]{3})";
String stringToMatch = "unknown/relevant/nonrelevant:2.2.2/random/ABC123:random/morerandom";
Pattern p = Pattern.compile(defaultRegex);
Matcher m = p.matcher (stringToMatch);
if (m.find()) {
    System.out.println(m.group());
}
The code above prints unknown/relevant/nonrelevant:2.2.2/random/ABC123 when I want the regex to give me just ABC123.

matcher.group() and matcher.group(0) both return the whole matched string.
To get the first capturing group, use matcher.group(1).
The second capturing group is matcher.group(2), and so on.
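As a minimal sketch of the difference, using the regex and input from the question (the class name GroupDemo and the helper firstGroup are made up for illustration; the slash is left unescaped here because / has no special meaning in a Java regex):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Hypothetical helper: returns capturing group 1, or null if there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String regex = "(?:[^/]*/){4}([A-Za-z]{3}[0-9]{3})";
        String input = "unknown/relevant/nonrelevant:2.2.2/random/ABC123:random/morerandom";

        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            // group() is the same as group(0): everything the pattern consumed
            System.out.println(m.group());   // unknown/relevant/nonrelevant:2.2.2/random/ABC123
            // group(1) is only the text inside the first pair of capturing parentheses
            System.out.println(m.group(1));  // ABC123
        }
    }
}
```

The non-capturing group (?:...) does not get a number, so the only numbered group in this pattern is group 1.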

Related

how to exclude "<" in regex match

I have a String which looks like "<name><address> and <Phone_1>". I need to get the results as:
1) <name>
2) <address>
3) <Phone_1>
I have tried using regex "<(.*)>" but it returns just one result.
The regex you want is
<([^<>]+?)><([^<>]+?)> and <([^<>]+?)>
Which will then spit out the stuff you want in the 3 capture groups. The full code would then look something like this:
Matcher m = Pattern.compile("<([^<>]+?)><([^<>]+?)> and <([^<>]+?)>").matcher(string);
if (m.find()) {
    String name = m.group(1);
    String address = m.group(2);
    String phone = m.group(3);
}
The pattern .* in a regex is greedy. It will match as many characters as possible between the first < it finds and the last possible > it can find. In the case of your string it finds the first <, then looks for as much text as possible until a >, which it will find at the very end of the string.
You want a non-greedy or "lazy" pattern, which will match as few characters as possible: simply <(.+?)>. The question mark after the quantifier is the syntax for non-greedy. See also this question.
This will work if you have a dynamic number of groups.
Pattern p = Pattern.compile("(<\\w+>)");
Matcher m = p.matcher("<name><address> and <Phone_1>");
while (m.find()) {
    System.out.println(m.group());
}
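To see the greedy versus lazy behavior described above side by side, here is a small sketch. The class name GreedyVsLazy and the helper findAll are invented for this example; it simply collects group 1 of every match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: collects capturing group 1 of every match found in the input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "<name><address> and <Phone_1>";
        // Greedy: .* runs to the end and backtracks only to the last '>'
        System.out.println(findAll("<(.*)>", input));   // [name><address> and <Phone_1]
        // Lazy: .+? stops at the first '>' it can reach
        System.out.println(findAll("<(.+?)>", input));  // [name, address, Phone_1]
    }
}
```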

Java: Need to extract a number from a string

I have a string containing a number. Something like "Incident #492 - The Title Description".
I need to extract the number from this string.
Tried
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(theString);
String substring = m.group();
But I am getting an error:
java.lang.IllegalStateException: No match found
What am I doing wrong?
What is the correct expression?
I'm sorry for such a simple question, but I searched a lot and still have not found how to do this (maybe because it's too late here...)
You are getting this exception because you need to call find() on the matcher before accessing groups:
Matcher m = p.matcher(theString);
while (m.find()) {
    String substring = m.group();
    System.out.println(substring);
}
There are two things wrong here:
The pattern you're using is not ideal for your scenario: used with matches(), \\d+ only succeeds when the entire string consists of digits. Also, since it doesn't contain a capturing group, a call to group() is equivalent to calling group(0), which returns the entire match.
You need to be certain that the matcher has a match before you go calling a group.
Let's start with the regex. Right now it is \\d+.
Used with matches(), that will only ever match a string that consists entirely of digits. What you care about is specifically the number in that string, so you want an expression that:
Doesn't care about what's in front of it
Doesn't care about what's after it
Only matches on one occurrence of numbers, and captures it in a group
To do that, you'd use this expression:
.*?(\\d+).*
The last part is to ensure that the matcher can find a match, and that it gets the correct group. That's accomplished by this:
if (m.matches()) {
    String substring = m.group(1);
    System.out.println(substring);
}
All together now:
Pattern p = Pattern.compile(".*?(\\d+).*");
final String theString = "Incident #492 - The Title Description";
Matcher m = p.matcher(theString);
if (m.matches()) {
    String substring = m.group(1);
    System.out.println(substring);
}
You need to invoke one of the Matcher methods, like find, matches or lookingAt to actually run the match.
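A short sketch of how the three methods differ on the string from the question may help. The class name MatchMethods and the helper firstNumber are hypothetical; the pattern is the plain \\d+ from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchMethods {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns the first run of digits, or null if there is none.
    static String firstNumber(String s) {
        Matcher m = DIGITS.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "Incident #492 - The Title Description";

        // matches(): the whole input must match the pattern
        System.out.println(DIGITS.matcher(s).matches());    // false
        // lookingAt(): the pattern must match at the start of the input
        System.out.println(DIGITS.matcher(s).lookingAt());  // false
        // find(): scans forward for the next substring that matches
        System.out.println(firstNumber(s));                 // 492
    }
}
```

Calling group() is only legal after one of these methods has returned true; otherwise it throws the IllegalStateException shown in the question.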

Regular expression for text between two signs

I have a text and I want to replace the variables in it with proper values. My variables are located between two # signs. When I use [/(?m)#.*?#/] to get these texts, it also returns the text before the first # and after the last #. How can I get only the text between the two # signs? Thanks in advance.
I use the String.split("") method in Java.
For example, I want to use it on the following String:
this is #the best# possible way #t#o do result!!!
and I want to get these two results:
the best
t
In Java you can use this regex to grab value between first and second #:
String repl = input.replaceFirst("(?m)^[^#]*#([^#]*)#.*$", "$1");
To grab value between first and last #:
String repl = input.replaceFirst("(?m)^[^#]*#(.*?)#[^#]*$", "$1");
To find multiple matches use Pattern, Matcher:
Pattern p = Pattern.compile("#([^#]*)#");
Matcher m = p.matcher(input);
while (m.find()) {
    System.out.println(m.group(1));
}
split() is the wrong tool to use here; use a Matcher to do this instead.
String s = "this is #the best# possible way #t#o do result!!!";
Pattern p = Pattern.compile("#([^#]*)#");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1));
}
Output
the best
t
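Since the stated goal was to replace the variables with values, here is a rough sketch of one way to do that with Matcher.appendReplacement and appendTail. The class name HashTemplate, the method fill, and the sample values are all hypothetical. Leaving unknown variables untouched is an assumption about the desired behavior, not something the question specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashTemplate {
    private static final Pattern VARIABLE = Pattern.compile("#([^#]*)#");

    // Hypothetical helper: replaces each #name# with values.get(name).
    // Assumption: names missing from the map are left as they are.
    static String fill(String template, Map<String, String> values) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value from being treated specially
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("the best", "THE BEST");
        values.put("t", "T");
        String input = "this is #the best# possible way #t#o do result!!!";
        System.out.println(fill(input, values));
        // this is THE BEST possible way To do result!!!
    }
}
```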

Find pattern in string with regex -> how to improve my solution

I would like to parse a string and get the "stringIAmLookingFor" part of it, which is surrounded by "_" at the beginning and the end. I'm using a regex to match that and then remove the "_" in the found string. This is working, but I'm wondering if there is a more elegant approach to this problem?
String test = "xyz_stringIAmLookingFor_zxy";
Pattern p = Pattern.compile("_(\\w)*_");
Matcher m = p.matcher(test);
while (m.find()) { // find next match
    String match = m.group();
    match = match.replaceAll("_", "");
    System.out.println(match);
}
Solution (partial)
Please also check the next section. Don't just read the solution here.
Just modify your code a bit:
String test = "xyz_stringIAmLookingFor_zxy";
// Make the capturing group capture the text in between (\w*)
// A capturing group is enclosed in (pattern), denoting the part of the
// pattern whose text you want to get separately from the main match.
// Note that there is also non-capturing group (?:pattern), whose text
// you don't need to capture.
Pattern p = Pattern.compile("_(\\w*)_");
Matcher m = p.matcher(test);
while (m.find()) { // find next match
    // The text is in the capturing group numbered 1
    // The numbering is by counting the number of opening
    // parentheses that makes up a capturing group, until
    // the group that you are interested in.
    String match = m.group(1);
    System.out.println(match);
}
Matcher.group(), without any argument, returns the text matched by the whole regex pattern. Matcher.group(int group) returns the text matched by the capturing group with the specified group number.
If you are using Java 7 or later, you can make use of named capturing groups, which make the code slightly more readable. The string matched by the capturing group can be accessed with Matcher.group(String name).
String test = "xyz_stringIAmLookingFor_zxy";
// (?<name>pattern) is similar to (pattern), just that you attach
// a name to it
// specialText is not a really good name, please use a more meaningful
// name in your actual code
Pattern p = Pattern.compile("_(?<specialText>\\w*)_");
Matcher m = p.matcher(test);
while (m.find()) { // find next match
    // Access the text captured by the named capturing group
    // using Matcher.group(String name)
    String match = m.group("specialText");
    System.out.println(match);
}
Problem in pattern
Note that \w also matches _. The pattern you have is ambiguous, and I don't know what your expected output is for the cases where there are more than 2 _ in the string. And do you want to allow underscore _ to be part of the output?
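To make the ambiguity concrete, here is a small sketch comparing \w* with [^_]* on a string that has more than two underscores. The class name UnderscoreAmbiguity, the helper findAll, and the test string "a_b_c_d" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnderscoreAmbiguity {
    // Hypothetical helper: collects capturing group 1 of every match.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "a_b_c_d";
        // \w includes '_', so the greedy group swallows the middle underscore
        System.out.println(findAll("_(\\w*)_", input));    // [b_c]
        // [^_] excludes '_', so the group stops at the next underscore
        System.out.println(findAll("_([^_]*)_", input));   // [b]
    }
}
```

Note that in the second case "c" is not reported, because the underscore before it was already consumed as the closing delimiter of the first match.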
You can define the group you actually want, since you're already using parentheses. You just need to tweak your pattern a bit.
String test = "xyz_stringIAmLookingFor_zxy";
Pattern p = Pattern.compile("_(\\w*)_");
Matcher m = p.matcher(test);
while (m.find()) { // find next match
    System.out.println(m.group(1));
}
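Use group(1) instead of group(), because group() will get you the entire match and not the capturing group.
Reference : http://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#group(int)
"xyz_stringIAmLookingFor_zxy".replaceAll("_(\\w*)_", "$1");
will replace the matched _..._ section with the group in parentheses, giving xyzstringIAmLookingForzxy. Note that the quantifier has to be inside the group, (\\w*); with (\\w)* the group only holds the last character it matched.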
a simpler regex, no group needed:
"(?<=_)[^_]*"
if you want it more strict:
"(?<=_)[^_]+(?=_)"
try
String s = "xyz_stringIAmLookingFor_zxy".replaceAll(".*_(\\w*)_.*", "$1");
System.out.println(s);
output
stringIAmLookingFor
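For the lookaround patterns suggested above, a quick sketch of what each one returns on the sample string. The class name LookaroundDemo and the helper findAll are hypothetical; here the whole match is collected, since these patterns have no capturing group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundDemo {
    // Hypothetical helper: collects the whole match (group 0) of every hit.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "xyz_stringIAmLookingFor_zxy";
        // Only requires an underscore before the text, so the tail "zxy" also matches
        System.out.println(findAll("(?<=_)[^_]*", input));        // [stringIAmLookingFor, zxy]
        // Requires an underscore on both sides
        System.out.println(findAll("(?<=_)[^_]+(?=_)", input));   // [stringIAmLookingFor]
    }
}
```

With the looser pattern, taking only the first find() result gives the wanted text; the stricter pattern avoids the extra match altogether.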

regex needed which matches for two sample string

I have two input strings :
this-is-a-sample-string-%7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ==?Id=113690_2&Index=0&Referrer=IC
this-is-a-sample-string-%7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ==
What I want is only the %7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ== from both of the sample strings.
I tried by using the regex [a-zA-Z-]+-(.*) which works fine for the second input string.
String inputString = "this-is-a-sample-string-%7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ==";
String regexString = "[a-zA-Z-]+-(.*)";
Pattern pattern = Pattern.compile(regexString);
Matcher matcher = pattern.matcher(inputString);
if (matcher.matches()) {
    System.out.println("--->" + matcher.group(1) + "<---");
} else {
    System.out.println("nope");
}
The following patterns match the desired group with the limited information and examples provided:
-([^-?]*)(?:\?|$)
.*-(.*?)(?:\?|$)
The first will match a hyphen then group all the characters up to either the ? or the end of the string.
The second matches as many characters and hyphens as possible followed by the smallest string to either the next question mark or the end of the string.
There are dozens of ways of writing something that will match this text though so I'm kinda just guessing if this is what you wanted. If this is not what you're after please elaborate on what exactly you're trying to accomplish.
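In Java, the two patterns above could be tried roughly like this. This is a sketch under a couple of assumptions: the class name TokenExtract and the helper extract are invented, the backslash before ? is doubled because the pattern sits in a Java string literal, and find() is used instead of matches() because the first pattern does not cover the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenExtract {
    // The two suggested patterns, written as Java string literals
    static final Pattern FIRST = Pattern.compile("-([^-?]*)(?:\\?|$)");
    static final Pattern SECOND = Pattern.compile(".*-(.*?)(?:\\?|$)");

    // Hypothetical helper: returns capturing group 1 of the first match, or null.
    static String extract(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String withQuery = "this-is-a-sample-string-%7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ==?Id=113690_2&Index=0&Referrer=IC";
        String withoutQuery = "this-is-a-sample-string-%7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ==";

        // All four calls print %7b3DES%7dFPvKTjGHUA3lD9Us70rfjQ==
        System.out.println(extract(FIRST, withQuery));
        System.out.println(extract(FIRST, withoutQuery));
        System.out.println(extract(SECOND, withQuery));
        System.out.println(extract(SECOND, withoutQuery));
    }
}
```

Both patterns rely on the wanted part containing no hyphen, and the second also assumes there is no hyphen after the first question mark, which holds for the two samples but is not guaranteed by the question.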
