How can I get the second matcher in regex in Java? [duplicate] - java

This question already has answers here:
Match at every second occurrence
(6 answers)
Closed 4 years ago.
I want to extract the second matcher in a regex pattern between - and _ in this string:
VA-123456-124_VRG.tif
I tried this:
Pattern mpattern = Pattern.compile("-.*?_");
But I get 123456-124 for the above regex in Java.
I need only 124.
How can I achieve this?

If you know that is always your format, this pattern will return the requested digits.
It captures everything before the underscore that is not a dash (read it from group 1):
Pattern pattern = Pattern.compile("([^-]+)_");
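As a minimal sketch of how such a pattern could be applied (the class name and the helper beforeUnderscore are illustrative, not part of the original answer, and the dash is written unescaped inside the character class), the captured text is read from group 1 after a successful find():

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BeforeUnderscoreDemo {
    // One or more non-dash characters immediately followed by an underscore.
    private static final Pattern PATTERN = Pattern.compile("([^-]+)_");

    // Hypothetical helper: returns the captured text, or null if nothing matches.
    static String beforeUnderscore(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(beforeUnderscore("VA-123456-124_VRG.tif")); // prints 124
    }
}
```

Inside a character class the dash needs no escaping when it is the first character after the ^, which is why the pattern compiles without a backslash.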

I would use a formal pattern matcher here, to be as specific as possible. I would use this pattern:
^[^-]+-[^-]+-([^_]+).*
and then check the first capture group for the possible match. Here is a working code snippet:
String input = "VA-123456-124_VRG.tif";
String pattern = "^[^-]+-[^-]+-([^_]+).*";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
if (m.find()) {
System.out.println("Found value: " + m.group(1) );
}
124
By the way, there is a one liner which would also work here:
System.out.println(input.split("[_-]")[2]);
But, the caveat here is that it is not very specific, and might fail for your other data.
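To illustrate that caveat, here is a small sketch comparing the two approaches. The second file name, with an extra underscore in its prefix, is a made-up example and not taken from the question; the class and method names are illustrative as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitCaveatDemo {
    // The anchored pattern from the answer above.
    private static final Pattern ANCHORED = Pattern.compile("^[^-]+-[^-]+-([^_]+).*");

    // One-liner approach: third token after splitting on '_' or '-'.
    static String viaSplit(String input) {
        return input.split("[_-]")[2];
    }

    // Pattern approach: first capture group, or null if the format does not match.
    static String viaPattern(String input) {
        Matcher m = ANCHORED.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String expected = "VA-123456-124_VRG.tif";
        String odd = "VA_X-123456-124_VRG.tif"; // hypothetical name with an extra underscore

        System.out.println(viaSplit(expected));   // prints 124
        System.out.println(viaPattern(expected)); // prints 124
        System.out.println(viaSplit(odd));        // prints 123456
        System.out.println(viaPattern(odd));      // prints 124
    }
}
```

The split version depends only on token position, so any additional delimiter earlier in the string shifts the result.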

You know you want only digits so be more specific Pattern.compile("-([0-9]+)_");

Try using below regex:
.*-(.*?)_
What this will do: .* matches all characters up to a -. Because it is greedy, it picks the last possible -, which is the one just before 124. The lazy group (.*?) then captures everything up to the next _.
Demo: https://regex101.com/r/NWgZoH/1
JShell Output:
jshell> Pattern pattern = Pattern.compile(".*-(.*?)_");
pattern ==> .*-(.*?)_
jshell> Matcher matcher = pattern.matcher("VA-123456-124_VRG.tif");
matcher ==> java.util.regex.Matcher[pattern=.*-(.*?)_ region=0,21 lastmatch=]
jshell> if(matcher.find()){
...> System.out.println(matcher.group(1));
...> }
124

You have given very few test cases, but for the one you posted I think the regex below can be helpful:
-.*-(.*)_
Then extract the first group.

If you just want to extract it in a simple way, go ahead with this:
public static void main(String[] args) {
String s = "VA-123456-124_VRG.tif";
System.out.println(s.split("[_-]")[2]);
}

Related

Find ALL matches of a regex pattern in Java - even overlapping ones [duplicate]

This question already has answers here:
Matcher not finding overlapping words?
(4 answers)
Closed 4 years ago.
I have a String of the form:
1,2,3,4,5,6,7,8,...
I am trying to find all substrings in this string that contain exactly 4 digits. For this I have the regex [0-9],[0-9],[0-9],[0-9]. Unfortunately when I try to match the regex against my String, I never obtain all the substrings, only a part of all the possible substrings. For instance, in the example above I would only get:
1,2,3,4
5,6,7,8
although I expect to get:
1,2,3,4
2,3,4,5
3,4,5,6
...
How would I go about finding all matches corresponding to my regex?
for info, I am using Pattern and Matcher to find the matches:
Pattern pattern = Pattern.compile("[0-9],[0-9],[0-9],[0-9]");
Matcher matcher = pattern.matcher(myString);
List<String> matches = new ArrayList<String>();
while (matcher.find())
{
matches.add(matcher.group());
}
By default, successive calls to Matcher.find() start at the end of the previous match.
To search from a specific location instead, pass find a start position that is one character past the start of the previous match.
In your case probably something like:
while (matcher.find(matcher.start()+1))
This works fine:
Pattern p = Pattern.compile("[0-9],[0-9],[0-9],[0-9]");
public void test(String[] args) throws Exception {
String test = "0,1,2,3,4,5,6,7,8,9";
Matcher m = p.matcher(test);
if(m.find()) {
do {
System.out.println(m.group());
} while(m.find(m.start()+1));
}
}
printing
0,1,2,3
1,2,3,4
...
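The same idea can be wrapped in a reusable helper. This is only a sketch: the class name and the method overlappingMatches are hypothetical. It relies on the documented behaviour that Matcher.find(int) resets the matcher and searches from the given index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlappingFindDemo {
    // Collects every match, restarting the search one character after the
    // start of the previous match so that overlapping matches are found too.
    static List<String> overlappingMatches(Pattern pattern, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        int from = 0;
        // find(int) throws if the index is past the end of the input, hence the guard.
        while (from <= input.length() && m.find(from)) {
            result.add(m.group());
            from = m.start() + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("[0-9],[0-9],[0-9],[0-9]");
        System.out.println(overlappingMatches(p, "1,2,3,4,5,6"));
        // prints [1,2,3,4, 2,3,4,5, 3,4,5,6]
    }
}
```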
If you are looking for a pure regex based solution then you may use this lookahead based regex for overlapping matches:
(?=((?:[0-9],){3}[0-9]))
Note that your matches are available in captured group #1
Code:
final String regex = "(?=((?:[0-9],){3}[0-9]))";
final String string = "0,1,2,3,4,5,6,7,8,9";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
output:
0,1,2,3
1,2,3,4
2,3,4,5
3,4,5,6
4,5,6,7
5,6,7,8
6,7,8,9
Here is some sample code without regex (which does not seem necessary to me here; I would also assume regex to be slower in this case). Note that, as written, it only works as long as every element is exactly one character long.
String s = "a,b,c,d,e,f,g,h";
// Each window of four elements is 7 characters long; advance by one element (2 characters).
for (int i = 0; i + 7 <= s.length(); i += 2) {
System.out.println(s.substring(i, i + 7));
}
Output for this string:
a,b,c,d
b,c,d,e
c,d,e,f
d,e,f,g
e,f,g,h
As @OldCurmudgeon pointed out, find() by default starts looking from the end of the previous match. To position it right after the first matched element, make the first matched element a capturing group and use its end index:
Pattern pattern = Pattern.compile("(\\d,)\\d,\\d,\\d");
Matcher matcher = pattern.matcher("1,2,3,4,5,6,7,8,9");
List<String> matches = new ArrayList<>();
int start = 0;
while (matcher.find(start)) {
start = matcher.end(1);
matches.add(matcher.group());
}
System.out.println(matches);
results in
[1,2,3,4, 2,3,4,5, 3,4,5,6, 4,5,6,7, 5,6,7,8, 6,7,8,9]
This approach also works if each matched element is longer than one digit, provided every \d in the pattern is replaced with \d+.
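A sketch of that multi-digit variant follows. The pattern with \d+ and the sample input are assumptions made for illustration; the original answer only shows the single-digit pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiDigitWindowDemo {
    // Group 1 is the first number plus its comma; the next search starts right after it.
    private static final Pattern WINDOW = Pattern.compile("(\\d+,)\\d+,\\d+,\\d+");

    static List<String> windows(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = WINDOW.matcher(input);
        int start = 0;
        while (matcher.find(start)) {
            start = matcher.end(1); // index of the second number in this window
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(windows("10,20,30,40,50"));
        // prints [10,20,30,40, 20,30,40,50]
    }
}
```

Because the restart index is always just after a comma, a window never begins in the middle of a number.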

Java split by bracket and keep the delimiter - RegEx [duplicate]

This question already has answers here:
How do I split a string in Java?
(39 answers)
Closed 6 years ago.
I am trying to split a string using a regex with the closing bracket as the delimiter, and I have to keep the bracket.
Input string: (GROUP=test1)(GROUP=test2)(GROUP=test3)(GROUP=test4)
Needed output:
(GROUP=test1)
(GROUP=test2)
(GROUP=test3)
(GROUP=test4)
I am using the Java regex "\([^)]*?\)" and it throws an error. Below is the code I am using; the error is thrown when I try to get the group.
Pattern splitDelRegex = Pattern.compile("\\([^)]*?\\)");
Matcher regexMatcher = splitDelRegex.matcher("(GROUP=test1)(GROUP=test2) (GROUP=test3)(GROUP=test4)");
List<String> matcherList = new ArrayList<String>();
while(regexMatcher.find()){
String perm = regexMatcher.group(1);
matcherList.add(perm);
}
Any help is appreciated. Thanks.
You simply forgot to put capturing parentheses around the entire regex. You are not capturing anything at all, so there is no group 1 to read. Just change the regex to the following (note the added outer parentheses):
Pattern splitDelRegex = Pattern.compile("(\\([^)]*?\\))");
I tested this in Eclipse and got your desired output.
You could use
str.split("\\)")
(split takes a regex, so the parenthesis has to be escaped). That returns an array of strings which you know are lacking the closing parentheses, so you can add them back in afterwards. That seems much easier and less error-prone to me.
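A minimal sketch of this split-based idea, assuming the input always ends with a closing bracket. The class and method names are hypothetical. The second method is a different technique, not the one the answer describes: it splits on a zero-width lookbehind so the bracket is never removed in the first place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitKeepBracketDemo {
    // Splits on ")" and appends the bracket to every piece again.
    static List<String> splitAndRestore(String input) {
        List<String> parts = new ArrayList<>();
        for (String piece : input.split("\\)")) {
            parts.add(piece + ")");
        }
        return parts;
    }

    // Alternative: split at every position that is preceded by ")".
    // The lookbehind consumes no characters, so the delimiter is kept.
    static List<String> splitKeeping(String input) {
        return Arrays.asList(input.split("(?<=\\))"));
    }

    public static void main(String[] args) {
        String input = "(GROUP=test1)(GROUP=test2)(GROUP=test3)(GROUP=test4)";
        System.out.println(splitAndRestore(input));
        // prints [(GROUP=test1), (GROUP=test2), (GROUP=test3), (GROUP=test4)]
        System.out.println(splitKeeping(input));
        // prints [(GROUP=test1), (GROUP=test2), (GROUP=test3), (GROUP=test4)]
    }
}
```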
You could try changing this line :
String perm = regexMatcher.group(1);
To this :
String perm = regexMatcher.group();
That way you read the whole match (group 0) instead of a capturing group that does not exist in your pattern.
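The difference between group() and group(1) can be checked with groupCount(). The following is a sketch with illustrative class and method names; it uses the pattern from the question with and without outer parentheses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupZeroDemo {
    // True if the regex defines at least one capturing group.
    static boolean hasGroupOne(String regex) {
        return Pattern.compile(regex).matcher("").groupCount() >= 1;
    }

    // Returns group(1) when it exists, otherwise the whole match; null if nothing matches.
    static String firstBracket(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return m.groupCount() >= 1 ? m.group(1) : m.group();
    }

    public static void main(String[] args) {
        String input = "(GROUP=test1)(GROUP=test2)";
        String noGroup = "\\([^)]*?\\)";     // escaped parentheses only, no capturing group
        String withGroup = "(\\([^)]*?\\))"; // outer parentheses form group 1

        System.out.println(hasGroupOne(noGroup));            // prints false
        System.out.println(hasGroupOne(withGroup));          // prints true
        System.out.println(firstBracket(noGroup, input));    // prints (GROUP=test1)
        System.out.println(firstBracket(withGroup, input));  // prints (GROUP=test1)
    }
}
```

Calling group(1) on a matcher whose pattern has no capturing group throws an IndexOutOfBoundsException, which is the error described in the question.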
I'm not sure why you need to split the string at all. You can capture each of the bracketed groups with a regex.
Try this regex (\\([a-zA-Z0-9=]*\\)). I have a capturing group () that looks for text that starts with a literal \\(, contains [a-zA-Z0-9=] zero or many times * and ends with a literal \\). This is a pretty loose regex, you could tighten up the match if the text inside the brackets will be predictable.
String input = "(GROUP=test1)(GROUP=test2)(GROUP=test3)(GROUP=test4)";
String regex = "(\\([a-zA-Z0-9=]*\\))";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while(matcher.find()) { // find the next match
System.out.println(matcher.group()); // print the match
}
Output:
(GROUP=test1)
(GROUP=test2)
(GROUP=test3)
(GROUP=test4)

Regular expression: text between two signs

I have a text in which I want to replace variables with proper values; my variables are located between two # signs. When I use [/(?m)#.*?#/] to get these texts, it also returns the text before the first # and after the last #. How can I get only the text between two # signs? Thanks in advance.
I use the String.split("") method in Java.
For example, I want to use it on the following string:
this is #the best# possible way #t#o do result!!!
and I want to get these two results:
the best
t
In Java you can use this regex to grab the value between the first and second #:
String repl = input.replaceFirst("(?m)^[^#]*#([^#]*)#.*$", "$1");
To grab the value between the first and last #:
String repl = input.replaceFirst("(?m)^[^#]*#(.*?)#[^#]*$", "$1");
To find multiple matches use Pattern, Matcher:
Pattern p = Pattern.compile("#([^#]*)#");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println(m.group(1));
}
split() is the wrong tool to use here; use Pattern and Matcher instead.
String s = "this is #the best# possible way #t#o do result!!!";
Pattern p = Pattern.compile("#([^#]*)#");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
Output
the best
t
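Since the stated goal is to replace the variables with values, the same pattern can also drive the replacement. The following is a sketch only: the map of values, the template text, and the names TemplateDemo and fill are hypothetical. It uses Matcher.appendReplacement and Matcher.appendTail, and leaves unknown variables untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateDemo {
    private static final Pattern VARIABLE = Pattern.compile("#([^#]*)#");

    // Replaces every #name# whose name is a key in values; other occurrences stay as they are.
    static String fill(String template, Map<String, String> values) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "Ann"); // made-up sample data
        values.put("age", "30");
        System.out.println(fill("Hello #name#, you are #age#!", values));
        // prints Hello Ann, you are 30!
    }
}
```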

Java regex: extract text after delimiter?

I am new to regular expressions in Java. I would like to extract a string by using regular expressions.
This is my String: "Hello,World"
I would like to extract the text after ",". The result would be "World". I tried this:
final Pattern pattern = Pattern.compile(",(.+?)");
final Matcher matcher = pattern.matcher("Hello,World");
matcher.find();
But what would be the next step?
You don't need Regex for this. You can simply split on comma and get the 2nd element from the array: -
System.out.println("Hello,World".split(",")[1]);
OUTPUT: -
World
But if you want to use regex, you need to remove the ? from your pattern.
A ? after + makes the quantifier reluctant: it matches as little as possible, so it will match only W and stop there.
You don't need that here. You want the group to match as much as it can.
So use greedy matching instead.
Here's the code with modified Regex: -
final Pattern pattern = Pattern.compile(",(.+)");
final Matcher matcher = pattern.matcher("Hello,World");
if (matcher.find()) {
System.out.println(matcher.group(1));
}
OUTPUT: -
World
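The difference between the two quantifiers can be seen side by side. This is a small sketch with illustrative class and method names that runs both patterns on the same input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantVsGreedyDemo {
    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Reluctant: .+? takes the minimum, a single character.
        System.out.println(firstGroup(",(.+?)", "Hello,World")); // prints W
        // Greedy: .+ takes as much as possible, up to the end of the line.
        System.out.println(firstGroup(",(.+)", "Hello,World"));  // prints World
    }
}
```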
Extending what you have, you need to remove the ? sign from your pattern to use the greedy matching and then process the matched group:
final Pattern pattern = Pattern.compile(",(.+)"); // removed your '?'
final Matcher matcher = pattern.matcher("Hello,World");
while (matcher.find()) {
String result = matcher.group(1);
// work with result
}
Other answers suggest different approaches to your problem and might offer better solution for what you need.
System.out.println( "Hello,World".replaceAll(".*,(.*)","$1") ); // output is "World"
You are using a reluctant expression and will only select a single character W, whereas you can use a greedy one and print your matched group content:
final Pattern pattern = Pattern.compile(",(.+)");
final Matcher matcher = pattern.matcher("Hello,World");
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Output:
World
See Regex Pattern doc

Regular Expression in Java: How to refer to "matched patterns"?

I was reading the Java Regular Expression tutorial, and it seems only to teach to test whether a pattern matched or not, but does not tell me how to refer to a matched pattern.
For example, I have a string "My name is xxxxx". And I want to print xxxx. How would I do that with Java regular expressions?
Thanks.
What tutorial were you reading ? The sun's one tackles that topic quite thoroughly, but you have to read it correctly :)
Capturing a part of a string is done through the parentheses. If you want to capture a group in a string, you have to put this part of the regular expression in parentheses. The groups are numbered in the order their opening parentheses appear, and the group with index 0 represents the whole match.
For instance, the regexp "Day ([0-9]+) - Note ([0-9]+)" would define 3 groups :
group(0) : The whole match
group(1) : The first group in the regexp, that is to say the day number
group(2) : The second group in the regexp, that is to say the note number
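The three groups listed above can be sketched in code as follows. The sample sentence and the class and method names are made up for illustration. Note that group(0) is the matched part of the input, not the entire input string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DayNoteDemo {
    private static final Pattern DAY_NOTE = Pattern.compile("Day ([0-9]+) - Note ([0-9]+)");

    // Returns {group(0), group(1), group(2)} for the first match, or null if there is none.
    static String[] groups(String input) {
        Matcher m = DAY_NOTE.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("On Day 3 - Note 17 was taken");
        System.out.println(g[0]); // prints Day 3 - Note 17
        System.out.println(g[1]); // prints 3
        System.out.println(g[2]); // prints 17
    }
}
```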
As for the actual code and how to retrieve the groups you've defined in your regexp, have a look at the Java documentation, especially the Matcher class and its group method : http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Matcher.html
You can test your regexps with that very useful tool : http://www.cis.upenn.edu/~matuszek/General/RegexTester/regex-tester.html
Hope this helped,
Cheers
Note the use of parentheses in the pattern and the group() method on Matcher
import java.util.regex.*;
public class Example {
static public void main(String[] args) {
Pattern regex = Pattern.compile("My name is (.*)");
String s = "My name is Michael";
Matcher matcher = regex.matcher(s);
if (matcher.matches()) {
System.out.println("original string: " + matcher.group(0));
System.out.println("first group: " + matcher.group(1));
}
}
}
Output is:
original string: My name is Michael
first group: Michael
You can use the Matcher group(int) method:
Pattern p = Pattern.compile("My name is (.*)");
Matcher m = p.matcher("My name is akf");
m.find();
String s = m.group(1); //grab the first group*
System.out.println(s);
output:
akf
* look at matching groups
Matcher m = Pattern.compile("name is (.*)").matcher("My name is Ross");
if (m.find()) {
System.out.println(m.group(0));
System.out.println(m.group(1));
}
The parens form a capturing group. Group 0 is the entire match and group 1 is the text captured by the parentheses.
The above program outputs:
name is Ross
Ross
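The answers above use both matches() and find(), and the two are not interchangeable: matches() requires the pattern to cover the entire input, while find() looks for the pattern anywhere in it. A short sketch with illustrative names shows the difference for the pattern used here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    // True only if the regex matches the input from its first to its last character.
    static boolean wholeInputMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Group 1 of the first occurrence anywhere in the input, or null if there is none.
    static String findGroupOne(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "My name is Ross";
        System.out.println(wholeInputMatches("name is (.*)", input));    // prints false
        System.out.println(wholeInputMatches("My name is (.*)", input)); // prints true
        System.out.println(findGroupOne("name is (.*)", input));         // prints Ross
    }
}
```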
