Storing backreferences from regex expression for later use - java

If I have,
String str = "11";
Pattern p = Pattern.compile("(\\d)\\1");
Matcher m = p.matcher(str);
How do I store the result of \1 for later use? For example, I want to do:
String str = "123123";
Pattern p = Pattern.compile("(\\d)\\1");
Matcher m = p.matcher(str);
String dependantString = //make this whatever was in group 1 of the pattern.
Is that possible?

You need to first call Matcher#find and then Matcher#group(1) like this:
String str = "123123";
Pattern p = Pattern.compile("(\\d+)\\1");
Matcher m = p.matcher(str);
if (m.find())
System.out.println( m.group(1) ); // 123
PS: Your regex also needed a correction: use \\d+ instead of \\d, so that group 1 can capture more than one digit.
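To keep the captured text for later, one option is to copy m.group(1) into an ordinary variable right after a successful find(). The sketch below wraps that in a small helper; the class name BackreferenceDemo and the method repeatedPrefix are made up for illustration and are not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BackreferenceDemo {
    // Hypothetical helper: returns the text captured by group 1,
    // or null when the input contains no immediately repeated digit run.
    static String repeatedPrefix(String input) {
        Pattern p = Pattern.compile("(\\d+)\\1");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The captured group is now a plain String that outlives the Matcher.
        String dependantString = repeatedPrefix("123123");
        System.out.println(dependantString); // 123
    }
}
```

Returning null for "no match" is only one possible convention; throwing an exception or returning an empty string would work as well.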

Related

Get Substring from a String in Java

I have the following text:
...,Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY:...,
Now I want to extract the date after NOT IN CHARGE SINCE: until the comma.
So I need only 03.2009 as the result in my substring.
So how can I handle that?
String substr = "not in charge since:";
String before = s.substring(0, s.indexOf(substr));
String after = s.substring(s.indexOf(substr),s.lastIndexOf(","));
EDIT
for (String s : split) {
s = s.toLowerCase();
if (s.contains("ex peps")) {
String substr = "not in charge since:";
String before = s.substring(0, s.indexOf(substr));
String after = s.substring(s.indexOf(substr), s.lastIndexOf(","));
System.out.println(before);
System.out.println(after);
System.out.println("PEP!!!");
} else {
System.out.println("Line ok");
}
}
But that is not the result I want.
You can use a Pattern, for example:
String str = "Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY";
Pattern p = Pattern.compile("\\d{2}\\.\\d{4}");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println(m.group());
}
Output
03.2009
Note: if you want to get all similar dates in your String, use while instead of if.
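A minimal sketch of the while variant, collecting every match into a list. The class name AllDates, the method findDates, and the sample text with two dates are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AllDates {
    // Hypothetical helper: collects every MM.yyyy-shaped token in the text.
    static List<String> findDates(String text) {
        List<String> dates = new ArrayList<String>();
        Matcher m = Pattern.compile("\\d{2}\\.\\d{4}").matcher(text);
        while (m.find()) {          // each call advances to the next match
            dates.add(m.group());
        }
        return dates;
    }

    public static void main(String[] args) {
        // Invented sample line containing two dates.
        String str = "IN CHARGE SINCE: 01.2005, NOT IN CHARGE SINCE: 03.2009, CATEGORY";
        System.out.println(findDates(str)); // [01.2005, 03.2009]
    }
}
```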
Edit
Or you can use:
String str = "Niedersachsen,NOT IN CHARGE SINCE: 03.03.2009, CATEGORY";
Pattern p = Pattern.compile("SINCE:(.*?)\\,");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println(m.group(1).trim());
}
You can use the ':' character to split the String s.
String substr = "NOT IN CHARGE SINCE:";
String before = s.substring(0, s.indexOf(substr)+1);
String after = s.substring(s.indexOf(':')+1, s.lastIndexOf(','));
Of course, regular expressions give you more ways to do searching/matching, but assuming that the ":" is the key thing you are looking for (and it shows up exactly once in that position) then:
s.substring(s.indexOf(':')+1, s.lastIndexOf(',')).trim();
is the "most simple" and "least overhead" way of fetching that substring.
Hint: as you are searching for a single character, use a character as search pattern; not a string!
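If the line may contain more than one ':' or ',' (the sample text in the question does), a variant that searches for the marker text and then for the first comma after it may be safer. This is only a sketch; the class name MarkerSubstring and the method valueAfter are hypothetical.

```java
final class MarkerSubstring {
    // Hypothetical helper: returns the trimmed text between the marker and
    // the next comma, or null when the marker does not occur.
    static String valueAfter(String s, String marker) {
        int start = s.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        int end = s.indexOf(',', start);   // first comma after the marker
        if (end < 0) {
            end = s.length();              // no comma: take the rest of the line
        }
        return s.substring(start, end).trim();
    }

    public static void main(String[] args) {
        String s = "...,Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY:...,";
        System.out.println(valueAfter(s, "NOT IN CHARGE SINCE:")); // 03.2009
    }
}
```

The search is case-sensitive, so a line that was lower-cased first needs a lower-case marker.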
If you have a more generic use case and you know the structure of the text to be matched well, you might profit from using regular expressions:
Pattern pattern = Pattern.compile("NOT IN CHARGE SINCE: ([0-9.]*),");
Matcher matcher = pattern.matcher(line);
// find() must succeed before group() may be called; group 1 holds the date.
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
A more generic way to solve your problem is to use a regex that matches every group between ':' and ',':
Pattern pattern = Pattern.compile("(?<=:)(.*?)(?=,)");
Matcher m = pattern.matcher(str);
while (m.find()) {
    System.out.println(m.group().trim());
}
You have to create a pattern for it. Try this as a simple regex starting point, and feel free to improvise on it:
String s = "...,Niedersachsen,NOT IN CHARGE SINCE: 03.2009, CATEGORY:....,";
Pattern pattern = Pattern.compile(".*NOT IN CHARGE SINCE: ([\\d\\.]*).*");
Matcher matcher = pattern.matcher(s);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
That should get you whatever group of digits you received as date.

How do I take a string with a named group and replace only that named capture group with a value in Java 7

Say for example I have the following string with a named capture group:
/this/(?<capture1>.*)/a/string/(?<capture2>.*)
And I want to replace the capture group with a value like "foo" so that I end up with a string that looks like:
/this/foo/a/string/bar
Limitations are:
Regex must be used as the string is evaluated elsewhere but it doesn't have to be a capture group.
I'd rather not have to regex match the regex.
EDIT: There can be many groups in the string.
You can find the starting and ending index of each match:
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    int startindex = matcher.start();
    int stopindex = matcher.end();
    // Your code for replacing that range and generating a new string with foo.
    // You can use a StringBuffer to delete and insert the characters, since you know the indexes.
}
Full Implementation:
public static String getnewString(String text, String reg) {
    StringBuffer result = new StringBuffer(text);
    Pattern pattern = Pattern.compile(reg);
    Matcher matcher = pattern.matcher(text);
    // Indexes come from the original text, so track how far earlier
    // replacements have shifted the contents of result.
    int shift = 0;
    while (matcher.find()) {
        int startindex = matcher.start() + shift;
        int stopindex = matcher.end() + shift;
        System.out.println(startindex + " " + stopindex);
        result.replace(startindex, stopindex, "foo");
        shift += "foo".length() - (matcher.end() - matcher.start());
    }
    return result.toString();
}
Try this,
int lastIndex = s.lastIndexOf("/");
String newString = s.substring(0, lastIndex+1).concat("newString");
System.out.println(newString);
This takes the substring up to the last '/' and then appends the new string to it, as shown above.
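I got it:
String string = "/this/(?<capture1>.*)/a/string/(?<capture2>.*)";
Pattern pattern = Pattern.compile(string);
// Match the regex against its own source text: each group then captures
// its own "(?<name>.*)" construct.
Matcher matcher = pattern.matcher(string);
if (matcher.matches()) {
    // Strings are immutable, so keep the value returned by replace().
    string = string.replace(matcher.group("capture1"), "value 1")
                   .replace(matcher.group("capture2"), "value 2");
}
Crazy, but works.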
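If every group in the string is written exactly as (?<name>.*), a plain literal replacement may be enough and needs no regex matching of the regex. This is a sketch under that assumption only; the class name TemplateFill and the method fill are hypothetical. String.replace treats both arguments as literal text, so the parentheses and the question mark need no escaping.

```java
final class TemplateFill {
    // Hypothetical helper. Assumes the group is spelled exactly "(?<name>.*)";
    // a group with a different body (e.g. "[^/]+") would be left untouched.
    static String fill(String template, String groupName, String value) {
        return template.replace("(?<" + groupName + ">.*)", value);
    }

    public static void main(String[] args) {
        String template = "/this/(?<capture1>.*)/a/string/(?<capture2>.*)";
        String result = fill(fill(template, "capture1", "foo"), "capture2", "bar");
        System.out.println(result); // /this/foo/a/string/bar
    }
}
```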

Java regex matcher always returns false

I have a string expression from which I need to get some values. The string is as follows
#min({(((fields['example6'].value + fields['example5'].value) * ((fields['example1'].value*5)+fields['example2'].value+fields['example3'].value-fields['example4'].value)) * 0.15),15,9.087})
From this string, I need to obtain a string array list which contains values such as "example1", "example2" and so on.
I have a Java method which looks like this:
String regex = "/fields\\[['\"]([\\w\\s]+)['\"]\\]/g";
ArrayList<String> arL = new ArrayList<String>();
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(expression);
while(m.find()){
arL.add(m.group());
}
But m.find() always returns false. Is there anything I'm missing?
The problem is with the '/'s. If what you want to extract is only the field name, you should use m.group(1):
String regex = "fields\\[['\"]([\\w\\s]+)['\"]\\]";
ArrayList<String> arL = new ArrayList<String>();
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(expression);
while(m.find()){
arL.add(m.group(1));
}
The main issue you seem to have is that you are using delimiters (as in PHP or Perl or JavaScript) that cannot be used in a Java regex. Also, you have your matches in the first capturing group, but you are using group() that returns the whole match (including fields[').
Here is a working code:
String str = "#min({(((fields['example6'].value + fields['example5'].value) * ((fields['example1'].value*5)+fields['example2'].value+fields['example3'].value-fields['example4'].value)) * 0.15),15,9.087})";
ArrayList<String> arL = new ArrayList<String>();
String rx = "(?<=fields\\[['\"])[\\w\\s]*(?=['\"]\\])";
Pattern ptrn = Pattern.compile(rx);
Matcher m = ptrn.matcher(str);
while (m.find()) {
arL.add(m.group());
}
Here is a working IDEONE demo
Note that I have added look-arounds to extract just the texts between 's with group().
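For comparison, here is a self-contained sketch of the capturing-group variant, where group(1) returns only the field name while group() would return the whole match. The class name FieldNames, the method fieldNames, and the shortened sample expression are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FieldNames {
    // No leading or trailing '/' and no 'g' flag: Java regexes have no delimiters.
    private static final Pattern FIELD =
            Pattern.compile("fields\\[['\"]([\\w\\s]+)['\"]\\]");

    // Hypothetical helper: returns the names found inside fields['...'].
    static List<String> fieldNames(String expression) {
        List<String> names = new ArrayList<String>();
        Matcher m = FIELD.matcher(expression);
        while (m.find()) {
            // m.group() would be the whole match, e.g. fields['example6'];
            // m.group(1) is only the text inside the quotes.
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String expr = "fields['example6'].value + fields['example5'].value";
        System.out.println(fieldNames(expr)); // [example6, example5]
    }
}
```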

Java regular expression to validate and extract some values

I want to extract all three parts of the following string in Java
MS-1990-10
The first part should always be 2 letters (A-Z)
The second part should always be a year
The third part should always be a number
Does anyone know how can I do that using Java's regular expressions?
You can do this using Java's Pattern, Matcher and group syntax:
Pattern datePatt = Pattern.compile("([A-Z]{2})-(\\d{4})-(\\d{2})");
Matcher m = datePatt.matcher("MS-1990-10");
if (m.matches()) {
String g1 = m.group(1);
String g2 = m.group(2);
String g3 = m.group(3);
}
Use Matcher's group so you can get the patterns that actually matched.
In Matcher, the matches inside parentheses are captured and can be retrieved via the group() method. To use parentheses without capturing the match, use non-capturing parentheses (?:xxx).
See also Pattern.
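A small sketch of the difference: below, the prefix alternation is grouped with (?:...) so it does not receive a group number, and the year becomes group 1. The allowed prefixes MS and AA, the class name NonCapturingDemo, and the method yearOf are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NonCapturingDemo {
    // (?:MS|AA) groups the alternation without capturing it,
    // so the pattern has only two numbered groups: year and number.
    static final Pattern CODE = Pattern.compile("(?:MS|AA)-(\\d{4})-(\\d+)");

    // Hypothetical helper: returns the year part, or null if the code is invalid.
    static String yearOf(String code) {
        Matcher m = CODE.matcher(code);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(CODE.matcher("").groupCount()); // 2
        System.out.println(yearOf("MS-1990-10"));          // 1990
    }
}
```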
public static void main(String[] args) throws Exception {
String[] lines = { "MS-1990-10", "AA-999-12332", "ZZ-001-000" };
for (String str : lines) {
System.out.println(Arrays.toString(parse(str)));
}
}
private static String[] parse(String str) {
String regex = "";
regex = regex + "([A-Z]{2})";
regex = regex + "[-]";
// regex = regex + "([^0][0-9]+)"; // any year, no leading zero
regex = regex + "([12]{1}[0-9]{3})"; // 1000 - 2999
regex = regex + "[-]";
regex = regex + "([0-9]+)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if (!matcher.matches()) {
return null;
}
String[] tokens = new String[3];
tokens[0] = matcher.group(1);
tokens[1] = matcher.group(2);
tokens[2] = matcher.group(3);
return tokens;
}
This is a way to get all 3 parts with a regex:
public class Test {
public static void main(String... args) {
Pattern p = Pattern.compile("([A-Z]{2})-(\\d{4})-(\\d{2})");
Matcher m = p.matcher("MS-1990-10");
// group(i) throws IllegalStateException unless the match succeeded.
if (m.matches()) {
for (int i = 1; i <= m.groupCount(); i++)
System.out.println(m.group(i));
}
}
}
String rule = "^[A-Z]{2}-[1-9][0-9]{3}-[0-9]{2}";
Pattern pattern = Pattern.compile(rule);
Matcher matcher = pattern.matcher(s);
This regex matches a year between 1000 and 9999; you can adjust the range as you need.

Java Regular Expressions

I am trying to write something like this:
Pattern p = Pattern.compile("Mar\\w");
Matcher m = p.matcher("Mary");
String result = m.replaceAll("\\w");
The result would ideally be "y". Any ideas?
Your question is not so clear, but I think you want to use a lookahead:
Pattern p = Pattern.compile("Mar(?=\\w)");
Matcher m = p.matcher("Mary");
String result = m.replaceAll("");
See it online: ideone
Or you could use a capturing group:
Pattern p = Pattern.compile("Mar(\\w)");
Matcher m = p.matcher("Mary");
String result = m.replaceAll("$1");
See it online: ideone
