I'm trying to match repeating groups with Java:
String s = "The very first line\n"
+ "\n"
+ "AA (aa)\n"
+ "BB (bb)\n"
+ "CC (cc)\n"
+ "\n";
Pattern p = Pattern.compile(
"The very first line\\s+"
+ "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*)+",
Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
if (m.find()) {
for (int i = 0; i <= m.groupCount(); i++) {
System.out.println("group #" + i + ": [" + m.group(i).trim() + "]");
}
System.out.println("group gr1: [" + m.group("gr1").trim() + "]");
System.out.println("group gr2: [" + m.group("gr2").trim() + "]");
}
The problem is with the repeating groups: although the regex matches the whole text block (see group #0 in the output below), retrieving groups #2 and #3 (or gr1/gr2 by name) returns only the last match (CC/cc) and skips the previous ones (AA/aa and BB/bb):
group #0: [The very first line
AA (aa)
BB (bb)
CC (cc)]
group #1: [CC (cc)]
group #2: [CC]
group #3: [cc]
group gr1: [CC]
group gr2: [cc]
Is there a way to solve this?
Edit: "The very first line" is in the pattern as an identification string; see the comment on gknicker's answer below.
It seems like you wanted your pattern to match not the whole input string, but just the individual repeating sections. If that's true, your pattern would be:
Pattern p = Pattern.compile(
"((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
Pattern.CASE_INSENSITIVE);
Then in this case you would have a while loop to find each match:
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println("group gr1: ["
+ m.group("gr1").trim() + "]");
System.out.println("group gr2: ["
+ m.group("gr2").trim() + "]");
}
But if you need the whole match, you'll probably have to use two patterns like this:
String s = "The very first line\n"
+ "\n"
+ "AA (aa)\n"
+ "BB (bb)\n"
+ "CC (cc)\n"
+ "\n";
Pattern p = Pattern.compile(
"The very first line\\s+(([a-z]+)\\s+\\(([^)]+)\\)\\s*)+",
Pattern.CASE_INSENSITIVE);
Pattern p2 = Pattern.compile(
"((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
while (m.find()) {
Matcher m2 = p2.matcher(m.group());
while (m2.find()) {
System.out.println("group gr1: ["
+ m2.group("gr1").trim() + "]");
System.out.println("group gr2: ["
+ m2.group("gr2").trim() + "]");
}
}
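The reason only CC/cc shows up in the question is that a capturing group inside a repetition keeps only the text of its last iteration. As an alternative to the two patterns above, here is a minimal sketch that uses the \G anchor (the end of the previous match) so one pattern yields one match per repeated line while still requiring the identification string first. The class name RepeatedGroups and the helper pairs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroups {
    // Either the header starts the chain, or the match must begin exactly
    // where the previous match ended (\G). The (?!^) lookahead stops \G from
    // matching at the very start of the input, so the header stays mandatory.
    private static final Pattern P = Pattern.compile(
            "(?:The very first line\\s+|(?!^)\\G)"
            + "(?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns one {gr1, gr2} pair per repeated line.
    static List<String[]> pairs(String s) {
        List<String[]> result = new ArrayList<>();
        Matcher m = P.matcher(s);
        while (m.find()) {
            result.add(new String[] { m.group("gr1"), m.group("gr2") });
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "The very first line\n\nAA (aa)\nBB (bb)\nCC (cc)\n\n";
        for (String[] pair : pairs(s)) {
            System.out.println("gr1: [" + pair[0] + "], gr2: [" + pair[1] + "]");
        }
    }
}
```

If the input does not start with the identification string, the chain never starts and the list comes back empty.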
From the string below I need one group for the source (everything before TO) and one for the target (everything after TO). I also need a subgroup from the source, i.e. (3:8), all in a single regex pattern.
MOVE A (3:8) TO B.
It's difficult to guess exactly what is desired here, but maybe an expression similar to:
([^(]+(\(.+?\)))\s*TO\s*(.*)
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "([^(]+(\\(.+?\\)))\\s*TO\\s*(.*)";
final String string = "MOVE A (3:8) TO B\n";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
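The same expression can also be written with named groups, which makes it a bit easier to see which part is the source, the sub-range and the target. This is only a sketch: the group names source, range and target and the class MoveStatement are my own choices, not anything from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MoveStatement {
    // Same structure as the expression above, with names instead of numbers.
    private static final Pattern MOVE = Pattern.compile(
            "(?<source>[^(]+(?<range>\\(.+?\\)))\\s*TO\\s*(?<target>.*)",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns {source, range, target}, or null if no match.
    static String[] parse(String line) {
        Matcher m = MOVE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("source"), m.group("range"), m.group("target") };
    }

    public static void main(String[] args) {
        String[] parts = parse("MOVE A (3:8) TO B.");
        System.out.println("source: " + parts[0]);
        System.out.println("range:  " + parts[1]);
        System.out.println("target: " + parts[2]);
    }
}
```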
I have encrypted text wrapped in brackets, and I'm trying to get only the text inside: [|kXS6k~R5I~Q5gHR&f3gzJ[X] --> |kXS6k~R5I~Q5gHR&f3gzJ[X
I found the pattern [\[\](){}]. It works, but it splits only up to the first bracket, or, if there are parentheses, it splits the text up to them.
Thanks
You can try this: "\[(.*?)\]". Don't forget to escape the backslashes in your Java string literals (both in the regex and in the input), otherwise the code will not compile:
String regex = "\\[(.*?)\\]";
String string = "[AA{R7QHQ8onQ~QXR7UXQzM\\e{J6Y]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
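If the whole string is always one bracketed value, as in the question, a variation is to match the entire input and take everything between the first "[" and the last "]". That way brackets inside the encrypted text don't cut the result short. A minimal sketch, assuming exactly that input shape; BracketUnwrap and unwrap are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketUnwrap {
    // Greedy (.*) plus matches(): the outermost pair of brackets is stripped.
    private static final Pattern WRAPPED = Pattern.compile("\\[(.*)\\]");

    // Returns the text inside the outer brackets, or the input unchanged
    // when it is not wrapped in brackets at all.
    static String unwrap(String s) {
        Matcher m = WRAPPED.matcher(s);
        return m.matches() ? m.group(1) : s;
    }

    public static void main(String[] args) {
        System.out.println(unwrap("[|kXS6k~R5I~Q5gHR&f3gzJ[X]"));
    }
}
```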
I have written a regex to match following pattern:
Any characters followed by hyphen followed by number followed by space followed by an optional case insensitive keyword followed by space followed by any char.
E.g.,
TXT-234 #comment anychars
TXT-234 anychars
The regular expression I have written is as follows:
(?<issueKey>^((\\s*[a-zA-Z]+-\\d+)\\s+)+)((?i)?<keyWord>#comment)?\\s+(?<comment>.*)
But the above doesn't handle the zero-occurrence case of '#comment', even though I have specified '?' for that part of the regular expression. Case 2 in the example above always fails, while case 1 succeeds.
What am I doing wrong?
#comment won't match #keyword, which is why you don't get a match. Try this one; it should work:
([a-zA-Z]*-\\d*\\s(((?i)#comment|#transition|#keyword)+\\s)?[a-zA-Z]*)
This may help:
String str = "1. TXT-234 #comment anychars";
String str2 = "2. TXT-234 anychars";
String str3 = "3. TXT-2a34 anychars";
String str4 = "4. TXT.234 anychars";
Pattern pattern = Pattern.compile("([a-zA-Z]*-\\d*\\s(#[a-zA-Z]+\\s)?[a-zA-Z]*)");
Matcher m = pattern.matcher(str);
if (m.find()) {
System.out.println("Found value: " + m.group(0));
System.out.println("Found value: " + m.group(1));
System.out.println("Found value: " + m.group(2));
}
m = pattern.matcher(str2);
if (m.find()) {
System.out.println("Found value: " + m.group(0));
}
m = pattern.matcher(str3);
if (m.find()) {
System.out.println("Found value: " + m.group(0));
} else {
System.out.println("str3 not match");
}
m = pattern.matcher(str4);
if (m.find()) {
System.out.println("Found value: " + m.group(0));
} else {
System.out.println("str4 not match");
}
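One thing that stands out in the pattern from the question is that it asks for whitespace twice: once inside the issueKey group and again after the optional keyword. A line with a single space and no keyword can't satisfy both. Below is a sketch that ties the second whitespace to the keyword, so both disappear together. The group names are the ones from the question; the class IssueLine and the method parse are hypothetical, and I'm assuming a single issue key per line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IssueLine {
    // (?i:...) makes only the keyword case-insensitive.
    // The keyword and its trailing whitespace are optional as one unit.
    private static final Pattern LINE = Pattern.compile(
            "^(?<issueKey>[a-zA-Z]+-\\d+)\\s+"
            + "(?:(?<keyWord>(?i:#comment))\\s+)?"
            + "(?<comment>.*)$");

    // Returns {issueKey, keyWord (null when absent), comment}, or null if no match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("issueKey"), m.group("keyWord"), m.group("comment") };
    }

    public static void main(String[] args) {
        for (String line : new String[] { "TXT-234 #comment anychars", "TXT-234 anychars" }) {
            String[] p = parse(line);
            System.out.println(p[0] + " | " + p[1] + " | " + p[2]);
        }
    }
}
```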
I have the equation √(A&B)=|C|.
After splitting it, I get this value:
[√, (, A&B, ),=,|, C,|]
How can I get a value like this instead?
[√(, A&B, ),=,|, C,|]
This is my code:
return teks.split(""
+ "((?<=\\ )|(?=\\ ))|"
+ "((?<=\\!)|(?=\\!))|"
+ "((?<=\\√\\()|(?=\\√\\())|" //this is my problem
+ "((?<=\\√)|(?=\\√))|" //and this
+ "((?<=\\∛)|(?=\\∛))|"
+ "((?<=\\/)|(?=\\|))"
+ "((?<=\\&)|(?=\\&))"
+ "");
}
Try Matcher.find() with the following regexp:
String s = "√(A&B)=|C|";
Matcher m = Pattern.compile("("
+ "(√\\()"
+ "|(\\))"
+ "|(\\w(\\&\\w)*)"
+ "|(=)"
+ "|(\\|)"
+ ")").matcher(s);
ArrayList<String> r = new ArrayList<>();
while(m.find())
r.add(m.group(1));
System.out.printf("%s", r.toString());
Result:
[√(, A&B, ), =, |, C, |]
Update:
Or, if any symbol before an opening parenthesis (except "=") should be treated as a single token together with that "(":
String s = "√(A&(B&C))=(|C| & (! D))";
Matcher m = Pattern.compile("("
+ "[^\\s=]?\\(" // capture opening bracket with modifier (if any)
// you can replace it with "[√]?\\(", if only
// "√" symbol should go in conjunction with brace
+ "|\\)" // capture closing bracket
+ "|\\w" // capture identifiers
+ "|[=!\\&\\|]" // capture symbols "=", "!", "&" and "|"
+ ")").matcher(s.replaceAll("\\s", ""));
ArrayList<String> r = new ArrayList<>();
while(m.find())
r.add(m.group(1));
System.out.printf("%s -> %s\n", s, r.toString().replaceAll(", ", ",")); // ArrayList joins its elements with ", ", so remove the extra space
Result:
√(A&(B&C))=(|C| & (! D)) -> [√(,A,&(,B,&,C,),),=,(,|,C,|,&(,!,D,),)]
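If you would rather stay with String.split, as in the question, one option is to split after "(" instead of around "√", so the two stay glued together. This is only a sketch for the symbols in the first sample, and it attaches "(" to whatever comes directly before it. SplitTokens and tokens are made-up names, and the input uses the \u221A escape for "√" just to keep the source file ASCII.

```java
import java.util.Arrays;
import java.util.List;

class SplitTokens {
    // Zero-width split points: nothing is consumed, the string is only cut.
    static List<String> tokens(String expr) {
        return Arrays.asList(expr.split(
                "(?<=\\()"                  // after "(" (keeps the root sign with it)
                + "|(?=\\))|(?<=\\))"       // before and after ")"
                + "|(?=[=])|(?<=[=])"       // before and after "="
                + "|(?=\\|)|(?<=\\|)"));    // before and after "|"
    }

    public static void main(String[] args) {
        System.out.println(tokens("\u221A(A&B)=|C|"));
    }
}
```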
I've been trying to convert the mathematical function x^2*sqrt(x^3) into pow(x,2)*Math.sqrt(pow(x,3)).
This is the regex and its replacement:
/([0-9a-zA-Z\.\(\)]*)^([0-9a-zA-Z\.\(\)]*)/ pow(\1,\2)
It works in Ruby, but I can't find a way to do it in Java. I tried this:
String function= "x^2*sqrt(x^3)";
Pattern p = Pattern.compile("([a-z0-9]*)^([a-z0-9]*)");
Matcher m = p.matcher(function);
String out = function;
if(m.find())
{
System.out.println("GRUPO 0:" + m.group(0));
System.out.println("GRUPO 1:" + m.group(1));
out = m.replaceFirst("pow(" + m.group(0) + ", " + m.group(1) + ')');
}
String funcformat = out;
funcformat = funcformat.replaceAll("sqrt\\(([^)]*)\\)", "Math.sqrt($1)");
System.out.println("Return Value :"+ function );
System.out.print("Return Value :"+ funcformat );
But it still doesn't work. The output is pow(x, )^2*Math.sqrt(x^3); as I said before, it should be pow(x,2)*Math.sqrt(pow(x,3)).
Thank you!!
As others have commented, regex is not the way to go. You should use a parser. But if you want something quick and dirty:
From Matcher:
Capturing groups are indexed from left to right, starting at one.
Group zero denotes the entire pattern, so the expression m.group(0)
is equivalent to m.group().
So you need to use m.group(1) and m.group(2). And escape the caret ^ in your regex.
import java.util.regex.*;
public class Replace {
public static void main(String[] args) {
String function= "x^2*sqrt(x^3)";
Pattern p = Pattern.compile("([a-z0-9]*)\\^([0-9]*)");
Matcher m = p.matcher(function);
String out = function;
if (m.find()) {
System.out.println("GRUPO 1:" + m.group(1));
System.out.println("GRUPO 2:" + m.group(2));
out = m.replaceFirst("pow(" + m.group(1) + ", " + m.group(2) + ')');
}
String funcformat = out;
funcformat = funcformat.replaceAll("sqrt\\(([a-z0-9]*)\\^([0-9]*)\\)", "Math.sqrt(pow($1, $2))");
System.out.println("Return Value :"+ function );
System.out.print("Return Value :"+ funcformat );
}
}
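For what it's worth, the Ruby-style \1 and \2 back-references are written $1 and $2 in a Java replacement string, so the power rewrite can be done for every occurrence with a single replaceAll. The sketch below assumes the operands are simple names or numbers (no parenthesized sub-expressions); PowRewrite and rewrite are hypothetical names.

```java
class PowRewrite {
    static String rewrite(String function) {
        return function
                // base^exponent -> pow(base,exponent), for every occurrence
                .replaceAll("([a-z0-9.]+)\\^([a-z0-9.]+)", "pow($1,$2)")
                // plain literal replacement, no regex needed here
                .replace("sqrt(", "Math.sqrt(");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("x^2*sqrt(x^3)"));
    }
}
```

Anything more nested than this, such as (x+1)^2, is where a real parser becomes necessary.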