How to create multiple groups in a regex? - java

From the string below I need one group for the source (the part before TO) and one for the target (the part after TO). I also need a subgroup of the source, i.e. (3:8), all in a single regex pattern.
MOVE A (3:8) TO B.

It's difficult to guess what exactly is desired here, but maybe an expression similar to this one would work:
([^(]+(\(.+?\)))\s*TO\s*(.*)
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "([^(]+(\\(.+?\\)))\\s*TO\\s*(.*)";
final String string = "MOVE A (3:8) TO B\n";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
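If only the operands are wanted (source without the MOVE keyword, target without the trailing period), named groups can make the result easier to read. The following is a minimal sketch, not the answer's pattern: it assumes the statement starts with the keyword MOVE and that operands consist of word characters. The class name, method name, and group names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MoveStatementDemo {
    // Hypothetical pattern: assumes a leading MOVE keyword and word-character operands.
    static final Pattern MOVE = Pattern.compile(
        "MOVE\\s+(?<source>\\w+\\s*\\((?<range>[^)]*)\\))\\s+TO\\s+(?<target>\\w+)",
        Pattern.CASE_INSENSITIVE);

    // Returns {source, range, target}, or null when the statement does not match.
    static String[] parse(String statement) {
        Matcher m = MOVE.matcher(statement);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("source"), m.group("range"), m.group("target") };
    }

    public static void main(String[] args) {
        String[] parts = parse("MOVE A (3:8) TO B.");
        // prints: source=A (3:8), range=3:8, target=B
        System.out.println("source=" + parts[0] + ", range=" + parts[1] + ", target=" + parts[2]);
    }
}
```

The nested group `range` sits inside `source`, so one match yields both the whole source and its sub-range.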

Related

Java Regex for Unicode or special character

I have the statement below in JavaScript (Node.js):
const name = (name) =>
name && !XRegExp('^[\\p{L}\'\\d][ \\p{L}\'\\d-]*[\\p{L}\'-\'\\d]$').test(name)
? 'Invalid' : undefined
This regex validates a name: the name may contain . , - and spaces, and should start with a character.
How can I achieve the same validation with a regex in Java? I tried the following:
@Pattern(regexp = "^(?U)[\\p{L}\\'\\d][ \\p{L}\\'\\d-]*[\\p{L}\\'-\\'\\d]$" ,
message="Invalid name")
String name;
I'm guessing that maybe this expression might work, based on the one you have provided:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "^[\\p{L}\\d'][ \\p{L}\\d'-]*[\\p{L}\\d'-]$";
final String string = "éééééé";
final Pattern pattern = Pattern.compile(regex, Pattern.UNICODE_CHARACTER_CLASS);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
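For validation, a yes/no check over the whole input is usually more convenient than looping over matches. Here is a small sketch that wraps the expression above in a helper using `Matcher.matches()`. The class and method names are hypothetical. Note that, as written, the expression needs at least two characters and does not accept `.` or `,`.

```java
import java.util.regex.Pattern;

class NameValidator {
    // Same expression as in the answer above; \p{L} matches any Unicode letter.
    static final Pattern NAME = Pattern.compile(
        "^[\\p{L}\\d'][ \\p{L}\\d'-]*[\\p{L}\\d'-]$",
        Pattern.UNICODE_CHARACTER_CLASS);

    // True only when the entire string satisfies the pattern.
    static boolean isValid(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Anne-Marie O'Neil")); // true
        System.out.println(isValid("Jos\u00e9"));         // true (accented letter)
        System.out.println(isValid("-Bob"));              // false (leading hyphen)
    }
}
```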

How to extract the alias from a table with regex?

I need to get the aliases of the tables in a query, in Java, using a regex.
For example, for this string: FROM lineitem A INNER JOIN orders B ON (B.O_ORDERKEY = A.L_ORDERKEY) INNER JOIN customer C ON (C.C_CUSTKEY = B.O_ORDERKEY) I should get A, B, C.
How can I do this using regex in java?
If all our inputs are similar to the question, we would start with a simple expression and test it later with more inputs:
\s+([A-Z])\s+
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "\\s+([A-Z])\\s+";
final String string = "FROM lineitem A INNER JOIN orders B ON (B.O_ORDERKEY = A.L_ORDERKEY) INNER JOIN customer C ON (C.C_CUSTKEY = B.O_ORDERKEY)";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
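The expression above only finds single uppercase letters surrounded by whitespace. A slightly stricter sketch ties the alias to its position after FROM or JOIN and collects the results into a list. It assumes that every table in the query carries an alias (optionally introduced by AS); a table without one would produce a wrong result. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TableAliasExtractor {
    // Group 1 is the table name, group 2 the alias that follows it.
    // Assumption: every table is followed by an alias.
    static final Pattern TABLE_WITH_ALIAS = Pattern.compile(
        "\\b(?:FROM|JOIN)\\s+(\\w+)\\s+(?:AS\\s+)?(\\w+)",
        Pattern.CASE_INSENSITIVE);

    static List<String> aliases(String sql) {
        List<String> result = new ArrayList<>();
        Matcher m = TABLE_WITH_ALIAS.matcher(sql);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String sql = "FROM lineitem A INNER JOIN orders B ON (B.O_ORDERKEY = A.L_ORDERKEY) "
            + "INNER JOIN customer C ON (C.C_CUSTKEY = B.O_ORDERKEY)";
        System.out.println(aliases(sql)); // [A, B, C]
    }
}
```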

How to extract a number from string using regex in java

I have tried using
title.substring(title.lastIndexOf("(") + 1, title.indexOf(")"));
I only want to extract the year, like 1899.
It works well for a string like "hadoop (1899)" but throws an error for the string "hadoop(yarn)(1980)".
Simply replace everything except the digits within the parentheses, using a regex:
String foo = "hadoop (1899)"; // or "hadoop(yarn)(1980)"
System.out.println(foo.replaceAll(".*\\((\\d+)\\).*", "$1"));
Here is a regex that extracts numbers surrounded by parentheses, along with code you can use:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "(?<=\\()\\d+(?=\\))";
final String string = "\"hadoop (1899)\" \"hadoop(yarn)(1980)\"";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
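One caveat with the `replaceAll` approach is that it returns the input unchanged when nothing matches. A helper that reports "no year found" explicitly might look like the sketch below. It assumes the year is a four-digit number in parentheses and, if there are several, takes the last one. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YearExtractor {
    // Assumption: a year is exactly four digits enclosed in parentheses.
    static final Pattern YEAR = Pattern.compile("\\((\\d{4})\\)");

    // Returns the last parenthesised four-digit number, or empty if there is none.
    static Optional<Integer> extractYear(String title) {
        Matcher m = YEAR.matcher(title);
        Integer year = null;
        while (m.find()) {
            year = Integer.valueOf(m.group(1));
        }
        return Optional.ofNullable(year);
    }

    public static void main(String[] args) {
        System.out.println(extractYear("hadoop (1899)"));      // Optional[1899]
        System.out.println(extractYear("hadoop(yarn)(1980)")); // Optional[1980]
        System.out.println(extractYear("hadoop"));             // Optional.empty
    }
}
```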

Split encrypted text within brackets in java

I have encrypted text wrapped in brackets, and I'm trying to get only the text: [|kXS6k~R5I~Q5gHR&f3gzJ[X] --> |kXS6k~R5I~Q5gHR&f3gzJ[X
I found this pattern, [\[\](){}]. It works, but it splits only up to the first bracket, and if there are parentheses it splits the text up to them.
Thanks
You can try this: "\[(.*?)\]". Don't forget to escape the backslashes in your Java string literals, otherwise the code will not compile:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Raw input: [AA{R7QHQ8onQ~QXR7UXQzM\e{J6Y] (the backslash is doubled in the literal below)
String regex = "\\[(.*?)\\]";
String string = "[AA{R7QHQ8onQ~QXR7UXQzM\\e{J6Y]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
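Because the lazy `.*?` stops at the first `]`, it would cut the result short if the encrypted text itself contained a closing bracket. If the whole string is a single bracketed value, an anchored greedy pattern strips only the outermost pair. This is a sketch under that assumption, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketStripper {
    // Anchored and greedy: group 1 is everything between the first '[' and the last ']'.
    // DOTALL lets '.' also match line terminators inside the payload.
    static final Pattern OUTER = Pattern.compile("^\\[(.*)\\]$", Pattern.DOTALL);

    // Returns the text without its outer brackets, or the input unchanged if it has none.
    static String stripOuterBrackets(String text) {
        Matcher m = OUTER.matcher(text);
        return m.matches() ? m.group(1) : text;
    }

    public static void main(String[] args) {
        System.out.println(stripOuterBrackets("[|kXS6k~R5I~Q5gHR&f3gzJ[X]")); // |kXS6k~R5I~Q5gHR&f3gzJ[X
        System.out.println(stripOuterBrackets("[a]b]"));                      // a]b
    }
}
```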

How to match nested repeating groups with regex in Java?

I'm trying to match repeating groups with Java:
String s = "The very first line\n"
        + "\n"
        + "AA (aa)\n"
        + "BB (bb)\n"
        + "CC (cc)\n"
        + "\n";
Pattern p = Pattern.compile(
        "The very first line\\s+"
        + "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*)+",
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
if (m.find()) {
    for (int i = 0; i <= m.groupCount(); i++) {
        System.out.println("group #" + i + ": [" + m.group(i).trim() + "]");
    }
    System.out.println("group gr1: [" + m.group("gr1").trim() + "]");
    System.out.println("group gr2: [" + m.group("gr2").trim() + "]");
}
The problem is with the repeating groups: although the regex matches the whole text block (see group #0 in the output below), retrieving groups #2 and #3 (or gr1/gr2 by name) returns only the last match (CC/cc) and skips the previous ones (AA/aa and BB/bb):
group #0: [The very first line
AA (aa)
BB (bb)
CC (cc)]
group #1: [CC (cc)]
group #2: [CC]
group #3: [cc]
group gr1: [CC]
group gr2: [cc]
Is there a way to solve this?
Edit: "The very first line" is in the pattern as an identification string; see the comment on gknicker's answer below.
It seems like you wanted your pattern to match not the whole input string, but just the individual repeating sections. If that's true, your pattern would be:
Pattern p = Pattern.compile(
        "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
        Pattern.CASE_INSENSITIVE);
Then in this case you would have a while loop to find each match:
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println("group gr1: ["
            + m.group("gr1").trim() + "]");
    System.out.println("group gr2: ["
            + m.group("gr2").trim() + "]");
}
But if you need the whole match, you'll probably have to use two patterns like this:
String s = "The very first line\n"
        + "\n"
        + "AA (aa)\n"
        + "BB (bb)\n"
        + "CC (cc)\n"
        + "\n";
Pattern p = Pattern.compile(
        "The very first line\\s+(([a-z]+)\\s+\\(([^)]+)\\)\\s*)+",
        Pattern.CASE_INSENSITIVE);
Pattern p2 = Pattern.compile(
        "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
        Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
while (m.find()) {
    Matcher m2 = p2.matcher(m.group());
    while (m2.find()) {
        System.out.println("group gr1: ["
                + m2.group("gr1").trim() + "]");
        System.out.println("group gr2: ["
                + m2.group("gr2").trim() + "]");
    }
}
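As an alternative to two patterns, a single pattern can walk the repeated entries with the `\G` boundary matcher, which matches at the end of the previous match. This is only a sketch of that technique, not part of the answer above: it assumes the entries follow the identification line contiguously, and the class and method names are hypothetical. The `(?!^)` guard keeps `\G` from matching at the very start of the input, so entries without the identification line are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatingGroupsDemo {
    // Each find() consumes either the header plus the first entry,
    // or one further entry that starts exactly where the previous match ended.
    static final Pattern ENTRY = Pattern.compile(
        "(?:The very first line\\s+|(?!^)\\G)(?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*",
        Pattern.CASE_INSENSITIVE);

    // Collects every entry as "gr1=gr2".
    static List<String> entries(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            result.add(m.group("gr1") + "=" + m.group("gr2"));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "The very first line\n\nAA (aa)\nBB (bb)\nCC (cc)\n\n";
        System.out.println(entries(s)); // [AA=aa, BB=bb, CC=cc]
    }
}
```

Each `find()` exposes one entry's groups, so nothing is overwritten the way it is inside a `(...)+` repetition.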
