Java Regex for Unicode or special character - java

I have the following statement in JavaScript (Node.js):
const name = (name) =>
name && !XRegExp('^[\\p{L}\'\\d][ \\p{L}\'\\d-]*[\\p{L}\'-\'\\d]$').test(name)
? 'Invalid' : undefined
This regex validates a name: it may contain letters, digits, apostrophes, hyphens and spaces, and it must not start with a space or a hyphen.
How can I achieve the same validation with a regex in Java? I tried the following:
@Pattern(regexp = "^(?U)[\\p{L}\\'\\d][ \\p{L}\\'\\d-]*[\\p{L}\\'-\\'\\d]$" ,
message="Invalid name")
String name;

Based on the expression you provided, the following might work:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexDemo {
    public static void main(String[] args) {
        final String regex = "^[\\p{L}\\d'][ \\p{L}\\d'-]*[\\p{L}\\d'-]$";
        final String string = "éééééé";
        // UNICODE_CHARACTER_CLASS makes \d and similar classes Unicode-aware
        final Pattern pattern = Pattern.compile(regex, Pattern.UNICODE_CHARACTER_CLASS);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
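For a yes/no validation like the one in the question, a small helper built on `matches()` may be more convenient than looping with `find()`. The following is only a sketch: the class name `NameValidator` and the method `isValidName` are made up for illustration, and the character classes are copied from the expression above.

```java
import java.util.regex.Pattern;

public class NameValidator {
    // Same character classes as the expression above. (?U) is the inline form of
    // Pattern.UNICODE_CHARACTER_CLASS, so the digit class also accepts non-ASCII digits.
    private static final Pattern NAME =
            Pattern.compile("(?U)[\\p{L}\\d'][ \\p{L}\\d'-]*[\\p{L}\\d'-]");

    // matches() must consume the whole input, so ^ and $ are not required.
    public static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("Jos\u00e9 O'Neil-Smith"));
        System.out.println(isValidName("-Smith")); // starts with a hyphen
        System.out.println(isValidName("A"));      // only one character
    }
}
```

Note that, like the original JavaScript expression, this pattern requires at least two characters, because the first and last character classes are both mandatory.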

Related

How to create multiple groups in a regex?

From the string below I need one group for the source (the part before TO) and one for the target (the part after TO). I also need a subgroup of the source, i.e. (3:8), all in a single regex pattern.
MOVE A (3:8) TO B.
It's not entirely clear what is desired here, but an expression similar to this might work:
([^(]+(\(.+?\)))\s*TO\s*(.*)
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexDemo {
    public static void main(String[] args) {
        final String regex = "([^(]+(\\(.+?\\)))\\s*TO\\s*(.*)";
        final String string = "MOVE A (3:8) TO B\n";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            // Group 1 is the source, group 2 the range inside it, group 3 the target
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
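Numbered groups become hard to follow once they are nested, so named groups may be easier to read here. The sketch below assumes statements of the form `MOVE <name> (<range>) TO <name>`, with the range being optional; the class name `MoveParser` and the group names `source`, `range` and `target` are chosen purely for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MoveParser {
    // Named groups: source identifier, optional parenthesized range, target identifier.
    private static final Pattern MOVE = Pattern.compile(
            "MOVE\\s+(?<source>\\w+)\\s*(?<range>\\([^)]*\\))?\\s+TO\\s+(?<target>\\w+)");

    // Returns {source, range, target}, or null when the statement does not match.
    // The range element is null when the statement has no parenthesized part.
    public static String[] parse(String statement) {
        Matcher m = MOVE.matcher(statement);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("source"), m.group("range"), m.group("target") };
    }

    public static void main(String[] args) {
        String[] parts = parse("MOVE A (3:8) TO B.");
        System.out.println("source=" + parts[0] + " range=" + parts[1] + " target=" + parts[2]);
    }
}
```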

How to extract the alias from a table with regex?

I need to extract the table aliases from a SQL string in Java using a regex.
For example, for this string: FROM lineitem A INNER JOIN orders B ON (B.O_ORDERKEY = A.L_ORDERKEY) INNER JOIN customer C ON (C.C_CUSTKEY = B.O_ORDERKEY) I should get A, B, C.
How can I do this using regex in java?
If all our inputs are similar to the question, we would start with a simple expression and test it later with more inputs:
\s+([A-Z])\s+
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexDemo {
    public static void main(String[] args) {
        final String regex = "\\s+([A-Z])\\s+";
        final String string = "FROM lineitem A INNER JOIN orders B ON (B.O_ORDERKEY = A.L_ORDERKEY) INNER JOIN customer C ON (C.C_CUSTKEY = B.O_ORDERKEY)";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            // Group 1 holds the single-letter alias
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
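The expression above only finds single uppercase letters surrounded by whitespace. If aliases can be longer, one possible variation is to anchor on the FROM and JOIN keywords instead. This is a sketch under a strong assumption: every table in the query has an alias (otherwise the word after the table name would be captured by mistake). The class name `AliasExtractor` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AliasExtractor {
    // FROM or JOIN, then the table name, an optional AS, then the alias (group 1).
    private static final Pattern TABLE_ALIAS = Pattern.compile(
            "\\b(?:FROM|JOIN)\\s+\\w+\\s+(?:AS\\s+)?(\\w+)", Pattern.CASE_INSENSITIVE);

    public static List<String> aliases(String sql) {
        List<String> result = new ArrayList<>();
        Matcher m = TABLE_ALIAS.matcher(sql);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String sql = "FROM lineitem A INNER JOIN orders B ON (B.O_ORDERKEY = A.L_ORDERKEY) "
                + "INNER JOIN customer C ON (C.C_CUSTKEY = B.O_ORDERKEY)";
        System.out.println(aliases(sql));
    }
}
```

A regex like this does not understand SQL syntax (subqueries, schema-qualified names, quoted identifiers), so a real SQL parser is likely the safer choice for anything beyond simple inputs.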

parse filename using string [java]

What regex pattern do I need to parse a filename like this: "Ab12_Cd9023-2000-12-04-No234.nekiRtt3434GGG", where the parsed elements are: "Ab12_Cd9023"(name), "2000"(year), "12"(month), "04"(day), "234"(number), "nekiRtt3434GGG"(suffix). The sequence is always the same: name-yyyy-MM-dd-NoNN.suffix.
I want to use the pattern + matcher objects to solve that.
This is the nicest-looking solution I found:
private static final Pattern PATTERN = Pattern.compile("^(?<name>\\w+)-"
        + "(?<year>\\d{4})-"
        + "(?<month>\\d{2})-"
        + "(?<day>\\d{2})-"
        + "No(?<number>\\d+)\\."   // escaped so it matches a literal dot only
        + "(?<suffix>\\w+)$");
Matcher m = PATTERN.matcher(file.getName());
if (m.matches()) {
    // this is how you access the parsed strings:
    String year = m.group("year");
} else {
    // some code if the pattern doesn't match
}
This regex should do the trick:
([a-zA-Z0-9_]+)-([0-9]{4})-([0-9]{2})-([0-9]{2})-No(.+)\.(.+)$
If you use this as pattern, each of the () signifies one part of the string you want to capture.
This could work:
// The backslash must be doubled in a Java string literal: "\\." is the regex \.
private static final Pattern PATTERN = Pattern.compile("^(.+)-([0-9]{4})-([0-9]{2})-([0-9]{2})-No(.+)\\.(.+)$");
...
Matcher matcher = PATTERN.matcher(string);
if (matcher.matches()) {
    String name = matcher.group(1);
    int year = Integer.parseInt(matcher.group(2));
    int month = Integer.parseInt(matcher.group(3));
    int day = Integer.parseInt(matcher.group(4));
    String number = matcher.group(5);
    String suffix = matcher.group(6);
    System.out.println("name: " + name);
    System.out.println("year: " + year);
    System.out.println("month: " + month);
    System.out.println("day: " + day);
    System.out.println("number: " + number);
    System.out.println("suffix: " + suffix);
} else {
    // error: does not match
}
If the sequence is always the same, why not simply split it on - or . like this (note that the number part keeps its "No" prefix, i.e. "No234"):
String filename = "Ab12_Cd9023-2000-12-04-No234.nekiRtt3434GGG";
String[] parts = filename.split("-|\\.");
for (String p : parts) {
    System.out.println(p);
}
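The split approach can be sketched a little further so that it rejects unexpected input and drops the "No" prefix. This assumes that the name and suffix never contain `-` or `.` themselves; the class name `FileNameParts` is made up for this example.

```java
public class FileNameParts {
    // Returns {name, year, month, day, number, suffix}.
    // Assumes neither the name nor the suffix contains '-' or '.'.
    public static String[] parse(String filename) {
        String[] parts = filename.split("[-.]");
        if (parts.length != 6 || !parts[4].startsWith("No")) {
            throw new IllegalArgumentException("Unexpected filename: " + filename);
        }
        parts[4] = parts[4].substring(2); // drop the "No" prefix
        return parts;
    }

    public static void main(String[] args) {
        for (String p : parse("Ab12_Cd9023-2000-12-04-No234.nekiRtt3434GGG")) {
            System.out.println(p);
        }
    }
}
```

Unlike the regex versions, this does not check that the year, month and day are digits, so the Pattern-based answers are stricter.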

How to extract a number from string using regex in java

I have tried using
title.substring(title.lastIndexOf("(") + 1, title.indexOf(")"));
I only want to extract the year, e.g. 1899.
It works well for a string like "hadoop (1899)" but throws an error for the string "hadoop(yarn)(1980)".
Simply replace everything except the digits within the parentheses, using a regex:
String foo = "hadoop (1899)"; // or "hadoop(yarn)(1980)"
System.out.println(foo.replaceAll(".*\\((\\d+)\\).*", "$1"));
Check this example. The regex uses lookarounds to extract numbers surrounded by parentheses, so only the digits are part of the match.
Here is code you can use:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexDemo {
    public static void main(String[] args) {
        final String regex = "(?<=\\()\\d+(?=\\))";
        final String string = "\"hadoop (1899)\" \"hadoop(yarn)(1980)\"";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            // The pattern has no capturing groups, so this loop prints nothing
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
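One caveat with the `replaceAll` approach is that it returns the input unchanged when nothing matches, which can be mistaken for a result. A possible alternative, sketched here with the hypothetical names `YearExtractor` and `lastNumberInParens`, is to return an `Optional` holding the last parenthesized number.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YearExtractor {
    // One or more digits wrapped directly in parentheses; group 1 is the digits.
    private static final Pattern PAREN_NUMBER = Pattern.compile("\\((\\d+)\\)");

    // Returns the last "(digits)" occurrence, or an empty Optional if there is none.
    public static Optional<String> lastNumberInParens(String title) {
        Matcher m = PAREN_NUMBER.matcher(title);
        String last = null;
        while (m.find()) {
            last = m.group(1);
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        System.out.println(lastNumberInParens("hadoop (1899)"));
        System.out.println(lastNumberInParens("hadoop(yarn)(1980)"));
        System.out.println(lastNumberInParens("hadoop(yarn)"));
    }
}
```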

Split encrypted text within brackets in java

I have encrypted text wrapped in brackets, and I'm trying to get only the text inside: [|kXS6k~R5I~Q5gHR&f3gzJ[X] --> |kXS6k~R5I~Q5gHR&f3gzJ[X
I found the pattern [\[\](){}]. It works, but it splits at the first bracket, and if the text contains parentheses it splits at those as well.
thanks
You can try this: "\[(.*?)\]". Don't forget to escape the backslashes in your Java string literals, otherwise the code will not compile:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexDemo {
    public static void main(String[] args) {
        // Raw text: [AA{R7QHQ8onQ~QXR7UXQzM\e{J6Y]  (the backslash is doubled below)
        String regex = "\\[(.*?)\\]";
        String string = "[AA{R7QHQ8onQ~QXR7UXQzM\\e{J6Y]";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            // Group 1 is the text between the brackets
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
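Because encrypted text may itself contain `]`, the lazy `.*?` can stop too early. If the whole input is exactly one bracketed value, a greedy match over the entire string may be more robust. This is a sketch under that assumption; `BracketUnwrap` and `unwrap` are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketUnwrap {
    // Greedy .* with matches(): only the first '[' and the last ']' are treated as wrappers.
    // DOTALL lets the payload contain line breaks.
    private static final Pattern WRAPPED = Pattern.compile("\\[(.*)\\]", Pattern.DOTALL);

    // Returns the text between the outer brackets, or the input unchanged if it is not wrapped.
    public static String unwrap(String text) {
        Matcher m = WRAPPED.matcher(text);
        return m.matches() ? m.group(1) : text;
    }

    public static void main(String[] args) {
        System.out.println(unwrap("[|kXS6k~R5I~Q5gHR&f3gzJ[X]"));
        System.out.println(unwrap("[a]b]"));
    }
}
```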
