Regular expression that does not contain a list of words - Java

I'm trying to create a regular expression that matches a certain word only when it is not preceded by some specific words, like this:
(?<!(state|government|head).*)of
For example:
state of -> not match
government of -> not match
Abc of -> match
But it doesn't work, and I don't know why. Could you please explain?

You can use a regex with a negative lookahead. For example:
public static void main(String[] args) {
Pattern pattern = Pattern.compile("^(?!state|government|head).*$");
String s = "state of";
Matcher matcher = pattern.matcher(s);
boolean bl = matcher.find();
System.out.println(bl);
s = "government of";
matcher = pattern.matcher(s);
bl = matcher.find();
System.out.println(bl);
s = "Abc of";
matcher = pattern.matcher(s);
bl = matcher.find();
System.out.println(bl);
}
Hope this helps!
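As for why the original attempt fails: Java's lookbehind needs a bounded length, and a `.*` inside `(?<!...)` has no obvious maximum, so depending on the JDK version the pattern is typically rejected when it is compiled. If the intent is "match `of` unless one of the listed words comes right before it", a bounded lookbehind might be enough. The following is only a sketch under that assumption; the class and method names are made up for illustration, and it allows exactly one whitespace character between the word and `of`.

```java
import java.util.regex.Pattern;

class NotPrecededByWord {
    // Assumption: "of" must be a whole word and must not be directly preceded
    // by state/government/head followed by a single whitespace character.
    // Every alternative in the lookbehind has a fixed length, which Java accepts.
    static final Pattern OF_PATTERN =
            Pattern.compile("(?<!\\b(?:state|government|head)\\s)\\bof\\b");

    // Returns true if the input contains an "of" that is allowed by the rule above.
    static boolean hasAllowedOf(String s) {
        return OF_PATTERN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasAllowedOf("state of"));      // false
        System.out.println(hasAllowedOf("government of")); // false
        System.out.println(hasAllowedOf("Abc of"));        // true
    }
}
```

Unlike the lookahead version above, this checks the text immediately before `of` rather than the start of the string, so it also works when the phrase appears in the middle of a longer sentence.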

Related

Regex pattern not working properly on matcher

I have a string like this:
ben[0]='zc5u5';
icb[0]='M';
bild[0]='b1_413134.jpg';
ort[0]='Köln';kmm[0]=0.00074758603074103;alt[0]='18';
jti[0]=413134;
upd[0]='u41313486729.js';
jon[0]=0;
jco[0]=0;
jch[0]=0;
ben[1]='Oukg5';
icb[1]='M';
bild[1]='mannse.jpg';
jti[1]=412425;
upd[1]='u41242570092.js';
jon[1]=0;
jco[1]=0;
jch[1]=0;
ben[2]='Tester356';
icb[2]='M';
bild[2]='b1_247967.jpg';
I want to get the names from ben[]; for example, the first one would be zc5u5.
I currently have this code:
Pattern pattern = Pattern.compile("(ben\\[\\d+\\]=').+?'");
Matcher matcher = pattern.matcher(string);
LinkedList<String> list = new LinkedList<String>();
// Loop through and find all matches and store them into the List
while(matcher.find()) {
list.add(matcher.group());
}
Unfortunately, the pattern matches the whole line instead of just the value (e.g. zc5u5). What am I doing wrong?
You need two groups if you want to capture the index and the value, and I would add support for optional whitespace around the assignment (\\s*). Because the input spans many lines, loop with find() instead of calling matches(), which only succeeds when the entire input matches. Something like:
Pattern pattern = Pattern.compile("ben\\[(\\d+)\\]\\s*=\\s*'(.+?)';");
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.printf("index %s = %s%n", matcher.group(1), matcher.group(2));
}
You can also use a regex like this:
ben\b.+'(.*?)'
Pattern pattern = Pattern.compile("ben\\b.+'(.*?)'");
Matcher matcher = pattern.matcher(string);
// find() locates each ben[...] line; matches() would require the whole input to match
while (matcher.find()) {
System.out.println(matcher.group(1));
}
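To tie this back to the question: `matcher.group()` with no argument returns the whole match, which is why the full `ben[0]='zc5u5'` text ended up in the list. Putting the parentheses around the value and reading `group(1)` returns only the name. A minimal sketch, assuming the input looks like the sample in the question (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BenNames {
    // Collects the quoted value of every ben[n]='...' assignment in the input.
    static List<String> extractNames(String input) {
        // Group 1 wraps only the value between the single quotes.
        Pattern pattern = Pattern.compile("ben\\[\\d+\\]='(.+?)'");
        Matcher matcher = pattern.matcher(input);
        List<String> names = new ArrayList<>();
        while (matcher.find()) {
            names.add(matcher.group(1)); // group(1), not group()
        }
        return names;
    }

    public static void main(String[] args) {
        String input = "ben[0]='zc5u5';\nicb[0]='M';\nben[1]='Oukg5';\nben[2]='Tester356';";
        System.out.println(extractNames(input)); // [zc5u5, Oukg5, Tester356]
    }
}
```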

Java regular expression to validate and extract some values

I want to extract all three parts of the following string in Java
MS-1990-10
The first part should always be 2 letters (A-Z)
The second part should always be a year
The third part should always be a number
Does anyone know how I can do that using Java's regular expressions?
You can do this using Java's Pattern, Matcher, and group syntax:
Pattern datePatt = Pattern.compile("([A-Z]{2})-(\\d{4})-(\\d{2})");
Matcher m = datePatt.matcher("MS-1990-10");
if (m.matches()) {
String g1 = m.group(1);
String g2 = m.group(2);
String g3 = m.group(3);
}
Use Matcher's group() method to get the parts of the input that actually matched.
In Matcher, whatever matches inside parentheses is captured and can be retrieved via the group() method. To use parentheses without capturing the match, use a non-capturing group, (?:xxx).
See also Pattern.
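As a small illustration of the difference, here is a sketch that groups the two-letter prefix without capturing it, so only the year and the number get group indexes. The class and method names are invented for this example, and the `\\d+` for the last part is an assumption about what "a number" means.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns {year, number}, or null if the input does not match.
    static String[] yearAndNumber(String s) {
        // (?:...) groups the prefix without capturing it, so group(1) is the year.
        Matcher m = Pattern.compile("(?:[A-Z]{2})-(\\d{4})-(\\d+)").matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(yearAndNumber("MS-1990-10"))); // [1990, 10]
    }
}
```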
public static void main(String[] args) throws Exception {
String[] lines = { "MS-1990-10", "AA-999-12332", "ZZ-001-000" };
for (String str : lines) {
System.out.println(Arrays.toString(parse(str)));
}
}
private static String[] parse(String str) {
String regex = "";
regex = regex + "([A-Z]{2})";
regex = regex + "[-]";
// regex = regex + "([1-9][0-9]*)"; // any year, no leading zero
regex = regex + "([12][0-9]{3})"; // 1000 - 2999
regex = regex + "[-]";
regex = regex + "([0-9]+)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if (!matcher.matches()) {
return null;
}
String[] tokens = new String[3];
tokens[0] = matcher.group(1);
tokens[1] = matcher.group(2);
tokens[2] = matcher.group(3);
return tokens;
}
This is a way to get all 3 parts with a regex:
public class Test {
public static void main(String... args) {
Pattern p = Pattern.compile("([A-Z]{2})-(\\d{4})-(\\d{2})");
Matcher m = p.matcher("MS-1990-10");
// group(i) throws IllegalStateException unless the match succeeded
if (m.matches()) {
for (int i = 1; i <= m.groupCount(); i++)
System.out.println(m.group(i));
}
}
}
String rule = "^[A-Z]{2}-[1-9][0-9]{3}-[0-9]{2}";
Pattern pattern = Pattern.compile(rule);
Matcher matcher = pattern.matcher(s);
This regex accepts years between 1000 and 9999; adjust the range as needed.

Storing backreferences from regex expression for later use

If I have,
String str = "11";
Pattern p = Pattern.compile("(\\d)\\1");
Matcher m = p.matcher(str);
How do I store the result of \1 for later use? For example, I want to do:
String str = "123123";
Pattern p = Pattern.compile("(\\d)\\1");
Matcher m = p.matcher(str);
String dependantString = //make this whatever was in group 1 of the pattern.
Is that possible?
You need to first call Matcher#find and then Matcher#group(1) like this:
String str = "123123";
Pattern p = Pattern.compile("(\\d+)\\1");
Matcher m = p.matcher(str);
if (m.find())
System.out.println( m.group(1) ); // 123
PS: Your regex also needed some correction to use \\d+ instead of \\d.

Java Pattern matching regex

I am doing a pattern match. matcher.matches() returns false, while matcher.replaceAll() actually finds the pattern and replaces it. Also, matcher.group(1) throws an exception.
@Test
public void testname() throws Exception {
Pattern p = Pattern.compile("<DOCUMENT>(.*)</DOCUMENT>");
Matcher m = p.matcher("<RESPONSE><DOCUMENT>SDFS878SDF87DSF</DOCUMENT></RESPONSE>");
System.out.println("Why is this false=" + m.matches());
String s = m.replaceAll("HEY");
System.out.println("But replaceAll found it="+s);
}
I need the matcher.matches() to return true, and the matcher.group(1) to return
"<DOCUMENT>SDFS878SDF87DSF</DOCUMENT>"
Thanks in advance for the help.
final Pattern pattern = Pattern.compile("<DOCUMENT>(.+?)</DOCUMENT>");
final Matcher matcher = pattern.matcher("<RESPONSE><DOCUMENT>SDFS878SDF87DSF</DOCUMENT></RESPONSE>");
if (matcher.find())
{
System.out.println(matcher.group(1));
// code to replace and inject new value between the <DOCUMENT> tags
}
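The reason behind this: `matches()` only returns true when the entire input matches the pattern, whereas `find()` and `replaceAll()` look for the pattern anywhere inside the input. `group(1)` throws an `IllegalStateException` when no successful match precedes it. If the group should include the tags themselves, as the question asks, the parentheses can be moved outward. A minimal sketch (the class and method names are placeholders):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DocumentTag {
    // The capturing group wraps the tags as well as their content.
    static final Pattern TAG = Pattern.compile("(<DOCUMENT>.+?</DOCUMENT>)");
    static final String INPUT =
            "<RESPONSE><DOCUMENT>SDFS878SDF87DSF</DOCUMENT></RESPONSE>";

    // Returns the first <DOCUMENT>...</DOCUMENT> element, or null if there is none.
    static String firstDocument(String xml) {
        Matcher m = TAG.matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // matches() is false because <RESPONSE> surrounds the element
        System.out.println(TAG.matcher(INPUT).matches()); // false
        System.out.println(firstDocument(INPUT)); // <DOCUMENT>SDFS878SDF87DSF</DOCUMENT>
    }
}
```

If `matches()` really has to return true, one option is to let the pattern consume the surrounding text too, for example by adding `.*` before and after the group.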

RegEx performance issue

I have written a regular expression to validate a name. The name starts with a letter and can be followed by letters, digits, spaces, or underscores.
The regex that I wrote is:
private static final String REGEX = "([a-zA-Z][a-zA-Z0-9 _]*)*";
If the input is: "kasklfhklasdhklghjsdkgsjkdbgjsbdjKg;" the program gets stuck on matcher.matches().
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
System.out.println("Pattern Matches");
} else {
System.out.println("Match Declined");
}
How can I optimize the regex?
Change your regex to:
private static final String REGEX = "[a-zA-Z][a-zA-Z0-9 _]*";
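It will then evaluate the string almost instantly. The original pattern nests one repetition inside another, (...*)*, so when the input cannot match (here because of the trailing ;), the engine tries a huge number of ways to split the text between the inner and outer loops before giving up. This is known as catastrophic backtracking.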
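Here is a short sketch of the flattened pattern in use. The `NameValidator` class and `isValidName` method are names chosen for this example only. One behavioural difference worth knowing: the original pattern with the outer `(...)*` also accepted an empty string, while this one requires at least one letter.

```java
import java.util.regex.Pattern;

class NameValidator {
    // One letter, then any number of letters, digits, spaces, or underscores.
    // No nested quantifier, so a failing input is rejected in linear time.
    static final Pattern NAME = Pattern.compile("[a-zA-Z][a-zA-Z0-9 _]*");

    static boolean isValidName(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The trailing ';' is not allowed, so this is rejected right away.
        System.out.println(isValidName("kasklfhklasdhklghjsdkgsjkdbgjsbdjKg;")); // false
        System.out.println(isValidName("John_Doe 42")); // true
    }
}
```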
