Pattern: how to subtract a matched character in a character class?

Is it possible to subtract a matched character in a character class?
The Java docs have examples of character classes with subtraction:
[a-z&&[^bc]] - a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]] - a through z, and not m through p: [a-lq-z] (subtraction)
I want to write a pattern that matches two pairs of word characters when the pairs are not the same:
1) "aaaa123" - should NOT match
2) "aabb123" - should match "aabb" part
3) "aa--123" - should NOT match
I am close to success with the following pattern:
([\w])\1([\w])\2
but of course it does not work in case 1, so I need to subtract the match of the first group. But when I try to do this:
Pattern p = Pattern.compile("([\\w])\\1([\\w&&[^\\1]])\\2");
I get an exception:
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 17
([\w])\1([\w&&[^\1]])\2
^
at java.util.regex.Pattern.error(Pattern.java:1713)
So it seems that subtraction does not work with group references, only with explicitly listed characters. The following pattern compiles with no problems:
Pattern p = Pattern.compile("([\\w])\\1([\\w&&[^a]])\\2");
Is there any other way to write such a pattern?

Use
Pattern p = Pattern.compile("((\\w)\\2(?!\\2))((\\w)\\4)");
Your pairs will be in groups 1 and 3.
This works by using a negative lookahead to make sure that the character following the first pair is different from the character in that pair.
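As a small sketch of how the groups come out, the snippet below runs that pattern over the three inputs from the question. The class and method names (PairGroups, describe) are made up for illustration and are not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairGroups {
    // Pattern from the answer above: group 1 is the first pair, group 3 the second.
    private static final Pattern PAIRS =
            Pattern.compile("((\\w)\\2(?!\\2))((\\w)\\4)");

    // Returns "first|second" for the first match, or null when nothing matches.
    static String describe(String input) {
        Matcher m = PAIRS.matcher(input);
        return m.find() ? m.group(1) + "|" + m.group(3) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aaaa123", "aabb123", "aa--123"}) {
            System.out.println(s + " -> " + describe(s));
        }
    }
}
```

Only the second input should produce a pair of groups; the other two inputs should give null.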

You are using the wrong tool for the job. By all means use a regex to detect pairs of character pairs, but you can just use != to test whether the characters within the pairs are the same. Seriously, there is no reason to do everything in a regular expression - it makes for unreadable, non-portable code and brings you no benefit other than "looking cool".
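A minimal sketch of that approach, assuming the simple pattern from the question is used to find candidate pairs and the comparison is done in plain Java. The names DistinctPairs and find are hypothetical. The search restarts one character after each rejected candidate so that overlapping candidates (as in "aaaabb") are not skipped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DistinctPairs {
    // Two doubled word characters in a row; the pairs may still be identical.
    private static final Pattern PAIRS = Pattern.compile("(\\w)\\1(\\w)\\2");

    // Returns the first "xxyy" substring whose two pairs differ, or null.
    static String find(String input) {
        Matcher m = PAIRS.matcher(input);
        int from = 0;
        while (m.find(from)) {
            // The "are the pairs different?" test is done in code, not in the regex.
            if (!m.group(1).equals(m.group(2))) {
                return m.group();
            }
            from = m.start() + 1; // retry from the next character
        }
        return null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aaaa123", "aabb123", "aa--123", "aaaabb"}) {
            System.out.println(s + " -> " + find(s));
        }
    }
}
```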

Try this
String regex = "(\\w)\\1(?!\\1)(\\w)\\2";
Pattern pattern = Pattern.compile(regex);
(?!\\1) is a negative lookahead; it ensures that the content of group 1 does not follow immediately after the first pair.
My test code:
String s1 = "aaaa123";
String s2 = "aabb123";
String s3 = "aa--123";
String s4 = "123ccdd";
String[] s = { s1, s2, s3, s4 };
String regex = "(\\w)\\1(?!\\1)(\\w)\\2";
// Compile once and reuse the pattern for every input.
Pattern pattern = Pattern.compile(regex);
for (String a : s) {
    Matcher matcher = pattern.matcher(a);
    if (matcher.find()) {
        System.out.println(a + " ==> Success");
    } else {
        System.out.println(a + " ==> Failure");
    }
}
The output
aaaa123 ==> Failure
aabb123 ==> Success
aa--123 ==> Failure
123ccdd ==> Success

Related

Java non-greedy (?) regex to match string

String poolId = "something/something-else/pools[name='test'][scope='lan1']";
String statId = "something/something-else/pools[name='test'][scope='lan1']/stats[base-string='10.10.10.10']";
Pattern pattern = Pattern.compile(".+pools\\[name='.+'\\]\\[scope='.+'\\]$");
What regular expression should be used such that
pattern.matcher(poolId).matches()
returns true whereas
pattern.matcher(statId).matches()
returns false?
Note that
something/something-else is irrelevant and can be of any length
Both name and scope can have ANY character including any of \, /, [, ] etc
stats[base-string='10.10.10.10'] is an example and there can be anything else after /
I tried to use the non-greedy ? like so: .+pools\\[name='.+'\\]\\[scope='.+?'\\]$ but both matches still return true.

You can use
.+pools\[name='[^']*'\]\[scope='[^']*'\]$
See the regex demo. Details:
.+ - any one or more chars other than line break chars as many as possible
pools\[name=' - a pools[name=' string
[^']* - zero or more chars other than a '
'\]\[scope=' - a '][scope=' string
[^']* - zero or more chars other than a '
'\] - a '] substring
$ - end of string.
In Java:
Pattern pattern = Pattern.compile(".+pools\\[name='[^']*']\\[scope='[^']*']$");
See the Java demo:
//String s = "something/something-else/pools[name='test'][scope='lan1']"; // => Matched!
String s = "something/something-else/pools[name='test'][scope='lan1']/stats[base-string='10.10.10.10']";
Pattern pattern = Pattern.compile(".+pools\\[name='[^']*']\\[scope='[^']*']$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()){
System.out.println("Matched!");
} else {
System.out.println("Not Matched!");
}
// => Not Matched!
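For completeness, here is a short sketch that checks both strings from the question with matches(), as the question does. The class name PoolIdCheck and the helper isPoolId are assumptions made for this example.

```java
import java.util.regex.Pattern;

class PoolIdCheck {
    // [^']* keeps each quoted value from running past its closing quote.
    private static final Pattern POOL =
            Pattern.compile(".+pools\\[name='[^']*'\\]\\[scope='[^']*'\\]$");

    static boolean isPoolId(String id) {
        return POOL.matcher(id).matches();
    }

    public static void main(String[] args) {
        String poolId = "something/something-else/pools[name='test'][scope='lan1']";
        String statId = poolId + "/stats[base-string='10.10.10.10']";
        System.out.println(isPoolId(poolId));
        System.out.println(isPoolId(statId));
    }
}
```

The first call should report a match and the second should not, because the trailing /stats[...] part cannot be absorbed by either [^']* segment.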

Wiktor assumed that your values for name and scope cannot have single quotes in them. Thus the following:
.../pools[name='tes't']
would not match. This is really the only workable assumption: if the values could contain unescaped single quotes, nothing would stop the value of scope from being, for example, the literal value lan1']/stats[base-string='10.10.10.10. The regex you included in your question has this issue. If you simply must allow single quotes in these values, you need to escape them somehow. Try the following (an edit of Wiktor's regex), which also accepts a backslash-escaped quote:
.+pools\[name='([^']|\\')*'\]\[scope='([^']|\\')*'\]$

java regex minimum character not working

^[a-zA-Z1-9][a-zA-Z1-9_\\.-]{2,64}[^\\.-]$
This is the regex that should enforce the following conditions:
should start only with letters and numbers,
contains letters, numbers, dot and hyphen,
should not end with a hyphen.
It works for all conditions, but it fails when I try three-character strings like
vu6
111
aaa
With four or more characters the validation works properly. Did I miss anything?

Reason why your regex doesn't work:
Breaking it into smaller pieces should help:
^[a-zA-Z1-9][a-zA-Z1-9_\\.-]{2,64}[^\\.-]$
[a-zA-Z1-9]: matches exactly one letter or digit 1-9 (0 and _ are not included)
[a-zA-Z1-9_\\.-]{2,64}: matches 2 to 64 characters, each a letter, a digit 1-9, "_", "." or "-"
[^\\.-]: matches exactly one character, which must not be "." or "-"
So the pattern needs at least 1 + 2 + 1 = 4 characters, which is why three-character strings are rejected.
Solution:
You can use two simple regexes.
This answer assumes that the length of the string you want to match is between 3 and 65 (both inclusive).
The first validates the string itself:
[a-zA-Z1-9][a-zA-Z1-9_\\.-]{2,64}
The second checks that the string doesn't end with "." or "-":
[^\\.-]$
In Java
Pattern pattern1 = Pattern.compile("^[a-zA-Z1-9][a-zA-Z1-9_\\.-]{2,64}$");
Pattern pattern2 = Pattern.compile("[^\\.-]$");
Matcher m1 = pattern1.matcher(input);
Matcher m2 = pattern2.matcher(input);
if(m1.find() && m2.find()) {
System.out.println("found");
}
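If a single expression is preferred, the two checks can probably be folded together with a negative lookbehind. This is only a sketch: the lookbehind variant is not part of the answer above, the class name UserNameCheck is hypothetical, and the 1-9 digit range is kept exactly as written in the question.

```java
import java.util.regex.Pattern;

class UserNameCheck {
    // 3 to 65 characters in total; (?<![.-]) rejects a trailing "." or "-".
    private static final Pattern VALID =
            Pattern.compile("^[a-zA-Z1-9][a-zA-Z1-9_.-]{2,64}(?<![.-])$");

    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"vu6", "111", "aaa", "ab-", "ab"}) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

With this form the three-character inputs from the question are accepted, while strings that are too short or end in "." or "-" are rejected.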

What regex should I use to check a string only has numbers and 2 special characters ( - and , ) in Java?

Scenario: I want to check whether string contains only numbers and 2 predefined special characters, a dash and a comma.
My string contains numbers (0 to 9) and 2 special characters: a dash (-) defines a range and a comma (,) defines a sequence.
What I tried:
I tried the following regex, [0-9+-,]+, but it does not work as expected.
Possible inputs:
1-5
1,5
1-5,6
1,3,5-10
1-5,6-10
1,3,5-7,8,10
The regex should not accept these types of strings:
-----
1--4
,1,5
5,6,
5,4,-
5,6-
-5,6
Can anyone help me create a regex for the above scenario?

You may use
^\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*$
See the regex demo
Regex details:
^ - start of string
\d+ - 1 or more digits
(?:-\d+)? - an optional sequence of - and 1+ digits
(?:,\d+(?:-\d+)?)* - zero or more sequences of:
, - a comma
\d+(?:-\d+)? - same pattern as described above
$ - end of string.
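The pattern can be tried against the inputs listed in the question with a short Java sketch like the one below. The class and method names (RangeListValidator, isValid) are invented for this example.

```java
import java.util.regex.Pattern;

class RangeListValidator {
    // A number or range, followed by any count of ",number" or ",range" items.
    private static final Pattern RANGE_LIST =
            Pattern.compile("^\\d+(?:-\\d+)?(?:,\\d+(?:-\\d+)?)*$");

    static boolean isValid(String input) {
        return RANGE_LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1-5", "1,5", "1-5,6", "1,3,5-10", "1-5,6-10", "1,3,5-7,8,10",
            "-----", "1--4", ",1,5", "5,6,", "5,4,-", "5,6-", "-5,6"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first six samples are the accepted inputs from the question; the remaining seven are the ones it says must be rejected.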

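Change your regex [0-9+-,]+ to [0-9,-]+. In the original, +-, is read as a range from + to , so the hyphen itself is never allowed; putting - last in the class makes it a literal. Note that this only restricts which characters may appear, so it still accepts strings such as ----- or 1--4.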
final String patternStr = "[0-9,-]+";
final Pattern p = Pattern.compile(patternStr);
String data = "1,3,5-7,8,10";
final Matcher m = p.matcher(data);
if (m.matches()) {
System.out.println("SUCCESS");
}else{
System.out.println("ERROR");
}

Matcher cannot recognize the second group of regular expression in java

I've got a problem when using Matcher to find a symbol from a group in a regular expression: it cannot recognize the second group. Maybe the code below makes it clear:
public void set(String n){
String pat = "(\\d+)[!##$%^&*()_+-=}]";
Pattern r;
r = Pattern.compile(pat);
System.out.println(r);
Matcher m;
m = r.matcher(n);
if (m.find()) {
JOptionPane.showMessageDialog(null,
"Not a correct form", "ERROR_NAME_MATCH", 0);
}else{
name = n;
}
}
After running the code, the first group is recognized but the second one, [!##$%^&*()_+-=}], is not. I'm quite sure that the expression is correct; I've checked it with RegexBuddy. There must be a problem with concatenating two or more groups in one line.
Thank you for your help.

Your regex, (\d+)[!##$%^&*()_+-=}], matches a sequence of one or more digits followed by a symbol from the specified set. Note also that +-= inside the class is a range from + to =, not three literal characters.
You want to test a string and return true if a single character from the specified set is present in the string.
So, just move \d into the character class and move the - to the end of the class, where it is a literal hyphen:
String pat = "[\\d!##$%^&*()_+=}-]";
If you need to match a digit or special char, use
String pat = "\\d|[!##$%^&*()_+=}-]";
If you need both irrespective of the order:
String pat = "^(?=\\D*\\d)(?=[^!##$%^&*()_+=}-]*[!##$%^&*()_+=}-])";
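As a rough sketch of the first variant in use, the check below reports whether a name contains a digit or one of the listed symbols. The character set is copied as-is from the answer; the class name NameCheck and the method hasDigitOrSymbol are assumptions for illustration.

```java
import java.util.regex.Pattern;

class NameCheck {
    // Digit or any listed symbol; the trailing '-' is a literal hyphen.
    private static final Pattern FORBIDDEN =
            Pattern.compile("[\\d!##$%^&*()_+=}-]");

    // find() is enough here: one offending character anywhere rejects the name.
    static boolean hasDigitOrSymbol(String name) {
        return FORBIDDEN.matcher(name).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"John", "John1", "Jo-hn", "Jo hn"}) {
            System.out.println(s + " -> " + hasDigitOrSymbol(s));
        }
    }
}
```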

Java Regex to match repeated keywords

I need to filter a document if the caption has the same surname on both sides (e.g., Smith Vs Smith or John Vs John).
I am converting the entire document into a string and validating that string against a regular expression.
Could anyone help me write a regular expression for the above case?

Backreferences.
Example: (\w+) Vs \1
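In Java this could look roughly like the sketch below. The word boundaries (\b) are an addition that is not in the answer; they are there so that a caption like "Smith Vs Smithson" is not treated as a repeat. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class SameSurname {
    // \1 must repeat exactly what group 1 captured; \b keeps the match on whole words.
    private static final Pattern SAME =
            Pattern.compile("\\b(\\w+) Vs \\1\\b");

    static boolean hasSameSurname(String document) {
        return SAME.matcher(document).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Smith Vs Smith", "John Vs John", "Smith Vs Jones"}) {
            System.out.println(s + " -> " + hasSameSurname(s));
        }
    }
}
```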

If I have understood your question correctly, you have a string like "X Vs Y" (where X and Y are two names) and you want to know whether X equals Y.
In this case, a simple (\w+) regex can do it:
String input = "Smith Vs Smith";
// Build the Regex
Pattern p = Pattern.compile("(\\w+)");
Matcher m = p.matcher(input);
// Store the matches in a list
List<String> str = new ArrayList<String>();
while (m.find()) {
if (!m.group().equals("Vs"))
{
str.add(m.group());
}
}
// Test the matches
if (str.size()>1 && str.get(0).equals(str.get(1)))
System.out.println(" The Same ");
else System.out.println(" Not the Same ");

(\w+).*\1
This means: a word of one or more characters, captured as group 1, followed by anything, and then followed by whatever group 1 matched.
In more detail: it uses grouping (putting part of the regex in parentheses) and a reference to a group defined in the expression (\1 does that here).
Example:
String s = "Stewie is a good guy. Stewie does no bad things";
Pattern.compile("(\\w+).*\\1").matcher(s).find(); // true, and group 1 is the duplicated word (note the additional Java escaping)
