Pattern.COMMENTS always causing Matcher.find to fail

The following code matches the two expressions and prints success.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id : [0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
However, I want white space to not matter, so the following should also print success.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id:[0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
The Pattern.COMMENTS flag is supposed to permit white space, but it causes Failure to be printed. It even causes Failure to be printed if the strings are exactly equivalent including white space, like in the first example. For example,
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String regex = "\\{user_id : [0-9]+\\}";
String string = "{user_id : 0}";
Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS);
Matcher matcher = pattern.matcher(string);
if (matcher.find())
System.out.println("Success.");
else
System.out.println("Failure.");
}
}
Prints Failure.
Why is this happening and how do I make the Pattern ignore white space?

There is a misunderstanding here. Pattern.COMMENTS lets you put extra whitespace into your regex to improve its readability, but that whitespace is ignored when the pattern is compiled and will NOT be matched against the string.
The flag does not make whitespace in the input string match automatically; whitespace in the input still has to be described by the regex.
Example
With Pattern.COMMENTS you can put whitespace in your regex like this
String regex = "\\{ user_id: [0-9]+ \\}";
to improve readability, but it will not match the string
String string = "{user_id : 0}";
because the regex does not account for the whitespace in the string. If you want to use Pattern.COMMENTS, you need to treat the whitespace you want to match specially. Either escape it
String regex = "\\{ user_id\\ :\\ [0-9]+ \\}";
or use the whitespace class
String regex = "\\{ user_id \\s:\\s [0-9]+ \\}";
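To make whitespace in the input optional, which is what the question asks for, one option is to spell it out with \s* wherever it may occur. The following is a minimal sketch, not part of the original answer; the class name CommentsFlagDemo, the helper found and the constant TOLERANT are made up for illustration.

```java
import java.util.regex.Pattern;

public class CommentsFlagDemo {
    // Hypothetical pattern: \s* allows any amount of whitespace (including none)
    // at each position; the plain spaces are only there for readability.
    static final String TOLERANT = "\\{ \\s* user_id \\s* : \\s* [0-9]+ \\s* \\}";

    // Hypothetical helper: compiles the regex with COMMENTS and searches the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex, Pattern.COMMENTS).matcher(input).find();
    }

    public static void main(String[] args) {
        // Unescaped spaces in the pattern are ignored, so this cannot match the spaces in the input.
        System.out.println(found("\\{ user_id : [0-9]+ \\}", "{user_id : 0}")); // false
        System.out.println(found(TOLERANT, "{user_id : 0}")); // true
        System.out.println(found(TOLERANT, "{user_id:0}"));   // true
    }
}
```

The same \s* approach also works without Pattern.COMMENTS if the readability spaces are removed from the pattern.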

Related

How to build the regex to find particular word [duplicate]

I need to find and print a particular word in a String. What regex would you recommend to find "9.1.1_offline" in the following String:
EGA_SAMPLE_APP-iOS-master-<Any word>-200710140849862
Other examples are:
EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862
EGA_SAMPLE_APP-iOS-master-10.2.3_offline-200710140849862

Use the regex, \\d+\\.\\d+\\.\\d+\\_(offline|online)
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
// Test strings
String[] arr = { "EGA_SAMPLE_APP-iOS-master-9.1.1_offline-200710140849862",
"EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862",
"EGA_SAMPLE_APP-iOS-master-10.2.3_offline-200710140849862" };
Pattern pattern = Pattern.compile("\\d+\\.\\d+\\.\\d+\\_(offline|online)");
// Print the matching string
for (String s : arr) {
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
}
Output:
9.1.1_offline
9.2.3_online
10.2.3_offline
Explanation of the regex:
\\d+ specifies one or more digits
\\. specifies a .
\\_ specifies a _ (the backslash is optional here, since _ is not a regex metacharacter)
(offline|online) specifies offline or online.
[Update]
Based on the edited question i.e. find anything between EGA_SAMPLE_APP-iOS-master- and -An_integer_number: Use the regex, EGA_SAMPLE_APP-iOS-master-(.*)-\\d+
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
// Test strings
String[] arr = { "EGA_SAMPLE_APP-iOS-master-9.1.1_offline-200710140849862",
"EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862",
"EGA_SAMPLE_APP-iOS-master-10.2.3_offline-200710140849862",
"EGA_SAMPLE_APP-iOS-master-anything here-200710140849862" };
// Define regex pattern
Pattern pattern = Pattern.compile("EGA_SAMPLE_APP-iOS-master-(.*)-\\d+");
// Print the matching string
for (String s : arr) {
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
}
}
}
Output:
9.1.1_offline
9.2.3_online
10.2.3_offline
anything here
Explanation of the regex:
.* matches any sequence of characters, and the parentheses around it form a capturing group, which is read with group(1) in the code.
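One detail worth knowing about (.*) is that it is greedy, which matters when the captured part itself contains a hyphen followed by digits. The sketch below compares it with the lazy (.*?) on a made-up input; the class GreedyCaptureDemo, the helper capture and the test string are assumptions for illustration, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyCaptureDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up input whose middle part itself contains "-" followed by a digit.
        String s = "EGA_SAMPLE_APP-iOS-master-1-2-200710140849862";
        System.out.println(capture("EGA_SAMPLE_APP-iOS-master-(.*)-\\d+", s));  // 1-2
        System.out.println(capture("EGA_SAMPLE_APP-iOS-master-(.*?)-\\d+", s)); // 1
    }
}
```

The greedy form backtracks to the last hyphen that is followed by digits, so it keeps the whole middle part; the lazy form stops at the first one.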

I can suggest the following one line option using String#replaceAll:
String input = "EGA_SAMPLE_APP-iOS-master-9.2.3_online-200710140849862";
String target = input.replaceAll(".*\\b(\\d+\\.\\d+\\.\\d+_(?:online|offline))\\b.*", "$1");
System.out.println(target);
This prints:
9.2.3_online

Looking for A Regular expression to match java regex (punct) pattern

I am looking for a regex that will match the studentIdMatch2 value in the class below. studentIdMatch1 matches fine. However, the studentId in studentIdMatch2 can contain any special character other than :, ^ and comma, so the pattern does not match it. Thank you for your time; I appreciate your support.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TestRegEx {
public static void main(String args[]){
String studentIdMatch1 = "studentName:harry,^studentId:Id123";
String studentIdMatch2 = "studentName:harry,^studentId:Id-H/MPU/L&T/OA+_T/(1490)/17#)123";
Pattern pattern = Pattern
.compile("(\\p{Punct}?)(\\w+?)(:)(\\p{Punct}?)(\\w+?)(\\p{Punct}?),");
Matcher matcher = pattern.matcher(studentIdMatch1 + ","); // Works Fine(Matches Student Name and Id)
// No Special Characters in StudentId
//Matcher matcher = pattern.matcher(studentIdMatch2 + ","); //Wont work Special Characters in StudentId. Matches Student Name
while (matcher.find()) {
System.out.println("group1 = "+matcher.group(1)+ "group2 = "+matcher.group(2) +"group3 = "+matcher.group(3) +"group4 = "+matcher.group(4)+"group5 = "+matcher.group(5));
}
System.out.println("match ended");
}
}

You may try:
^SutdentName:(\w+),\^StudenId:([^\s,^:]+)$
Explanation of the above regex:
^, $ - Represents start and end of line respectively.
SutdentName: - Matches SutdentName: literally. I think it should be StudentName, but I did not change it.
(\w+) - Represents first capturing group matching only word characters i.e. [A-Za-z0-9_] one or more times greedily.
,\^StudenId: - Matches ,^StudenId literally. Here also I guess it should be StudentId.
([^\s,^:]+) - Represents second capturing group matching everything other than white-space, ,, ^ and : one or more times greedily. You can add others according to your requirements.
Sample Implementation in java:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Main
{
private static final Pattern pattern = Pattern.compile("^SutdentName:(\\w+),\\^StudenId:([^\\s,^:]+)$", Pattern.MULTILINE);
public static void main(String[] args) {
String string = "SutdentName:harry,^StudenId:Id123\n"
+ "SutdentName:harry,^StudenId:Id-H/MNK/U&T/BA+_T/(1490)/17#)123";
Matcher matcher = pattern.matcher(string);
while(matcher.find()){
System.out.println(matcher.group(1) + " " + matcher.group(2));
}
}
}

The second (\\w+?) only matches word characters, so change it to match what you want, i.e.
allow all the special characters other than : and ^ and comma
for example ([^:^,]+?)
^ (as the first character inside the brackets) - negates the character class
:^, - the excluded characters: colon, ^ and comma
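Applied to the pattern from the question, that change could look like the sketch below. Only the second (\\w+?) is replaced; the class name StudentIdDemo and the helper values are made up for illustration, and the input is the studentIdMatch2 string from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StudentIdDemo {
    // The question's pattern with only the second (\w+?) replaced by ([^:^,]+?).
    static final Pattern PATTERN = Pattern.compile(
            "(\\p{Punct}?)(\\w+?)(:)(\\p{Punct}?)([^:^,]+?)(\\p{Punct}?),");

    // Hypothetical helper: collects group 5 (the value after the colon) of every match.
    static List<String> values(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            result.add(m.group(5));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "studentName:harry,^studentId:Id-H/MPU/L&T/OA+_T/(1490)/17#)123";
        // As in the question, a trailing comma is appended so the last pair is terminated.
        System.out.println(values(input + ","));
    }
}
```

For this input the list holds harry and the full student id, including its special characters.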

RegEx for capturing digits from a string

I have this String:
String filename = "20190516.BBARC.GLIND.statistics.xml";
How can I get the first part of the String (the numbers) without using substrings?

Here, we might just want to collect our digits using a capturing group, and if we wish, we could later add more boundaries, maybe with an expression as simple as:
([0-9]+)
For instance, if our desired digits are at the start of our inputs, we might want to add a start char as a left boundary:
^([0-9]+)
Or if our digits are always followed by a ., we can bound it with that:
^([0-9]+)\.
and we can also add a uppercase letter after that to strengthen our right boundary and continue this process, if it might be necessary:
^([0-9]+)\.[A-Z]
RegEx
If this expression wasn't desired, it can be modified or changed in regex101.com.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
// Capture the leading digits in group 1 and consume the rest of the input in group 2
final String regex = "([0-9]+)(.*)";
final String string = "20190516.BBARC.GLIND.statistics.xml";
// Java replacement strings refer to groups with $1, not \1
final String subst = "$1";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
}
}
Demo
const regex = /([0-9]+)(.*)/gm;
const str = `20190516.BBARC.GLIND.statistics.xml`;
const subst = `$1`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);

To extract a part or parts of string using regex I prefer to define groups.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class B {
public static void main(String[] args) {
String in="20190516.BBARC.GLIND.statistics.xml";
Pattern p=Pattern.compile("(\\w+).*");
Matcher m=p.matcher(in);
if(m.matches())
System.out.println(m.group(1));
else
System.out.println("no match");
}
}

Java split by white spaces with condition

I want to split a string on white space. However, if words are enclosed in quotation marks, they should be treated as a single word.
For example, from Word to split I should get word, to, split.
But from
"word to" split I should get "word to", split. The quotation marks remain.

Is that what you want??
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TmpTest {
public static void main(String args[]) {
final String regex = "\".*?\"|\\b\\w+\\b";
final String string = "\"word to\" split i should get \"word to2\", split.";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
}
}
}

Here's how you can achieve this:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String str = "\"word to\" split";
List<String> list = new ArrayList<String>();
Matcher m = Pattern.compile("([^\"]\\S*|\".+?\")\\s*").matcher(str);
while (m.find())
list.add(m.group(1)); // Add .replace("\"", "") to remove surrounding quotes.
System.out.println(list);

Regex find() not true; detect duplicate characters in string

I'm trying to detect if there are any duplicate characters in a string using regex. When I test the pattern and input in an online regex tester, it says find() should be true. But it doesn't work in my program.
I'm using info from: regex to match a word with unique (non-repeating) characters.
What's happening? Am I using regex correctly in Java?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
Pattern pat = Pattern.compile("(.).*\1");
String s = "1112";
Matcher m = pat.matcher(s);
if (m.find()) System.out.println("Matches");
else System.out.println("No Match");
}
}

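The backreference needs to be escaped. In a Java string literal, "\1" is an octal escape for the character U+0001, so the regex engine never sees a backreference. Write "\\1" so that the regex receives \1: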
Pattern pattern = Pattern.compile("(.).*\\1");
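The difference between the two literals can be checked directly. This is a small sketch with a made-up class name BackreferenceDemo and helper hasRepeat, using the input from the question.

```java
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // Hypothetical helper: true if some character occurs more than once in s.
    static boolean hasRepeat(String s) {
        return Pattern.compile("(.).*\\1").matcher(s).find();
    }

    public static void main(String[] args) {
        // "\1" is a single character (U+0001), not a backslash followed by 1.
        System.out.println("\1".length());                                     // 1
        System.out.println(Pattern.compile("(.).*\1").matcher("1112").find()); // false
        System.out.println(hasRepeat("1112")); // true
        System.out.println(hasRepeat("1234")); // false
    }
}
```

Online regex testers take the pattern text as typed, without Java's string-literal processing, which is why the same pattern appeared to work there.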
