Print out the string that matched my regular expression in Java?

Possible duplicate: Print regex matches in java
I am using the Matcher class in Java to match a string against a particular regular expression, which I converted into a Pattern using the Pattern class. I know my regex works because when I call Matcher.find(), I get true where I am supposed to. But I want to print out the strings that are producing those true values (that is, the strings that match my regex), and I don't see a method in the Matcher class to achieve that. Please let me know if anyone has encountered this problem before. I apologize, as this question is fairly rudimentary, but I am new to regex and am still finding my way around.

Assuming m is your matcher:
m.group() will return the matched string, once find() (or matches()) has returned true.
[EDIT] Added info regarding matched groups
Also, if your regex has portions inside parentheses, m.group(n) will return the string that matches the nth parenthesized group:
Pattern p = Pattern.compile("mary (.*) bob");
Matcher m = p.matcher("since that day mary loves bob");
m.find(); // a match must be attempted first; otherwise group() throws IllegalStateException
m.group() returns "mary loves bob".
m.group(1) returns "loves".
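To print every string that made find() return true, one option is to loop over find() and read group() each time. The following is a minimal sketch of that idea; the class name PrintMatches, the helper allMatches, and the sample input are illustrative and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrintMatches {
    // Collects every substring of input that the regex matches, in order of appearance.
    static List<String> allMatches(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {          // each successful find() moves on to the next match
            result.add(m.group());  // group() is the text matched by that find()
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, chosen only to show several matches in one input.
        for (String s : allMatches("\\d+", "order 12, item 7, qty 300")) {
            System.out.println(s);
        }
    }
}
```

Returning a list instead of printing inside the loop keeps the helper reusable; the caller decides what to do with the matches.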

Related

Regex will quit if the first character does not match the pattern, even if there should be a match later [duplicate]

This question already has answers here:
Difference between matches() and find() in Java Regex
(5 answers)
Closed 3 years ago.
I am using regex in java, and I cannot create a regex to match what I want it to. I want to match everything in a string that begins and ends with a character.
"cats-are-cute" should match and return cats-are-cute
!!!DOG-CAT!!! should match and return DOG-CAT
I am using https://regexr.com/ to test, and it says my regex should work
I'm not even sure how I should attempt to fix this. I've found out that it will quit if the very first character does not match (i.e. it is a special character), but it will match if the entire string begins and ends with a matching character.
It will not match if a special character begins or ends the entire string.
Here is my code:
Pattern pattern = Pattern.compile("([A-Za-z0-9].*[A-Za-z0-9])");
Matcher matcher = pattern.matcher(word);
if(matcher.matches())
{
System.out.println("Matches");
System.out.println(matcher.start());
System.out.println(matcher.end());
}
if I type
testing
it returns
Matches
0
7
just like it should. (Small question: why is it 7 and not 6?)
But if I type "testing", with the quotation marks as part of the input, matcher.matches() is false.
I think it should output
Matches
1
7
but sadly it does not as matcher.matches() returns false.
I think my regex is working, because quite a few sites have said that my regex will match what I want it to.
Am I missing something with Matcher matches()? Does it not do what I think it does?
I just needed to use find instead of matches, as OH GOD SPIDERS suggested in this comment:
As the documentation of Matcher.matches states, it "attempts to match the entire region against the pattern". You need to use Matcher.find if you don't want your entire String to be matched.
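The difference can be sketched with the pattern from the question. The class and method names below are made up for illustration. The sketch also touches the side question: end() returns the index just past the last matched character, which is why the unquoted input testing reports 7 rather than 6.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindVsMatches {
    // Pattern taken from the question.
    private static final Pattern P = Pattern.compile("([A-Za-z0-9].*[A-Za-z0-9])");

    // matches() succeeds only if the whole input fits the pattern.
    static boolean wholeInputMatches(String word) {
        return P.matcher(word).matches();
    }

    // find() looks for the pattern anywhere in the input.
    // Returns "start,end" of the first match, or "none".
    static String span(String word) {
        Matcher m = P.matcher(word);
        return m.find() ? m.start() + "," + m.end() : "none";
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("\"testing\"")); // false: the quotes are not alphanumeric
        System.out.println(span("\"testing\""));              // 1,8
        System.out.println(span("testing"));                  // 0,7 (end is exclusive)
        System.out.println(span("!!!DOG-CAT!!!"));            // 3,10
    }
}
```

Because end() is exclusive, input.substring(m.start(), m.end()) yields exactly the matched text, the same string that m.group() returns.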

Regular Expression always returns false

I have a problem getting a regular expression to work.
I use an XMLRPC library to get information from a wiki.
So far, so good.
After retrieving the data into a String variable, I would like to search through it with a regular expression, but the matcher always returns false.
But if I ask the String ....contains("xyz"), the answer is true.
The String looks something like this:
====== Datensicherheit ====== ''Kriterium von Sicherheit'' Typ: technisch Definition: \ //Allgemein.........
String regex = "Definition";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(text);
System.out.println(matcher.matches());
Does anybody know what I'm doing wrong?
This is an issue with your regular expression. If you want to know whether the string contains "Definition" and still use matches(), your regex needs to be:
String regex = ".*Definition.*";
Note that matches() returns true if, and only if, the entire region sequence matches this matcher's pattern. See the Javadoc at https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#matches()
So, it will only be true if the entire "text" region matches "Definition", which is unlikely :).
Try find() instead, which is true if, and only if, a subsequence of the input sequence matches this matcher's pattern.
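One caveat with the ".*Definition.*" approach: by default "." does not match line terminators, so if the wiki text spans several lines, matches() can still return false unless Pattern.DOTALL is set. A small sketch comparing the three variants follows; the class name, method names, and the two-line sample text are assumptions made for illustration.

```java
import java.util.regex.Pattern;

final class ContainsDefinition {
    // Searches for the literal anywhere in the text.
    static boolean viaFind(String text) {
        return Pattern.compile("Definition").matcher(text).find();
    }

    // Whole-input match; "." stops at line breaks by default.
    static boolean viaMatches(String text) {
        return text.matches(".*Definition.*");
    }

    // Whole-input match with DOTALL, so "." also matches line breaks.
    static boolean viaMatchesDotAll(String text) {
        return Pattern.compile(".*Definition.*", Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "Typ: technisch\nDefinition: Allgemein"; // hypothetical multi-line input
        System.out.println(viaFind(text));          // true
        System.out.println(viaMatches(text));       // false
        System.out.println(viaMatchesDotAll(text)); // true
    }
}
```

For a plain substring test like this one, String.contains is simpler still; find() becomes useful once the search term is a real pattern.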

regex to find matches in a multiline string in Java

I was trying to use a regex to find some matches in a string in Java. The actual regex is
^(interface \X*!)
When I write it in Java I use
^(interface \\X*!)
Now this throws "Illegal/unsupported escape sequence near index 13". I searched the boards a little and found that it should actually be four backslashes to make it work. But if I use
^(interface \\\\X*!)
it returns no matches. Any pointers would be really helpful.
Just a sample match would be like
interface ABC
temp
abc
xyz
!
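The \X construct comes from Perl, and the Javadoc for java.util.regex.Pattern (up to Java 8) explicitly states in the section Comparison to Perl 5 that it is not supported. Java 9 and later accept \X, but there it matches a Unicode extended grapheme cluster. With four backslashes the regex contains an escaped backslash followed by X*, so it looks for a literal backslash, which is why nothing matches.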
In Java, you have to use a different construct. But this part is already answered in https://stackoverflow.com/a/39561579.
In order to match the pattern you identify in the comments, using Java, something like this should work:
Pattern p = Pattern.compile("interface[^!]*!", Pattern.DOTALL);
Matcher m = p.matcher("interface ABC\ntemp\nabc\nxyz\n!"); // your test string
if (m.matches()) {
    System.out.println(m.group()); // prints the whole matched block
}
This pattern matches any string beginning with "interface", followed by zero or more of any character except "!", followed by "!".
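Pattern.DOTALL tells it that in addition to all other characters, "." should also match carriage returns and line feeds. The pattern above contains no ".", and a negated class such as [^!] already matches line terminators, so the flag is harmless here but not required. See the Javadoc for Pattern.DOTALL for more info.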
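If the input holds several such blocks surrounded by other lines, matches() will not help, because it needs the whole input to be one block. A hedged sketch using find() with Pattern.MULTILINE follows; the class name, the helper, and the sample configuration text are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InterfaceBlocks {
    // MULTILINE lets ^ match at the start of every line, not only at the start of the input.
    // [^!]* already spans line breaks, so DOTALL is not needed.
    private static final Pattern BLOCK =
            Pattern.compile("^interface[^!]*!", Pattern.MULTILINE);

    // Returns every "interface ... !" block found in the text.
    static List<String> blocks(String config) {
        List<String> found = new ArrayList<>();
        Matcher m = BLOCK.matcher(config);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical input with two blocks and an unrelated first line.
        String config = "hostname r1\ninterface ABC\ntemp\nabc\nxyz\n!\ninterface DEF\nfoo\n!\n";
        for (String block : blocks(config)) {
            System.out.println(block);
            System.out.println("----");
        }
    }
}
```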

Whitespace in Java's regular expression

I'm trying to write a regular expression to match an IRC PRIVMSG string. It is something like:
:nick!name#some.host.com PRIVMSG #channel :message body
So I wrote the following code:
Pattern pattern = Pattern.compile("^:.*\\sPRIVMSG\\s#.*\\s:");
Matcher matcher = pattern.matcher(msg);
if(matcher.matches()) {
System.out.println(msg);
}
It does not work: I get no matches. When I test the regular expression using online JavaScript testers, I do get matches.
I tried to find the reason, why it doesn't work and I found that there's something wrong with the whitespace symbol. The following pattern will give me some matches:
Pattern.compile("^:.*");
But the pattern with \s will not:
Pattern.compile("^:.*\\s");
It's confusing.
The Java matches method strikes again! That method only returns true if the entire string matches the pattern. You didn't include anything that captures the message body after the second colon, so the entire string is not a match. It works in testers because a 'normal' regex counts as a 'match' if any part of the input matches.
Pattern pattern = Pattern.compile("^:.*?\\sPRIVMSG\\s#.*?\\s:.*$");
Should match
If you look at the documentation for matches(), you will notice that it is trying to match the entire string. You need to fix your regexp or use find() to iterate through the substring matches.
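Both suggestions can be sketched side by side. The capturing-group pattern below is a hypothetical variant written for this example, not the pattern from either answer, and the class and method names are illustrative as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrivmsgParser {
    // Hypothetical pattern: sender prefix, channel and message body as groups 1 to 3.
    private static final Pattern PRIVMSG =
            Pattern.compile("^:(\\S+)\\sPRIVMSG\\s(#\\S+)\\s:(.*)$");

    // Returns {prefix, channel, body}, or null if the line is not a PRIVMSG.
    static String[] parse(String line) {
        Matcher m = PRIVMSG.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    // The pattern from the question, unchanged, applied with find() instead of matches().
    static boolean originalPatternFinds(String line) {
        return Pattern.compile("^:.*\\sPRIVMSG\\s#.*\\s:").matcher(line).find();
    }

    public static void main(String[] args) {
        String msg = ":nick!name#some.host.com PRIVMSG #channel :message body";
        System.out.println(originalPatternFinds(msg)); // true
        String[] parts = parse(msg);
        System.out.println(parts[1] + " -> " + parts[2]); // #channel -> message body
    }
}
```

Using \S+ instead of .* for the prefix and channel keeps those groups from swallowing spaces, which makes the split between channel and body unambiguous.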

How to find the exact word using a regex in Java?

Consider the following code snippet:
String input = "Print this";
System.out.println(input.matches("\\bthis\\b"));
Output
false
What could possibly be wrong with this approach? If it is wrong, what is the right way to find an exact word match?
PS: I have found a variety of similar questions here but none of them provide the solution I am looking for.
Thanks in advance.
When you use the matches() method, it is trying to match the entire input. In your example, the input "Print this" doesn't match the pattern because the word "Print" isn't matched.
So you need to add something to the regex to match the initial part of the string, e.g.
.*\\bthis\\b
And if you want to allow extra text at the end of the line too:
.*\\bthis\\b.*
Alternatively, use a Matcher object and use Matcher.find() to find matches within the input string:
Pattern p = Pattern.compile("\\bthis\\b");
Matcher m = p.matcher("Print this");
m.find();
System.out.println(m.group());
Output:
this
If you want to find multiple matches in a line, you can call find() and group() repeatedly to extract them all.
Full example method using String.matches():
public static final String REGEX_FIND_WORD="(?i).*?\\b%s\\b.*?";
public static boolean containsWord(String text, String word) {
String regex=String.format(REGEX_FIND_WORD, Pattern.quote(word));
return text.matches(regex);
}
Explanation:
(?i) - ignorecase
.*? - allow (optionally) any characters before
\b - word boundary
%s - variable to be changed by String.format (quoted to avoid regex errors)
\b - word boundary
.*? - allow (optionally) any characters after
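The same check can be written with Matcher.find(), which removes the need for the leading and trailing .*? and also keeps working when the text contains line breaks (where "." would stop). This is a sketch under that assumption; the class name WordSearch is illustrative.

```java
import java.util.regex.Pattern;

final class WordSearch {
    // True if word occurs in text as a whole word, ignoring case.
    static boolean containsWord(String text, String word) {
        // Pattern.quote treats the word as a literal, so regex metacharacters in it are harmless.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Print this", "THIS"));             // true
        System.out.println(containsWord("Print thistle", "this"));          // false
        System.out.println(containsWord("first line\nPrint this", "this")); // true
    }
}
```

Note that \b only behaves as expected when the word starts and ends with a word character (letter, digit or underscore).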
For a good explanation, see: http://www.regular-expressions.info/java.html
myString.matches("regex") returns true or false depending whether the
string can be matched entirely by the regular expression. It is
important to remember that String.matches() only returns true if the
entire string can be matched. In other words: "regex" is applied as if
you had written "^regex$" with start and end of string anchors. This
is different from most other regex libraries, where the "quick match
test" method returns true if the regex can be matched anywhere in the
string. If myString is abc then myString.matches("bc") returns false.
bc matches abc, but ^bc$ (which is really being used here) does not.
This writes "true":
String input = "Print this";
System.out.println(input.matches(".*\\bthis\\b"));
You may use groups to find the exact word. The regex API delimits capturing groups with parentheses. For example:
A(B(C))D
This expression has two capturing groups; group 0 always stands for the whole match, so the indexes run from 0 to 2.
0th group - ABCD
1st group - BC
2nd group - C
So if you need to find some specific word, you can use two methods of the Matcher class: find() to locate the text described by the regex, and group(n) to get the String captured by the group with that number:
String statement = "Hello, my beautiful world";
Pattern pattern = Pattern.compile("Hello, my (\\w+).*");
Matcher m = pattern.matcher(statement);
m.find();
System.out.println(m.group(1));
The above code prints "beautiful".
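To see the group numbering in action, the following sketch lists group 0 through groupCount() for the first match. The class name GroupList and the sample input xxABCDxx are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupList {
    // Returns group 0 (the whole match) followed by each capturing group of the first match.
    // The list is empty when the regex is not found in the input.
    static List<String> groups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            // groupCount() counts capturing groups only; group 0 is not included in the count.
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(groups("A(B(C))D", "xxABCDxx")); // [ABCD, BC, C]
        System.out.println(groups("Hello, my (\\w+).*", "Hello, my beautiful world"));
    }
}
```

Groups are numbered by the position of their opening parenthesis, counted from left to right, which is why the outer (B(C)) is group 1 and the inner (C) is group 2.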
Is your search string going to be a regular expression? If not, simply use String.contains(CharSequence s).
System.out.println(input.matches(".*\\bthis$"));
This also works. Here the .* matches everything before the word (including the space), and then "this" must appear as a whole word at the end of the input.
