Finding a Match using java.lang.String.matches() - java

I have a String that contains newline characters, say:
str = "Hello\n"+"Batman,\n" + "Joker\n" + "here\n"
I want to know how to check for the existence of a particular word, say Joker, in the string str using java.lang.String.matches().
I find that str.matches(".*Joker.*") returns false, and returns true if I remove the newline characters. So what regular expression should be passed to str.matches()?
One way is: str.replaceAll("\\n","").matches(".*Joker.*");

The problem is that the dot in .* does not match newlines by default. If you want newlines to be matched, your regex must have the flag Pattern.DOTALL.
If you want to embed that in a regex used in .matches() the regex would be:
"(?s).*Joker.*"
However, note that this will match Jokers too. A regex does not have the notion of words. Your regex would therefore really need to be:
"(?s).*\\bJoker\\b.*"
However, a regex does not need to match all its input text (which is what .matches() does, counterintuitively), only what is needed. Therefore, this solution is even better, and does not require Pattern.DOTALL:
Pattern p = Pattern.compile("\\bJoker\\b"); // \b is the word anchor
p.matcher(str).find(); // returns true
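As a minimal sketch of the three variants discussed above, the following compares plain matches(), matches() with the embedded (?s) flag, and a whole-word find(). The class name JokerMatchDemo and the helper containsWord are made up for illustration; Pattern.quote is used so the word is treated literally.

```java
import java.util.regex.Pattern;

class JokerMatchDemo {
    // Whole-word search with find(); no DOTALL is needed because the pattern contains no dot.
    static boolean containsWord(String text, String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();
    }

    public static void main(String[] args) {
        String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";
        System.out.println(str.matches(".*Joker.*"));            // false: the dot stops at '\n'
        System.out.println(str.matches("(?s).*Joker.*"));        // true: (?s) lets the dot cross lines
        System.out.println(containsWord(str, "Joker"));          // true
        System.out.println(containsWord("Two Jokers", "Joker")); // false: no word boundary after "Joker"
    }
}
```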

You can do something much simpler; this is a contains. You do not need the power of regex:
public static void main(String[] args) throws Exception {
final String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";
System.out.println(str.contains("Joker"));
}
Alternatively you can use a Pattern and find:
public static void main(String[] args) throws Exception {
final String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";
final Pattern p = Pattern.compile("Joker");
final Matcher m = p.matcher(str);
if (m.find()) {
System.out.println("Found match");
}
}

You want to use a Pattern that uses the DOTALL flag, which says that a dot should also match new lines.
String str = "Hello\n"+"Batman,\n" + "Joker\n" + "here\n";
Pattern regex = Pattern.compile(".*Joker.*", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(str);
if (regexMatcher.find()) {
// found a match
}
else
{
// no match
}

Related

text wrongly matches with sub string of words in group

I want to check whether the text starts with what or who and is a question, so I wrote the following code:
private static void startWithQOrIf(String commentstr){
String urlPattern = "(|who|what).*\\?.*$";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.find()) {
System.out.println("yes");
}
}
Everything works well, but for example when I try:
whooooooooo is the follower?
it matches as well, but it should not, because I am looking for who, not whooooooooo.
Any idea?
You can ensure a whole word using a word boundary \b:
(|who|what)\\b.*\\?.*$
^^
If the words in the alternation group are supposed to appear at the start of the string, you can just use matches and remove $ anchor:
String urlPattern = "(|who|what)\\b.*\\?.*";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.matches()) { // < - Here, matches is used
System.out.println("yes");
}
Note that (|who|what) matches either an empty string, or who, or what. If you do not plan to allow empty string, use just (who|what).
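A small sketch of the matches() variant, assuming empty prefixes are not wanted (so the empty alternative is dropped). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class QuestionStartDemo {
    // (who|what) must be followed by a word boundary, then a '?' must appear somewhere later.
    private static final Pattern QUESTION =
            Pattern.compile("(who|what)\\b.*\\?.*", Pattern.CASE_INSENSITIVE);

    // matches() anchors the pattern at both ends, so the word must start the string.
    static boolean startsWithWhoOrWhat(String comment) {
        return QUESTION.matcher(comment).matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithWhoOrWhat("Who is the follower?"));
        System.out.println(startsWithWhoOrWhat("whooooooooo is the follower?"));
    }
}
```

The second call is rejected because there is no word boundary between the third and fourth letters of "whooooooooo".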
You must use word boundaries.
String urlPattern = "\\b(who|what)\\b.*\\?.*$";

Print out the last match of a regex

I have this code:
String responseData = "http://xxxxx-f.frehd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/.m3u8";
String pattern = "^(https://.*\\.54325)$";
Pattern pr = Pattern.compile(pattern);
Matcher math = pr.matcher(responseData);
if (math.find()) {
// print the url
}
else {
System.out.println("No Math");
}
I want to print out the last string that starts with http and ends with .m3u8. How do I do this? I'm stuck. All help is appreciated.
The problem I have now is that when I find a match and want to print out the string, I get everything from responseData.
In case you need to get some substring at the end that is preceded by similar substrings, you need to make sure the regex engine has already consumed as many characters before your required match as possible.
Also, you have a ^ in your pattern that means beginning of a string. Thus, it starts matching from the very beginning.
You can achieve what you want with just lastIndexOf and substring:
System.out.println(str.substring(str.lastIndexOf("http://")));
Or, if you need a regex, you'll need to use
String pattern = ".*(http://.*?\\.m3u8)$";
and use math.group(1) to print the value.
Sample code:
import java.util.regex.*;
public class HelloWorld{
public static void main(String []args){
String str = "http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_0_av.m3u8" +
"EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2795000,RESOLUTION=1280x720,CODECS=avc1.64001f, mp4a.40.2" +
"http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8";
String rx = ".*(http://.*?\\.m3u8)$";
Pattern ptrn = Pattern.compile(rx);
Matcher m = ptrn.matcher(str);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
Output:
http://xxxxx-f.akamaihd.net/i/world/open/20150426/1370235-005A/EPISOD-1370235-005A-016f1729028090bf_,892,144,252,360,540,1584,2700,.mp4.csmil/index_6_av.m3u8
Also tested on RegexPlanet
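The two approaches from this answer can be put side by side as helper methods. This is only a sketch: the class name, the method names, and the example.invalid URLs are invented for illustration, and returning null on no match is an arbitrary choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastUrlDemo {
    // The greedy leading .* pushes the captured group as far right as possible.
    private static final Pattern LAST_URL = Pattern.compile(".*(http://.*?\\.m3u8)$");

    // Index-based variant: everything from the last "http://" to the end of the input.
    static String lastUrlByIndex(String data) {
        int start = data.lastIndexOf("http://");
        return start < 0 ? null : data.substring(start);
    }

    // Regex variant: additionally requires the tail to end in ".m3u8".
    static String lastUrlByRegex(String data) {
        Matcher m = LAST_URL.matcher(data);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String data = "http://example.invalid/a.m3u8#EXT http://example.invalid/b.m3u8";
        System.out.println(lastUrlByIndex(data));
        System.out.println(lastUrlByRegex(data));
    }
}
```

Note that the index-based variant does not check the .m3u8 suffix, and the regex variant will not cross line breaks unless Pattern.DOTALL is added.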

pattern matching to detect special characters in a word

I am trying to identify any special characters ('?', '.', ',') at the end of a string in java. Here is what I wrote:
public static void main(String[] args) {
Pattern pattern = Pattern.compile("{.,?}$");
Matcher matcher = pattern.matcher("Sure?");
System.out.println("Input String matches regex - "+matcher.matches());
}
This returns a false when it's expected to be true. Please suggest.
Use "sure?".matches(".*[.,?]").
String#matches(...) auto-anchors the regex with ^ and $, so there is no need to add them manually.
This is your code:
Pattern pattern = Pattern.compile("{.,?}$");
Matcher matcher = pattern.matcher("Sure?");
System.out.println("Input String matches regex - "+matcher.matches());
You have 2 problems:
You're using { and } instead of character class [ and ]
You're using Matcher#matches() instead of Matcher#find. matches method matches the full input line while find performs a search anywhere in the string.
Change your code to:
Pattern pattern = Pattern.compile("[.,?]$");
Matcher matcher = pattern.matcher("Sure?");
System.out.println("Input String matches regex - " + matcher.find());
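To make the difference between the two fixes concrete, here is a sketch with both forms wrapped in hypothetical helper methods: find() with an explicit $ anchor, and String.matches() with a leading .* to absorb the rest of the input.

```java
import java.util.regex.Pattern;

class TrailingPunctuationDemo {
    private static final Pattern TRAILING = Pattern.compile("[.,?]$");

    // find() searches anywhere, so the $ anchor is what pins the class to the end.
    static boolean endsWithPunctuationFind(String s) {
        return TRAILING.matcher(s).find();
    }

    // matches() must consume the whole input, so .* absorbs everything before the last character.
    static boolean endsWithPunctuationMatches(String s) {
        return s.matches(".*[.,?]");
    }

    public static void main(String[] args) {
        System.out.println(endsWithPunctuationFind("Sure?"));
        System.out.println(endsWithPunctuationMatches("Sure?"));
        System.out.println(endsWithPunctuationFind("Sure"));
    }
}
```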
Try this
Pattern pattern = Pattern.compile(".*[.,?]");
...

How to find the word with dot using regex in Java?

I am new to Java. I want to search for a string in a text file. Suppose the file contains:
Hi, I am learning Java.
I am using the pattern below to search for every exact word.
Pattern p = Pattern.compile("\\b" + searchString + "\\b", Pattern.CASE_INSENSITIVE);
It works fine, but it doesn't find "java." How can I find both forms, i.e. the word with boundary symbols and the word with "." at the end? Does anyone have any ideas on how I can solve this problem?
You should escape the dot in your search string so that it becomes a literal dot in the regex: \\.. Note that an unescaped dot is a metacharacter in regular expressions and matches any character. For example, you can replace all the dots in your String with \\.
If you don't want to do all that job, then just send java\\. as your search string
More info:
Using Regular Expressions in Java
Java Regex Tutorial
Java Regular Expressions
Code example:
public static void main(String[] args) {
String fileContent = "Hi i am learning java.";
String searchString = "java";
Pattern p = Pattern.compile(searchString);
Matcher m = p.matcher(fileContent );
while(m.find()) {
System.out.println(m.start() + " " + m.group());
}
}
It would print: 17 java
public static void main(String[] args) {
String fileContent = "Hi i am learning java.";
String searchString = "java\\.";
Pattern p = Pattern.compile(searchString);
Matcher m = p.matcher(fileContent );
while(m.find()) {
System.out.println(m.start() + " " + m.group());
}
}
It would print: 17 java. (note the dot in the end)
EDIT: As a very basic solution, since the only problem you have is with the dot, you can replace all the dots in your string with \\.
public static void main(String[] args) {
String fileContent = "Hi i am learning java.";
String searchString = "java.";
//this will do the trick even if the "searchString" doesn't contain a dot inside
searchString = searchString.replaceAll("\\.", "\\\\."); // the replacement needs a doubled backslash to emit a literal \.
Pattern p = Pattern.compile(searchString);
Matcher m = p.matcher(fileContent );
while(m.find()) {
System.out.println(m.start() + " " + m.group());
}
}
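Instead of escaping dots by hand, the standard Pattern.quote method turns the whole search string into a literal, which also covers other metacharacters such as ? or *. A minimal sketch, with a made-up class and helper name:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralSearchDemo {
    // Returns the index of the first literal occurrence of search, or -1 if there is none.
    static int indexOfLiteral(String text, String search) {
        Matcher m = Pattern.compile(Pattern.quote(search)).matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfLiteral("Hi i am learning java.", "java.")); // 17
        System.out.println(indexOfLiteral("Hi i am learning javax", "java.")); // -1: the dot is literal
    }
}
```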
"\\b" + searchstring + "(?:\\.|\\b)"
If you want to stipulate that the dot must be followed by a non-word character or the end of the string, you could add a positive look-ahead
"\\b" + searchstring + "(?:\\.(?=\\W|$)|\\b)"
Pattern p = Pattern.compile(".*\\W*" + searchWord + "\\W*.*", Pattern.CASE_INSENSITIVE);
To be explicit, the above says "find text that starts with 0 or more characters, followed by 0 or more non-word characters (\W*), followed by the search word, followed by 0 or more non-word characters, followed by anything else". Note that \W* can match zero characters, so on its own it does not enforce a word boundary: the pattern also accepts the search word inside a longer word.
This caters for situations where the search word is at the beginning of the file, at the very end, or between punctuation, e.g. "hi,I am learning,java.".
Hope this helps...
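For comparison, here is a sketch of the look-ahead variant suggested earlier in this thread, "\\b" + word + "(?:\\.(?=\\W|$)|\\b)", wrapped in a hypothetical helper. Pattern.quote is an addition of this sketch so that the word itself is always treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordWithDotDemo {
    // Returns the matched text (the word, optionally with a trailing dot) or null if absent.
    static String findWord(String text, String word) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(word) + "(?:\\.(?=\\W|$)|\\b)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findWord("Hi, I am learning Java.", "java"));
        System.out.println(findWord("I like javascript", "java"));
    }
}
```

Unlike the \W* pattern, this one rejects the word when it is only a prefix of a longer word.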

In Java how do you replace all instances of a character except the first one?

In Java, I am trying to find a regular expression that will match all instances of a specific character (:) except the first one. I want to replace all instances except the first with nothing.
I can do this,
Pattern p = Pattern.compile(":");
Matcher m = p.matcher(input);
String output = m.replaceAll("");
and there is also m.replaceFirst() but I want to replace everything but first.
Naive approach:
String[] parts = str.split(":", 2);
str = parts[0] + ":" + parts[1].replaceAll(":", "");
For regex replace use match pattern \G((?!^).*?|[^:]*:.*?): and as replacement use first group $1
See and test the regex code in Perl here.
public static void main(String[] args) {
String name ="1_2_3_4_5";
int index = name.indexOf("_");
String name1 = name.substring(index+1);
name1 = name1.replace("_", "#");
System.out.println(name.substring(0,index+1)+ name1);
}
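The index-based idea can be generalised into a small helper that also copes with input containing no separator at all (the split-based approach above would throw an ArrayIndexOutOfBoundsException in that case). A sketch, with hypothetical names:

```java
class KeepFirstDemo {
    // Keeps the first occurrence of c and removes every later one.
    static String removeAllButFirst(String s, char c) {
        int first = s.indexOf(c);
        if (first < 0) {
            return s; // nothing to remove
        }
        String head = s.substring(0, first + 1);
        // String.replace works on literals, so no regex escaping is needed here.
        String tail = s.substring(first + 1).replace(String.valueOf(c), "");
        return head + tail;
    }

    public static void main(String[] args) {
        System.out.println(removeAllButFirst("a:b:c:d", ':')); // a:bcd
    }
}
```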
You can use reg ex
String str1 = "A:B:C:D:E:F:G:H:I:J:K:L:M";
str1= str1.replaceAll("([:|_].*?):", "$1_");
str1= str1.replaceAll("([:|_].*?):", "$1_");
Here I can't modify the regex to get the output in the first replace itself. The first replaceAll actually replaces ':' with '_' only in alternate positions, which is why it is called twice.
if (matcher.find()) {
String start = originalString.substring(0, matcher.end());
matcher.reset(originalString.substring(matcher.end(), originalString.length()));
replacedString = start + matcher.replaceAll("");
}
