Remove part of String following regex match in Java - java

I want to remove a part of a string following what matches my regex.
I am trying to make a TV show organization program and I want to cut off anything in the name following the season and episode marker in the form SXXEXX where X is a digit.
I grasped the regex model fairly easily to create "[Ss]\d\d[Ee]\d\d" which should match properly.
I want to use the Matcher method end() to get the last index in the string of the match but it does not seem to be working as I think it should.
Pattern p = Pattern.compile("[Ss]\\d\\d[Ee]\\d\\d");
Matcher m = p.matcher(name);
if(m.matches())
return name.substring(0, m.end());
If someone could tell me why this doesn't work and suggest a proper way to do it, that would be great. Thanks.

matches() tries to match the whole string against the pattern. If you want to find your pattern within a string, use find(), which searches for the next match in the string.
Your code can stay almost the same:
if(m.find())
return name.substring(0, m.end());
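To make the difference concrete, here is a minimal sketch. The class name EpisodeTrimmer, the method trimAfterEpisode, and the sample file name are made up for illustration, and returning the name unchanged when no marker is found is an assumption about what the program should do in that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the original post.
final class EpisodeTrimmer {
    private static final Pattern EPISODE = Pattern.compile("[Ss]\\d\\d[Ee]\\d\\d");

    // Cuts the name off right after the first SxxEyy marker.
    // Assumption: a name without a marker is returned unchanged.
    static String trimAfterEpisode(String name) {
        Matcher m = EPISODE.matcher(name);
        return m.find() ? name.substring(0, m.end()) : name;
    }

    public static void main(String[] args) {
        String name = "Some.Show.S01E02.720p.mkv";
        // matches() requires the whole string to be the marker, so this prints false.
        System.out.println(EPISODE.matcher(name).matches());
        // find() locates the marker inside the string, so this prints Some.Show.S01E02
        System.out.println(trimAfterEpisode(name));
    }
}
```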

matches() matches against the entire string; try find() instead.
You could capture the name as well:
String name = "a movie S01E02 with some stuff";
Pattern p = Pattern.compile("(.*[Ss]\\d\\d[Ee]\\d\\d)");
Matcher m = p.matcher(name);
if (m.find())
System.out.println(m.group());
else
System.out.println("No match");
Will capture and print:
a movie S01E02

This should work
.*[Ss]\d\d[Ee]\d\d
In Java (I'm rusty) this will be
String ResultString = null;
Pattern regex = Pattern.compile(".*[Ss]\\d\\d[Ee]\\d\\d");
Matcher regexMatcher = regex.matcher("Title S11E11Blah");
if (regexMatcher.find()) {
ResultString = regexMatcher.group();
}
Hope this helps

Related

Regular expression with [a-zA-Z]{0,2}

I am trying to build a regular expression, but it is not giving me the correct value.
Bookss should match with the following :
Books
Bookss
Booksss
i.e. the string can match with one character fewer, one character more, or the same number of characters.
I tried building the regular expression for the above case, but it does not match.
The regular expression I tried is:
String str="Books"
Pattern p=Pattern.compile(str.substring(0,input.length()-1)+"[a-zA-Z]{0,2}"
Matcher matcher = p.matcher(str);
if (matcher.find())
{
System.out.println("Found");
}
You can accomplish this with a regex such as
/Book.?.?.?\b/
for example:
String str = "Books";
Pattern p = Pattern.compile(str.substring(0, str.length() - 1) + ".?.?.?\\b");
Matcher matcher = p.matcher(str);
if (matcher.find())
{
System.out.println("Found");
}
The expression
.?
matches zero or one instances of any character, whereas
\b
limits it to the boundary of a word. (If the input string may contain multiple spaces, leave this off.)
So
.?.?.?\b
will match up to three characters at the end of the word they are appended to.
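A small sketch of how this behaves in Java, assuming the fixed prefix "Book" from the question. The class name SuffixDemo and the method found are hypothetical. Note the doubled backslash: inside a Java string literal, "\b" on its own is a backspace character, not a word boundary.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class SuffixDemo {
    // "Book", then zero to three further characters, then a word boundary.
    private static final Pattern P = Pattern.compile("Book.?.?.?\\b");

    static boolean found(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Books", "Bookss", "Booksss", "Bookssss"}) {
            // The first three print true; "Bookssss" prints false because
            // four trailing characters are one more than .?.?.? allows.
            System.out.println(s + " -> " + found(s));
        }
    }
}
```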
Edit: missed the requirement of handling one character less or more.
Try the following:
Sample Code
String str="Booksss";
Pattern p=Pattern.compile(str.substring(0,str.length()-1)+".?{0,2}");
System.out.println(p);
Matcher matcher = p.matcher(str);
if (matcher.find())
{
System.out.println("Found");
}
Hope it will help you.
I found the solution. Thanks anyway, everybody, for answering.
String input="Book.s";
Pattern p=Pattern.compile(input.substring(0,input.length()-1)+"[a-zA-Z]{0,2}$");
Matcher matcher = p.matcher("Books");
if (matcher.find())
{
System.out.println("Matches Regular Expression");
}

Java regex extract capture group if it exists

I apparently don't understand Java's regex library or regex either for that matter.
for this string:
String text = "asdf 2013-05-12 asdf";
this regex explodes in my face:
String REGEX_FORMAT_1 = ".+?([0-9]{4}\\s?-\\s?[0-9]{2}\\s?-\\s?[0-9]{2}).+";
Matcher matcher_1 = PATTERN_FORMAT_1.matcher(text);
if(matcher_1.matches()) {
String matchedGroup = matcher_1.group();
...
}
Semantically this makes sense to me but it seems I've totally misunderstood something. The regex works fine in some online regex editors like regex101 but not in others. Could someone please help me understand why I don't get the capture group containing 2013-05-12 ...
group() is equivalent to group(0) and returns the entire matched string. Use group(1) to pull out the first matched group.
String text = "asdf 2013-05-12 asdf";
String regex = ".+?([0-9]{4}\\s?-\\s?[0-9]{2}\\s?-\\s?[0-9]{2}).+";
Matcher matcher = Pattern.compile(regex).matcher(text);
if (matcher.matches()) {
String matchedGroup = matcher.group(1);
System.out.println(matchedGroup);
}
Output:
2013-05-12
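To see group() and group(1) side by side, here is a short sketch using the same regex and text. The class name GroupDemo and the method groups are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class GroupDemo {
    private static final Pattern DATE = Pattern.compile(
            ".+?([0-9]{4}\\s?-\\s?[0-9]{2}\\s?-\\s?[0-9]{2}).+");

    // Returns { whole match, first capture group }, or an empty array on no match.
    static String[] groups(String text) {
        Matcher m = DATE.matcher(text);
        if (!m.matches()) {
            return new String[0];
        }
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("asdf 2013-05-12 asdf");
        System.out.println(g[0]); // asdf 2013-05-12 asdf
        System.out.println(g[1]); // 2013-05-12
    }
}
```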

Regular Expression to match a string that does not contain specific string in Java

I need a regular expression that matches a substring of the string /*exa*/mple*/.
The matched string must be /*exa*/, not /*exa*/mple*/.
It also must not contain "*/" in it.
I have tried these regex:
"/\\*[.*&&[^*/]]\\*/" ,
"/\\*.*&&(?!^*/$)\\*/"
but I'm not able to get the exact solution.
I understand you want to pick out comments from a text.
Pattern p = Pattern.compile("/\\*.*?\\*/");
Matcher m = p.matcher("/*ex*a*/mple*/and/*more*/ther*/");
while (m.find()){
System.out.println(m.group());
}
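The reason this works is the reluctant quantifier .*?, which stops at the first "*/" it can reach, whereas a greedy .* runs on to the last one. A minimal sketch comparing the two follows; the class name CommentFinder and the method findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class CommentFinder {
    // Collects every match of the given regex in the text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "/*exa*/mple*/";
        // Reluctant: stops at the first closing "*/".
        System.out.println(findAll("/\\*.*?\\*/", text)); // [/*exa*/]
        // Greedy: runs to the last closing "*/".
        System.out.println(findAll("/\\*.*\\*/", text));  // [/*exa*/mple*/]
    }
}
```

By default "." does not match line terminators, so comments spanning several lines would need Pattern.DOTALL.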
You can try this:
/\*[^\*\/\*]+\*/ --> matches "/*", then one or more characters that are neither "*" nor "/", then "*/"
Here is a sample:
Pattern p = Pattern.compile("/\\*[^\\*\\/\\*]+\\*/");
Matcher m = p.matcher("/*exa*/mple*/");
while (m.find()){
System.out.println(m.group());
}
OUTPUT:
/*exa*/

Java: Need to extract a number from a string

I have a string containing a number. Something like "Incident #492 - The Title Description".
I need to extract the number from this string.
Tried
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(theString);
String substring =m.group();
But I am getting an error:
java.lang.IllegalStateException: No match found
What am I doing wrong?
What is the correct expression?
I'm sorry for such a simple question, but I searched a lot and still haven't found how to do this (maybe because it's too late here...)
You are getting this exception because you need to call find() on the matcher before accessing groups:
Matcher m = p.matcher(theString);
while (m.find()) {
String substring =m.group();
System.out.println(substring);
}
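If only the first number is needed, the loop can be reduced to a single find() call. A minimal sketch, assuming that returning null for "no number" is acceptable; the class name NumberExtractor and the method firstNumber are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class NumberExtractor {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the string,
    // or null (an assumption) when the string contains none.
    static String firstNumber(String s) {
        Matcher m = DIGITS.matcher(s);
        // find() must succeed before group() may be called.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("Incident #492 - The Title Description")); // 492
    }
}
```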
There are two things wrong here:
The pattern you're using is not ideal for your scenario: on its own, \\d+ matches only a run of digits, so matches() would succeed only if the entire string were numeric. Also, since it doesn't contain a group expression, a call to group() is equivalent to calling group(0), which returns the entire match, not the entire input string.
You need to be certain that the matcher has a match before you go calling a group.
Let's start with the regex. As it stands, the pattern is \\d+, a run of one or more digits.
Used with matches(), that will only ever match a string made up entirely of digits. What you care about is specifically the number in that string, so you want an expression that:
Doesn't care about what's in front of it
Doesn't care about what's after it
Only matches on one occurrence of numbers, and captures it in a group
To do that, you'd use this expression:
.*?(\\d+).*
The last part is to ensure that the matcher can find a match, and that it gets the correct group. That's accomplished by this:
if (m.matches()) {
String substring = m.group(1);
System.out.println(substring);
}
All together now:
Pattern p = Pattern.compile(".*?(\\d+).*");
final String theString = "Incident #492 - The Title Description";
Matcher m = p.matcher(theString);
if (m.matches()) {
String substring = m.group(1);
System.out.println(substring);
}
You need to invoke one of the Matcher methods, like find, matches or lookingAt to actually run the match.
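The three methods differ in how much of the input they must cover: matches() needs the whole string, lookingAt() needs a match starting at the beginning, and find() accepts a match anywhere. A small sketch with the \\d+ pattern from the question; the class name MatchMethods and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
final class MatchMethods {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only if the entire string is digits.
    static boolean whole(String s) { return DIGITS.matcher(s).matches(); }

    // True if the string starts with digits.
    static boolean prefix(String s) { return DIGITS.matcher(s).lookingAt(); }

    // True if digits occur anywhere in the string.
    static boolean anywhere(String s) { return DIGITS.matcher(s).find(); }

    public static void main(String[] args) {
        String s = "Incident #492";
        System.out.println(whole(s));    // false
        System.out.println(prefix(s));   // false
        System.out.println(anywhere(s)); // true
    }
}
```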

Regular expression not finding a match

I am trying to match the word Salvage in this string, but the code is not picking it up. Where am I going wrong?
//String to match
String titleString = "<td><i>Salvage</i></td>";
System.out.println(titleString);
//Template
String template = ">(.*)</a>";
//
Pattern p=Pattern.compile(template);
Matcher matcher = p.matcher(titleString);
System.out.println(matcher.group(1));
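Try putting a matcher.find() call just before matcher.group(1).
group() returns the group from the last match. Since no match has been attempted yet, there is nothing to return and group(1) throws an IllegalStateException. Note also that the template ends in </a> while the string contains </i>, so find() will still return false until the template is adjusted.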
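A minimal sketch of a working version for the sample string from the question. The pattern <i>(.*?)</i> is an assumption about what the asker wants to capture (the original template ended in </a>, which never occurs in that string), and the class name TitleExtractor and the method title are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class TitleExtractor {
    // Assumed pattern: capture the text between <i> and </i>.
    private static final Pattern TITLE = Pattern.compile("<i>(.*?)</i>");

    // Returns the captured text, or null (an assumption) when nothing matches.
    static String title(String html) {
        Matcher m = TITLE.matcher(html);
        // Call find() first; group(1) is only valid after a successful match.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(title("<td><i>Salvage</i></td>")); // Salvage
    }
}
```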
