I need to write a program that matches a pattern against a line. The pattern may be a regular expression or a plain string.
Example:
If the pattern is "tiger", then only a line that consists solely of "tiger" should match.
If the pattern is "^t", then lines that start with "t" should match.
I have done this with the Pattern and Matcher classes.
The problem is that when I use Matcher.find(), all regular expressions match, but when I give a full pattern it does not match.
If I use matches(), only complete patterns match, not regular expressions.
My code:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class MatchesLooking
{
private static final String REGEX = "^f";
private static final String INPUT =
"fooooooooooooooooo";
private static Pattern pattern;
private static Matcher matcher;
public static void main(String[] args)
{
// Initialize
pattern = Pattern.compile(REGEX);
matcher = pattern.matcher(INPUT);
System.out.println("Current REGEX is: "
+ REGEX);
System.out.println("Current INPUT is: "
+ INPUT);
System.out.println("find(): "
+ matcher.find());
System.out.println("matches(): "
+ matcher.matches());
}
}
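With a regex of ^t, matches() succeeds only when the whole string is just t, because matches() requires the pattern to match the entire input.
For it to match, the pattern must cover the rest of the string as well. You can do that by appending .*, which matches zero or more of any character (except line terminators, by default).
"^t.*"
Also, ^ (and likewise $) is redundant with matches(), since it already anchors the pattern at both ends of the input.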
I hope that helps, I'm not entirely clear on what you're struggling with. Feel free to clarify.
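To make the difference between the three Matcher methods concrete, here is a minimal sketch. The class and method names (FindVsMatches, foundIn, matchesWhole, matchesPrefix) are made up for illustration; only find(), matches() and lookingAt() come from java.util.regex.

```java
import java.util.regex.Pattern;

class FindVsMatches {

    // find(): true if the pattern occurs somewhere in the line
    // (anchors such as ^ inside the pattern still apply).
    static boolean foundIn(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    // matches(): true only if the pattern consumes the entire line.
    static boolean matchesWhole(String regex, String line) {
        return Pattern.compile(regex).matcher(line).matches();
    }

    // lookingAt(): true if the pattern matches a prefix of the line.
    static boolean matchesPrefix(String regex, String line) {
        return Pattern.compile(regex).matcher(line).lookingAt();
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"tiger", "tiger"},
            {"tiger", "a tiger sleeps"},
            {"^t", "tiger"},
            {"^t.*", "tiger"},
        };
        for (String[] c : cases) {
            System.out.printf("regex=%-6s line=%-15s find=%-5b matches=%-5b lookingAt=%b%n",
                    c[0], c[1],
                    foundIn(c[0], c[1]),
                    matchesWhole(c[0], c[1]),
                    matchesPrefix(c[0], c[1]));
        }
    }
}
```

With these inputs, "^t" against "tiger" is found by find() and lookingAt() but rejected by matches(), while "^t.*" is accepted by all three. Which method fits depends on whether a plain pattern such as "tiger" should match the whole line or any part of it.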
To print every match, call find() in a loop and read each match with group():
while (matcher.find()) {
System.out.println(matcher.group());
}
If you're sure there can be only one match in the input, you could also use the following. Note that group() throws IllegalStateException unless the preceding find() returned true:
if (matcher.find()) {
System.out.println("group(): " + matcher.group());
}
I am looking for help with a regex that will match the studentIdMatch2 value in the class below. studentIdMatch1 matches fine. However, the studentId in studentIdMatch2 may contain any special character other than :, ^ and comma, so the pattern does not work for it. Thank you for your time; I appreciate your support.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TestRegEx {
public static void main(String args[]){
String studentIdMatch1 = "studentName:harry,^studentId:Id123";
String studentIdMatch2 = "studentName:harry,^studentId:Id-H/MPU/L&T/OA+_T/(1490)/17#)123";
Pattern pattern = Pattern
.compile("(\\p{Punct}?)(\\w+?)(:)(\\p{Punct}?)(\\w+?)(\\p{Punct}?),");
Matcher matcher = pattern.matcher(studentIdMatch1 + ","); // Works Fine(Matches Student Name and Id)
// No Special Characters in StudentId
//Matcher matcher = pattern.matcher(studentIdMatch2 + ","); //Wont work Special Characters in StudentId. Matches Student Name
while (matcher.find()) {
System.out.println("group1 = "+matcher.group(1)+ "group2 = "+matcher.group(2) +"group3 = "+matcher.group(3) +"group4 = "+matcher.group(4)+"group5 = "+matcher.group(5));
}
System.out.println("match ended");
}
}
You may try:
^SutdentName:(\w+),\^StudenId:([^\s,^:]+)$
Explanation of the above regex:
^, $ - Represents start and end of line respectively.
SutdentName: - Matches SutdentName: literally. I believe it should be StudentName, but I did not change it.
(\w+) - Represents first capturing group matching only word characters i.e. [A-Za-z0-9_] one or more times greedily.
,\^StudenId: - Matches ,^StudenId: literally. Here too, I guess it should be StudentId.
([^\s,^:]+) - Represents second capturing group matching everything other than white-space, ,, ^ and : one or more times greedily. You can add others according to your requirements.
Sample Implementation in java:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Main
{
private static final Pattern pattern = Pattern.compile("^SutdentName:(\\w+),\\^StudenId:([^\\s,^:]+)$", Pattern.MULTILINE);
public static void main(String[] args) {
String string = "SutdentName:harry,^StudenId:Id123\n"
+ "SutdentName:harry,^StudenId:Id-H/MNK/U&T/BA+_T/(1490)/17#)123";
Matcher matcher = pattern.matcher(string);
while(matcher.find()){
System.out.println(matcher.group(1) + " " + matcher.group(2));
}
}
}
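The second (\\w+?) matches only word characters, so change it to match what you want, i.e.
allow all the special characters other than : and ^ and comma
for example ([^:^,]+?)
The leading ^ inside the brackets negates the character class.
:^, - the characters the class excludes: the colon, a literal ^ and the comma.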
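The negated character class can be tried on the input from the question. This is only a sketch: the class name StudentFieldParser, the parse helper and the simplified key:value pattern are assumptions made for illustration, not the asker's original pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StudentFieldParser {

    // Key: one or more word characters.
    // Value: one or more characters that are not ':', '^' or ','.
    private static final Pattern FIELD = Pattern.compile("(\\w+):([^:^,]+)");

    // Collects every key:value pair found in the input, in order of appearance.
    static Map<String, String> parse(String input) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(input);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String input = "studentName:harry,^studentId:Id-H/MPU/L&T/OA+_T/(1490)/17#)123";
        parse(input).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Because the value class excludes the comma, each value stops at the separator, so no trailing comma has to be appended to the input.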
I have this String:
String filename = "20190516.BBARC.GLIND.statistics.xml";
How can I get the first part of the String (numbers) without the use of substrings.
Here, we might just want to collect our digits using a capturing group, and if we wish, we could later add more boundaries, maybe with an expression as simple as:
([0-9]+)
For instance, if our desired digits are at the start of our inputs, we might want to add a start char as a left boundary:
^([0-9]+)
Or if our digits are always followed by a ., we can bound it with that:
^([0-9]+)\.
We can also add an uppercase letter after that to strengthen the right boundary, and continue this process if necessary:
^([0-9]+)\.[A-Z]
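The anchored variant ^([0-9]+)\. can be tried in Java roughly like this. It is a sketch under assumptions: the class name LeadingDigits and the extract helper are hypothetical, and returning an Optional is just one way to signal "no match".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeadingDigits {

    // Digits at the very start of the input, followed by a literal dot.
    private static final Pattern LEADING_DIGITS = Pattern.compile("^([0-9]+)\\.");

    // Returns the captured digits, or an empty Optional if the input does not start with them.
    static Optional<String> extract(String filename) {
        Matcher m = LEADING_DIGITS.matcher(filename);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("20190516.BBARC.GLIND.statistics.xml").orElse("no match"));
        System.out.println(extract("BBARC.20190516.xml").orElse("no match"));
    }
}
```

find() is used rather than matches() because the pattern describes only the start of the file name, not all of it.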
RegEx
If this expression isn't what you need, it can be modified and tested at regex101.com.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
    public static void main(String[] args) {
        // Group 1 captures the digits; (.*) consumes the rest so the replacement drops it
        final String regex = "([0-9]+)(.*)";
        final String string = "20190516.BBARC.GLIND.statistics.xml";
        // Java replacement strings refer to groups as $1, not \1
        final String subst = "$1";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        // The substituted value will be contained in the result variable
        final String result = matcher.replaceAll(subst);
        System.out.println("Substitution result: " + result);
    }
}
Demo
const regex = /([0-9]+)(.*)/gm;
const str = `20190516.BBARC.GLIND.statistics.xml`;
const subst = `$1`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
To extract a part or parts of string using regex I prefer to define groups.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class B {
public static void main(String[] args) {
String in="20190516.BBARC.GLIND.statistics.xml";
Pattern p=Pattern.compile("(\\w+).*");
Matcher m=p.matcher(in);
if(m.matches())
System.out.println(m.group(1));
else
System.out.println("no match");
}
}
I have a very large String containing within it some markers like:
{codecitation class="brush: java; gutter: true;" width="700px"}
I need to collect all the markers contained in the long String. The difficulty is that the markers all contain different parameter values. The only thing they have in common is the initial part:
{codecitation class="brush: [VARIABLE PART] }
Do you have any suggestions for collecting all the markers in Java using a regular expression?
Use pattern matching to find the markers as below. I hope this will help.
String xmlString = "{codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}efasf{codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}";
Pattern pattern = Pattern.compile("(\\{codecitation)([0-9 a-z A-Z \":;=]{0,})(\\})");
Matcher matcher = pattern.matcher(xmlString);
while (matcher.find()) {
System.out.println(matcher.group());
}
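A slightly more permissive variant of the same idea is sketched below. It assumes that a marker never contains a } before its closing brace, so [^}]* can stand in for the variable part whatever characters the parameters use. The class name MarkerCollector and the collect helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerCollector {

    // Fixed prefix, then any run of characters other than '}', then the closing brace.
    private static final Pattern MARKER =
            Pattern.compile("\\{codecitation class=\"brush:[^}]*\\}");

    // Returns every marker in the text, in order of appearance.
    static List<String> collect(String text) {
        List<String> markers = new ArrayList<>();
        Matcher m = MARKER.matcher(text);
        while (m.find()) {
            markers.add(m.group());
        }
        return markers;
    }

    public static void main(String[] args) {
        String text = "intro {codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}"
                + " middle {codecitation class=\"brush: xml;\"} end";
        collect(text).forEach(System.out::println);
    }
}
```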
I guess you are particularly interested in the brush: java; and gutter: true; parts.
Maybe this snippet helps:
package test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class CodecitationParserTest {
public static void main(String[] args) {
String testString = "{codecitation class=\"brush: java; gutter: true;\" width=\"700px\"}";
Pattern codecitationPattern = Pattern
.compile("\\{codecitation class=[\"]([^\"]*)[\"][^}]*\\}");
Matcher matcher = codecitationPattern.matcher(testString);
Pattern attributePattern = Pattern
.compile("\\s*([^:]*): ([^;]*);(.*)$");
Matcher attributeMatcher;
while (matcher.find()) {
System.out.println(matcher.group(1));
attributeMatcher = attributePattern.matcher(matcher.group(1));
while (attributeMatcher.find()) {
System.out.println(attributeMatcher.group(1) + "->"
+ attributeMatcher.group(2));
attributeMatcher = attributePattern.matcher(attributeMatcher
.group(3));
}
}
}
}
The codecitationPattern extracts the content of the class attribute of a codecitation element. The attributePattern extracts the first key, its value and the remainder of the string, so you can apply it repeatedly to the remainder until no pair is left.
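As an alternative to re-matching the remainder, a single pattern that describes one key: value; pair can be walked with find(). This is a sketch, assuming every pair in the class attribute ends with a semicolon as in the sample; the names BrushAttributes and parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BrushAttributes {

    // Optional whitespace, a key without ':' or ';', a colon,
    // optional whitespace, a value without ';', and the closing semicolon.
    private static final Pattern PAIR = Pattern.compile("\\s*([^:;]+):\\s*([^;]*);");

    // Parses a class attribute value such as "brush: java; gutter: true;".
    static Map<String, String> parse(String classValue) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(classValue);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        parse("brush: java; gutter: true;")
                .forEach((key, value) -> System.out.println(key + "->" + value));
    }
}
```

Each call to find() resumes where the previous match ended, so no new Matcher has to be created per pair.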
Can someone help me with a regex to match a string that starts as follows?
The string can begin with any HTML tag, e.g.
< span > or < p > etc. So basically I want a regex that checks whether a string begins with any opening HTML tag <> followed by [apple videoID=
E.g.:
<span>[apple videoID=
Here's what I've tried:
static String pattern = "^<[^>]+>[apple videoID=";
static Pattern pattern1 = Pattern.compile(pattern);
What is wrong with the above?
You have a typo in the following line.
static String pattern = "^<[^>]+>[apple videoID=";
This string is not a valid regular expression because you have an unclosed [ right before the word apple, hence the "Unclosed character class" PatternSyntaxException. You either meant to type
static String pattern = "^<[^>]+><apple videoID=";
assuming that apple is an html tag, or
static String pattern = "^<[^>]+>\\[apple videoID=";
if you really did want the [ in front of apple. This is because [ is a special character in regular expressions and must be escaped with a \ which is a special character in Java strings and must be escaped with a \. Therefore \\[.
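The escaping rule can be checked with a small sketch. The class and helper names (BracketEscaping, startsWithTagThenApple, compiles) are hypothetical. Pattern.quote is shown as an alternative to escaping by hand: it makes the whole quoted text literal.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BracketEscaping {

    // "\\[" in Java source is the regex \[ : a literal opening bracket.
    private static final Pattern ESCAPED =
            Pattern.compile("^<[^>]+>\\[apple videoID=");

    // Pattern.quote makes every character of the quoted text literal.
    private static final Pattern QUOTED =
            Pattern.compile("^<[^>]+>" + Pattern.quote("[apple videoID="));

    static boolean startsWithTagThenApple(String s) {
        return ESCAPED.matcher(s).find();
    }

    static boolean startsWithTagThenAppleQuoted(String s) {
        return QUOTED.matcher(s).find();
    }

    // True if the regex compiles, false if it is syntactically invalid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("^<[^>]+>[apple videoID="));    // unescaped [
        System.out.println(compiles("^<[^>]+>\\[apple videoID="));  // escaped [
        System.out.println(startsWithTagThenApple("<span>[apple videoID=123]"));
        System.out.println(startsWithTagThenAppleQuoted("<p>[apple videoID=123]"));
    }
}
```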
As simple as this:
^<[^>]+>\[apple videoID=.*
Try this pattern:
"^<[A-Za-z]+>\\[apple videoID="
Used with find() or lookingAt(), this matches a string that begins with a simple tag such as <span> followed by [apple videoID=
Hope this helps!
Here is a solution.
Pattern.CASE_INSENSITIVE makes the pattern match regardless of upper or lower case.
Tested and executed.
package sireesh.yarlagadda;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PatternDemo {
public static void main(String[] args) {
String text="<span><apple videoID=";
String patternString = "<[a-zA-Z]*>\\<apple videoID=";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
System.out.println("lookingAt = " + matcher.lookingAt());
System.out.println("matches = " + matcher.matches());
}
}
I have a String that contains newline characters, say:
String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";
I want to know how to check for the existence of a particular word, say Joker, in the string str using java.lang.String.matches().
I find that str.matches(".*Joker.*") returns false, and returns true if I remove the newline characters. So what regex should be passed as the argument to str.matches()?
One way is: str.replaceAll("\\n", "").matches(".*Joker.*");
The problem is that the dot in .* does not match newlines by default. If you want newlines to be matched, your regex must have the flag Pattern.DOTALL.
If you want to embed that in a regex used in .matches() the regex would be:
"(?s).*Joker.*"
However, note that this will match Jokers too. A regex does not have the notion of words. Your regex would therefore really need to be:
"(?s).*\\bJoker\\b.*"
However, a regex does not need to match all its input text (which is what .matches() does, counterintuitively), only what is needed. Therefore, this solution is even better, and does not require Pattern.DOTALL:
Pattern p = Pattern.compile("\\bJoker\\b"); // \b is the word anchor
p.matcher(str).find(); // returns true
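The three variants discussed above can be compared side by side. A minimal sketch; the class name MultilineWordSearch and the containsJoker helper are made up for illustration.

```java
import java.util.regex.Pattern;

class MultilineWordSearch {

    // \b is a word boundary, so "Jokers" does not count as the word "Joker".
    private static final Pattern JOKER_WORD = Pattern.compile("\\bJoker\\b");

    // find() only needs the pattern to occur somewhere, so newlines are no obstacle.
    static boolean containsJoker(String text) {
        return JOKER_WORD.matcher(text).find();
    }

    public static void main(String[] args) {
        String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";

        // The dot does not match line terminators by default: false
        System.out.println(str.matches(".*Joker.*"));

        // (?s) switches on DOTALL inside the regex: true
        System.out.println(str.matches("(?s).*Joker.*"));

        // find() with word boundaries: true
        System.out.println(containsJoker(str));

        // "Jokers" is a different word: false
        System.out.println(containsJoker("Hello\nJokers\n"));
    }
}
```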
You can do something much simpler; this is a contains. You do not need the power of regex:
public static void main(String[] args) throws Exception {
final String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";
System.out.println(str.contains("Joker"));
}
Alternatively you can use a Pattern and find:
public static void main(String[] args) throws Exception {
final String str = "Hello\n" + "Batman,\n" + "Joker\n" + "here\n";
final Pattern p = Pattern.compile("Joker");
final Matcher m = p.matcher(str);
if (m.find()) {
System.out.println("Found match");
}
}
You want to use a Pattern that uses the DOTALL flag, which says that a dot should also match new lines.
String str = "Hello\n"+"Batman,\n" + "Joker\n" + "here\n";
Pattern regex = Pattern.compile(".*Joker.*", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(str);
if (regexMatcher.find()) {
// found a match
}
else
{
// no match
}