Using Pattern and Matcher to search for special characters (Example: $) - java

Apologies if this has already been answered.
I am using the following code to search for a substring:
String subject = "ABC";
String subString = "AB";
Pattern pattern = Pattern.compile(subString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(subject);
while (matcher.find()){
//Matched
}
But when my subject string contains a $ at the beginning, it does not work, since $ is a special character.
String subject = "$ABC";
String subString = "$";
How does one handle that?

By escaping the special character in the subString. Like,
String subString = "\\$";
or telling the Pattern to match literals. Like,
Pattern pattern = Pattern.compile(subString, Pattern.LITERAL | Pattern.CASE_INSENSITIVE);

Regex has a number of metacharacters. The ones supported by the Java regex engine are
( ) [ ] { } \ ^ $ | ? * + . < > - = !
So $ is indeed a metacharacter here. A metacharacter conveys special meaning to the regex engine and hence can't be used literally. To match one literally, precede it with the escape character, which is the backslash \. Only the pattern needs escaping, not the subject:
String subject = "$ABC";
String subString = "\\$";
would do. The backslash is doubled because \ is also the escape character in Java string literals, so "\\$" in source code reaches the regex engine as \$.
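To make the difference between the two suggested fixes concrete, here is a minimal sketch. The class name DollarSearch and the helper firstIndex are made up for illustration; Pattern.quote is a standard alternative not mentioned in the answers above. It reports where the first match starts for each way of writing the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class DollarSearch {
    // Returns the start index of the first match, or -1 if there is none.
    static int firstIndex(String regex, int flags, String subject) {
        Matcher m = Pattern.compile(regex, flags).matcher(subject);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String subject = "$ABC";
        int ci = Pattern.CASE_INSENSITIVE;

        // Unescaped $ is the end anchor: it "matches" the empty string at the end.
        System.out.println(firstIndex("$", ci, subject));                    // 4
        // Escaped $ matches the literal character at the start.
        System.out.println(firstIndex("\\$", ci, subject));                  // 0
        // Pattern.LITERAL treats the whole pattern as plain text.
        System.out.println(firstIndex("$", Pattern.LITERAL | ci, subject));  // 0
        // Pattern.quote wraps arbitrary text so it is matched literally.
        System.out.println(firstIndex(Pattern.quote("$"), ci, subject));     // 0
    }
}
```

Pattern.LITERAL or Pattern.quote is probably the safer choice when the substring comes from user input, since no metacharacter has to be escaped by hand.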

Related

Regex to match the beginning and the end of a string in Java

I want to extract a certain kind of string using regex in Java. I currently have this pattern:
pattern = "^\\a.+\\sed$\n";
It is supposed to match a string that starts with "a" and ends with "sed". This is not working. Did I miss something?
Removed the \n line at the end of the pattern and replaced it with a "$":
Still doesn't get a match. The regex looks legit from my side.
What I want to extract is the "a sed" from the temp string.
String temp = "afsgdhgd gfgshfdgadh a sed afdsgdhgdsfgdfagdfhh";
pattern = "(?s)^a.*sed$";
pr = Pattern.compile(pattern);
math = pr.matcher(temp);
UPDATE
You want to match a sed, so you can use a\\s+sed if there is only whitespace between a and sed:
String s = "afsgdhgd gfgshfdgadh a sed afdsgdhgdsfgdfagdfhh";
Pattern pattern = Pattern.compile("a\\s+sed");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(0));
}
Now, if there can be anything between a and sed, use a tempered greedy token:
Pattern pattern = Pattern.compile("(?s)a(?:(?!a|sed).)*sed");
The tempered greedy token is the (?:(?!a|sed).)* part.
ORIGINAL ANSWER
The main problem with your regex is the \n at the end. $ is the end of string, and you try to match one more character after a string end, which is impossible. Also, \\s matches a whitespace symbol, but you need a literal s.
You need to remove \\s and \n and make . match a newline, and it is also advisable to use the * quantifier to allow 0 symbols in between:
pattern = "(?s)^a.*sed$";
The regex matches:
^ - start of string
a - a literal a
.* - 0 or more any characters (since (?s) modifier makes a . match any character including a newline)
sed - a literal letter sequence sed
$ - end of string
Your temp string cannot match the pattern (?s)^a.*sed$, because this pattern says that your temp string must begin with the character a and end with the sequence sed, which is not the case. Your string has trailing characters after the "sed" sequence.
If you only want to extract that a...sed portion of the whole string, try using the unanchored pattern "a.*sed" and use the find() method of the Matcher class:
Pattern pattern = Pattern.compile("a.*sed");
Matcher m = pattern.matcher(temp);
if (m.find())
{
System.out.println("Found string "+m.group());
System.out.println("From "+m.start()+" to "+m.end());
}
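The patterns discussed in this thread behave quite differently on the same input, which a small comparison makes visible. The following is only a sketch: the class name SedExtract and the helper firstMatch are hypothetical, while the patterns and the temp string are taken from the posts above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the patterns from the answers above.
class SedExtract {
    // Returns the first substring matched by regex, or null if nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String temp = "afsgdhgd gfgshfdgadh a sed afdsgdhgdsfgdfagdfhh";

        // Anchored pattern: the string does not end with "sed", so no match.
        System.out.println(firstMatch("(?s)^a.*sed$", temp));            // null
        // Greedy unanchored pattern: starts at the very first 'a'.
        System.out.println(firstMatch("a.*sed", temp));                  // afsgdhgd gfgshfdgadh a sed
        // Only whitespace allowed between "a" and "sed".
        System.out.println(firstMatch("a\\s+sed", temp));                // a sed
        // Tempered greedy token: no further 'a' or "sed" in between.
        System.out.println(firstMatch("(?s)a(?:(?!a|sed).)*sed", temp)); // a sed
    }
}
```

Note that the unanchored a.*sed finds a match but returns more than the "a sed" the question asks for, because the match begins at the first a in the string.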

Regular expression java to extract the balance from a string

I have a String which contains " Dear user BAL= 1,234/ ".
I want to extract 1,234 from the String using the regular expression. It can be 1,23, 1,2345, 5,213 or 500
final Pattern p=Pattern.compile("((BAL)=*(\\s{1}\\w+))");
final Matcher m = p.matcher(text);
if(m.find())
return m.group(3);
else
return "";
This returns 3.
What regular expression should I make? I am new to regular expressions.
You search in your regex for word characters \w+ but you should search for digits with \d+.
Additionally there is the comma, so you need to match that as well.
I'd use
/.*BAL=\s([\d,]+(?=/)).*/
as pattern and get only the number in the resulting group.
Explanation:
.* match anything before
BAL= match the string "BAL="
\s match a whitespace
( start matching group
[\d,]+ matches any digit or comma one or more times
(?=/) match the former only if followed by a slash
) end matching group
.* matches anything thereafter
This is untested, but it should work like this:
final Pattern p=Pattern.compile(".*BAL=\\s([\\d,]+(?=/)).*");
final Matcher m = p.matcher(text);
if(m.find())
return m.group(1);
else
return "";
According to an online tester, the pattern above matches the text:
BAL= 1,234/
If it didn't have to be extracted by the regular expression you could simply do:
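// trim first so the leading space does not produce an empty first element,
// then split on any whitespace into a 4-element array
String[] foo = text.trim().split("\\s+");
// foo[3] is "1,234/", so drop the trailing slash
return foo[3].replace("/", "");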
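The regex approach can be put into a self-contained form as follows. This is a sketch, not the answerer's exact code: the class name BalanceExtract and the method extractBalance are invented here, and the pattern is shortened by dropping the leading and trailing .* (find() does not need them) and by allowing any amount of whitespace after BAL=.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class BalanceExtract {
    // "BAL=", optional whitespace, then digits and commas that are followed by a slash.
    private static final Pattern BALANCE = Pattern.compile("BAL=\\s*([\\d,]+)(?=/)");

    // Returns the balance as it appears in the text, or "" if none is found.
    static String extractBalance(String text) {
        Matcher m = BALANCE.matcher(text);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extractBalance(" Dear user BAL= 1,234/ ")); // 1,234
        System.out.println(extractBalance(" Dear user BAL= 500/ "));   // 500
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.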

How to create a java regular expression pattern that would match a string only at certain positon?

I would like to create a regular expression pattern that matches only if the pattern string is not followed by any other text in the input string. Here is what I tried:
Pattern p = Pattern.compile("google.com");//I want to know the right format
String input1 = "mail.google.com";
String input2 = "mail.google.com.co.uk";
Matcher m1 = p.matcher(input1);
Matcher m2 = p.matcher(input2);
boolean found1 = m1.find();
boolean found2 = m2.find();//This should be false because "google.com" is followed by ".co.uk" in input2 string
Any help would be appreciated!
Your pattern should be google\.com$. The $ character matches the end of a line. Read about regex boundary matchers for details.
Here is how to match and get the non-matching part as well.
Here is the raw regex pattern:
^(.*)google\.com$
^ - match beginning of string
(.*) - capture everything in a group up to the next match
google - matches google literal
\. - matches a literal . (the dot has to be escaped with \)
com - matches com literal
$ - matches end of string
Note: In Java the \ in the String literal has to be escaped as well! ^(.*)google\\.com$
You should use google\.com$. $ character matches the end of a line.
Pattern p = Pattern.compile("google\\.com$");//I want to know the right format
String input2 = "mail.google.com.co.uk";
Matcher m2 = p.matcher(input2);
boolean found2 = m2.find();
System.out.println(found2);
Output = false
Pattern p = Pattern.compile("google\\.com$");
The dollar sign means it has to occur at the end of the line/string being tested. Note too that your dot will match any character, so if you want it to match a dot only, you need to escape it.
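A short runnable sketch of the anchored pattern, using both inputs from the question. The class name SuffixMatch and its two methods are made-up names for this example; the pattern is the ^(.*)google\.com$ form given above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the end-anchored pattern.
class SuffixMatch {
    // Group 1 captures whatever precedes "google.com" at the end of the input.
    private static final Pattern SUFFIX = Pattern.compile("^(.*)google\\.com$");

    static boolean endsWithGoogleCom(String input) {
        return SUFFIX.matcher(input).find();
    }

    // Returns the part before "google.com", or null if the input does not end with it.
    static String prefixOf(String input) {
        Matcher m = SUFFIX.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(endsWithGoogleCom("mail.google.com"));       // true
        System.out.println(endsWithGoogleCom("mail.google.com.co.uk")); // false
        System.out.println(prefixOf("mail.google.com"));                // mail.
    }
}
```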

Java regex validating \[]

My issue is to allow only a-z, A-Z, 0-9, dots, dashes, underscores and [] in a given string.
Here is my code, but it is not working so far.
[a-zA-Z0-9._-]* works fine for validating a-z, A-Z, 0-9, dots, dashes and underscores, but when I add [] I get an "Illegal character" error.
[a-zA-Z0-9._-\\[]]*
Obviously the [] broke the regex.
Any suggestions on how to handle this problem?
String REGEX = "[a-zA-Z0-9._-\\[]]*";
String username = "dsfsdf_12";
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher(username);
if (matcher.matches()) {
System.out.println("matched");
} else {
System.out.println("NOT matched");
}
You have to escape both [ and ], and also the hyphen (otherwise _-\[ is read as a character range), as shown below:
"[a-zA-Z0-9._\\-\\[\\]]*"
Try escaping both brackets and the minus sign :
String REGEX = "[a-zA-Z0-9._\\-\\[\\]]*";
Edit after your comment for "/" and "\" :
allow / :
String REGEX = "[a-zA-Z0-9._\\-\\[\\]/]*";
allow \ :
String REGEX = "[a-zA-Z0-9._\\-\\[\\]\\\\]*";
allow / and \ :
String REGEX = "[a-zA-Z0-9._\\-\\[\\]/\\\\]*";
You need to escape both brackets, not just the left bracket. The hyphen needs escaping too, since _-\[ would otherwise be parsed as a range:
String REGEX = "[a-zA-Z0-9._\\-\\[\\]]*";
String username = "dsfsdf_12";
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher(username);
if (matcher.matches()) {
System.out.println("matched");
} else {
System.out.println("NOT matched");
}
String REGEX = "[a-zA-Z0-9._\\-\\[\\]\\\\]*";
The four backslashes at the end are what allow you to match the \ character.
If you want to test any regexes out, there's a great site online called http://www.regextester.com/ It will allow you to play with regexes so you can test them.
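Escape the closing bracket as well (and the hyphen, so that it is not read as a range operator):
String REGEX = "[a-zA-Z0-9._\\-\\[\\]]*";
You have to escape the ] in your character class, and keep the hyphen from forming a range by moving it to the end, as shown below:
[a-zA-Z0-9._\\[\\]-]*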
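The role of the hyphen is easy to check directly. The sketch below uses hypothetical names (UsernameCheck, isValid, compiles). It validates a few inputs with the escaped character class and also shows that a class where the hyphen sits between _ and \[ is rejected by Pattern.compile, because _-\[ is parsed as a range whose end comes before its start ("Illegal character range").

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class for the character-class question.
class UsernameCheck {
    // Letters, digits, dot, underscore, hyphen and square brackets only.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9._\\-\\[\\]]*");

    static boolean isValid(String s) {
        return ALLOWED.matcher(s).matches();
    }

    // Reports whether a regex compiles at all.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("dsfsdf_12"));                // true
        System.out.println(isValid("user[1]-a.b"));              // true
        System.out.println(isValid("bad name!"));                // false
        // Unescaped hyphen between '_' and '\[' forms an invalid range.
        System.out.println(compiles("[a-zA-Z0-9._-\\[\\]]*"));   // false
        // A hyphen placed last in the class is taken literally.
        System.out.println(compiles("[a-zA-Z0-9._\\[\\]-]*"));   // true
    }
}
```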

How to provide regular expression for matching $$

I have String str = "$$\\frac{6}{8}$$"; and I want to match strings starting with '$$' and ending with '$$'.
How to write the regular expression for this?
Try using the regex:
^\$\$.*\$\$$
which in Java will be:
^\\$\\$.*\\$\\$$
A $ is a regex metacharacter used as end anchor. To mean a literal $ you need to escape it with a backslash \.
In Java \ is the escape character in a String and also in the regular expression. So to make a \ reach the regex engine you need to have \\ in the String.
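As a quick check of the escaped pattern, here is a minimal sketch. The class name DisplayMath and the method isWrapped are assumptions made for the example. It also shows that the same pattern without backslashes does not match the sample string, since every unescaped $ acts as an end anchor.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the $$...$$ question.
class DisplayMath {
    // Literal "$$" at the start, anything in between, literal "$$" at the end.
    private static final Pattern WRAPPED = Pattern.compile("^\\$\\$.*\\$\\$$");

    static boolean isWrapped(String s) {
        return WRAPPED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String str = "$$\\frac{6}{8}$$";
        System.out.println(isWrapped(str));                    // true
        System.out.println(isWrapped("x $$y$$"));              // false
        // Without escaping, each $ is an end-of-input anchor, so nothing matches here.
        System.out.println(Pattern.matches("^$$.*$$$", str));  // false
    }
}
```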
Use this regex string:
"^\\$\\$.*\\$\\$$"
The ^ anchors the expression to the start of the string being matched, and the last $ anchors it to the end. All other $ characters are escaped, so they are taken literally.
You may want something like this:
final String str = "$$\\frac{6}{8}$$";
final String latex = "A display math formula " + str + " and once again " + str + " and another one " + "$$42.$$";
final Pattern pattern = Pattern.compile("\\$\\$([^$]|\\$[^$])+\\$\\$");
final Matcher m = pattern.matcher(latex);
while (m.find()) {
System.out.println(m.group());
}
