why does replaceAll throw an exception - java

I have a string from which I want to remove the brackets.
This is my string: "(name)"
and I want to get "name",
the same thing without the brackets.
I had String s = "(name)";
and I wrote
s = s.replaceAll("(","");
s = s.replaceAll(")","");
and I get an exception for that:
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed group near index 1
(
How do I get rid of the brackets?

Parenthesis characters ( and ) delimit the bounds of a capturing group in a regular expression which is used as the first argument in replaceAll. The characters need to be escaped.
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
Better yet, you could simply place the parentheses in a character class to prevent the characters from being interpreted as meta-characters:
s = s.replaceAll("[()]","");
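A minimal sketch of the variants above, for anyone who wants to try them side by side. The class and method names (ParenStripDemo, stripEscaped, stripWithCharClass, unescapedThrows) are made up for this illustration; only String.replaceAll and PatternSyntaxException come from the JDK.

```java
import java.util.regex.PatternSyntaxException;

class ParenStripDemo {

    // Escape each parenthesis so the regex engine treats it as a literal character.
    static String stripEscaped(String s) {
        return s.replaceAll("\\(", "").replaceAll("\\)", "");
    }

    // Inside a character class, ( and ) lose their grouping meaning.
    static String stripWithCharClass(String s) {
        return s.replaceAll("[()]", "");
    }

    // Returns true if the unescaped pattern "(" is rejected when it is compiled.
    static boolean unescapedThrows(String s) {
        try {
            s.replaceAll("(", "");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(stripEscaped("(name)"));       // name
        System.out.println(stripWithCharClass("(name)")); // name
        System.out.println(unescapedThrows("(name)"));    // true
    }
}
```

The character class version does the work in one call, which is probably the main reason to prefer it when regex is used at all.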

s = s.replace("(", "").replace(")", "");
Regex isn't needed here.
If you wanted to use Regex (not sure why you would) you could do something like this:
s = s.replaceAll("\\(", "").replaceAll("\\)", "");
The problem was that ( and ) are meta characters so you need to escape them (assuming you want them to be interpreted as how they appear).

String#replaceAll takes a regular expression as its first argument.
You are passing grouping meta-characters as that regular expression, which is why you get the error.
Meta-characters are used to group, divide, and perform special operations in patterns.
\ Escape the next meta-character (it becomes a normal/literal character)
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation (‘or’ statement)
() Grouping
[] Custom character class
So use
1. \\( instead of (
2. \\) instead of )

You'll need to escape the brackets like this:
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
You need two slashes since the regex processing engine would need to see a \( to process the bracket as a literal bracket (and not as part of the regex expression), and you'll need to escape the backslash so the regex engine would be able to see it as a backslash.
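The two levels of escaping can be checked directly. This is only a sketch; EscapeLevelsDemo and its method names are hypothetical, chosen for the illustration.

```java
import java.util.regex.Pattern;

class EscapeLevelsDemo {

    // The Java literal "\\(" is compiled to a two-character string: a backslash and (.
    static String regexSeenByEngine() {
        return "\\(";
    }

    // The regex engine reads those two characters as "a literal opening parenthesis".
    static boolean containsOpenParen(String input) {
        return Pattern.compile(regexSeenByEngine()).matcher(input).find();
    }

    public static void main(String[] args) {
        String regex = regexSeenByEngine();
        System.out.println(regex.length());              // 2
        System.out.println(regex);                       // \(
        System.out.println(containsOpenParen("(name)")); // true
    }
}
```

So the compiler consumes one backslash and the regex engine consumes the other.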

You need to escape the ( and the ) because they have a special meaning in a regular expression.
Do it like this:
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");

s=s.replace("(","").replace(")","");

Related

what is missing in my java regex?

I want to fetch
http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png
from
url(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)
I have tried this code:
String a = "";
Pattern pattern = Pattern.compile("url(.*)");
Matcher matcher = pattern.matcher(imgpath);
if (matcher.find()) {
a = (matcher.group(1));
}
return a;
but a == (http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_639_o_4746_precious_image_1419867529.png)
how can I fine tune it?
Why use a regular expression to begin with?
Given
final String s = "url(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)";
If the string is always the same format a simple substring(4,s.length()-1) would be better.
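A rough sketch of that substring approach. The startsWith/endsWith guard is an addition of this example, not part of the suggestion above, and the names UrlSubstringDemo and stripUrlWrapper are hypothetical.

```java
class UrlSubstringDemo {

    // Assumes the input always has the exact form url(...).
    static String stripUrlWrapper(String s) {
        // Guard added for this sketch: fail loudly when the assumption does not hold.
        if (!s.startsWith("url(") || !s.endsWith(")")) {
            throw new IllegalArgumentException("not of the form url(...): " + s);
        }
        // "url(" is 4 characters long; the final ")" is dropped.
        return s.substring(4, s.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(stripUrlWrapper("url(http://example.com/a.png)"));
    }
}
```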
That said, if you insist on a regular expression:
You have to escape the ( as \(, and because the backslash itself must be escaped in a Java string literal, it becomes \\(. The same goes for the ).
Then you can get the grouping with url\\((.+)\\).
Learn to use RegEx101.com before coming here, it will point out errors like this immediately.
As you already seem to know, ( and ) represent groups, which means that in the regex
url(.*)
(.*) will place everything after url in group 1, which in case of
url(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)
will be
(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)
If you want to exclude ( and ) from the match you need to add their literals to the regex, which means you need to escape them. There are several ways to do it, like adding \ before each of them, or surrounding them with [ ].
Another problem with your regex is that .* finds the longest possible match, and since . represents any character (except line separators) it can also include ( and ). To solve this problem you can make the * quantifier reluctant by adding ? after it, so your final regex can be written as the string
"url\\((.*?)\\)"
---------------
url
\\( - ( literal
(.*?) - group 1
\\) - ) literal
or, instead of ., you can use a character class that accepts all characters except ), like
"url\\(([^)]*)\\)"
Try this regex:
url\((.*?)\)
The outermost parentheses are escaped so they will be matched literally. The inner parentheses are for capturing a group. The question mark after the .* is to make the match lazy, so the first closing parenthesis found will end the group.
Note that to use this regex in Java, you'll have to additionally escape the backslashes in order to express the above regex as a string literal:
String regex = "url\\((.*?)\\)";
You need to escape the () to match the parentheses in the string, and then add another set of () around the part you want to pull out in group 1, the actual url. I also changed the part inside the parentheses to [^)]*, which will match everything until it finds a ). See below:
url\(([^)]*)\)
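To see how the variants from these answers differ, here is a small sketch. The helper firstGroup and the shortened input with two url(...) parts are made up for the illustration; the difference between greedy and reluctant matching only shows up when the input contains more than one closing parenthesis.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlGroupDemo {

    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "url(a.png) and url(b.png)";
        // Unescaped parentheses form the group itself, so they end up inside it.
        System.out.println(firstGroup("url(.*)", input));          // (a.png) and url(b.png)
        // Escaped but greedy: runs to the last closing parenthesis.
        System.out.println(firstGroup("url\\((.*)\\)", input));    // a.png) and url(b.png
        // Reluctant quantifier: stops at the first closing parenthesis.
        System.out.println(firstGroup("url\\((.*?)\\)", input));   // a.png
        // Negated character class: cannot cross a closing parenthesis at all.
        System.out.println(firstGroup("url\\(([^)]*)\\)", input)); // a.png
    }
}
```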

Why is there a PatternSyntaxException for the following program?

I am getting a PatternSyntaxException for the following program. I have escaped the backslashes by using "\\", but there is still an exception saying:
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 1
\left(
^
Here is the code:
String[] paren = {"\\big(","\\Big(","\\bigg(","\\Bigg(","\\left("};
for(String x : paren){
if(line.contains(x))
line=line.replaceAll(x, "("); //error on this line
}
Thanks.
\l is an invalid escape sequence and you have unescaped (.
Note that if you want to match a literal backslash, you need to double escape it, and then escape those again because it all resides inside a string literal. That is why "\\l" is being parsed as the regex pattern \l (which is an invalid escape sequence). And "\\b" and "\\B" are parsed as the escape sequences \b and \B which are word- and non-word boundaries.
Assuming you would like to match the literal backslash, try this instead:
{"\\\\big\\(","\\\\Big\\(","\\\\bigg\\(","\\\\Bigg\\(","\\\\left\\("};
but then, your contains(...) call won't work anymore!
Or perhaps better/safer, let Pattern quote/escape your input properly:
String[] paren = {"\\big(","\\Big(","\\bigg(","\\Bigg(","\\left("};
for(String x : paren){
if(line.contains(x)) {
line = line.replaceAll(Pattern.quote(x), "(");
}
}
If your goal is to replace each of the literals "\\big(", "\\Big(", "\\bigg(", "\\Bigg(", "\\left(", then avoid replaceAll, because its first argument is a regex describing the text to be replaced. In your case the strings you want to replace contain regex metacharacters like ( or anchors like \\b and \\B, so even if this did not throw an exception you would not get the results you wanted.
Instead, use the replace method (without the All suffix), which automatically treats all regex metacharacters as literals, so you avoid problems like an unescaped (.
So try with
String[] paren = {"\\big(","\\Big(","\\bigg(","\\Bigg(","\\left("};
for(String x : paren){
if(line.contains(x))
line=line.replace(x, "(");
}
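The two suggestions (Pattern.quote with replaceAll, and plain replace) should give the same result. A small sketch to compare them, with hypothetical names LatexParenDemo, withReplace and withQuote; the contains check is left out here because both methods simply return the input unchanged when nothing matches.

```java
import java.util.regex.Pattern;

class LatexParenDemo {

    static final String[] PAREN = {"\\big(", "\\Big(", "\\bigg(", "\\Bigg(", "\\left("};

    // Literal replacement: String.replace never interprets the search text as a regex.
    static String withReplace(String line) {
        for (String x : PAREN) {
            line = line.replace(x, "(");
        }
        return line;
    }

    // Regex replacement, with each search string quoted so it matches literally.
    static String withQuote(String line) {
        for (String x : PAREN) {
            line = line.replaceAll(Pattern.quote(x), "(");
        }
        return line;
    }

    public static void main(String[] args) {
        String line = "\\left(x + \\bigg(y)"; // the text \left(x + \bigg(y)
        System.out.println(withReplace(line)); // (x + (y)
        System.out.println(withQuote(line));   // (x + (y)
    }
}
```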

How to escape characters in a regular expression

When I use the following code I've got an error:
Matcher matcher = pattern.matcher("/Date\(\d+\)/");
The error is :
invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
I have also tried to change the value in the brackets to ('/Date\(\d+\)/'); without any success.
How can I avoid this error?
You need to double-escape your \ character, like this: \\.
Otherwise your String is interpreted as if you were trying to escape (.
Same with the other round bracket and the d.
In fact it seems you are trying to initialize a Pattern here, while pattern.matcher references a text you want your Pattern to match.
Finally, note that in a Pattern, escaped characters require a double escape, as such:
\\(\\d+\\)
Also, as Rohit says, Patterns in Java do not need to be surrounded by forward slashes (/).
In fact if you initialize a Pattern like that, it will interpret your Pattern as starting and ending with literal forward slashes.
Here's a small example of what you probably want to do:
// your input text
String myText = "Date(123)";
// your Pattern initialization
Pattern p = Pattern.compile("Date\\(\\d+\\)");
// your matcher initialization
Matcher m = p.matcher(myText);
// printing the output of the match...
System.out.println(m.find());
Output:
true
Your regex is correct by itself, but in Java, the backslash character itself needs to be escaped.
Thus, this regex:
/Date\(\d+\)/
Must turn into this:
/Date\\(\\d+\\)/
One backslash is for escaping the parenthesis or d. The other one is for escaping the backslash itself.
The error message you are getting arises because Java thinks you're trying to use \( as a single escape character, like \n, or any of the other examples. However, \( is not a valid escape sequence, and so Java complains.
In addition, the logic of your code is probably incorrect. The argument to matcher should be the text to search (for example, "/Date(234)/Date(6578)/"), whereas the variable pattern should contain the pattern itself. Try this:
String textToMatch = "/Date(234)/Date(6578)/";
Pattern pattern = Pattern.compile("/Date\\(\\d+\\)/");
Matcher matcher = pattern.matcher(textToMatch);
Finally, the regex character class \d means "one single digit." If you are trying to refer to the literal text \d, you would have to use \\\\d in the Java string to escape this. However, in that case, your regex would be a constant, and you could use textToMatch.indexOf and textToMatch.contains more easily.
To escape regex in java, you can also use Pattern.quote()
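A possible way to use Pattern.quote for this case is to quote only the literal parts and leave \d+ as real regex syntax. This is a sketch under that assumption; DateQuoteDemo, DATE and extractDigits are names invented for the example, and the capturing group around \d+ is added so the digits can be read back.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateQuoteDemo {

    // Literal parts are quoted; only (\d+) is interpreted as regex syntax.
    static final Pattern DATE =
            Pattern.compile(Pattern.quote("/Date(") + "(\\d+)" + Pattern.quote(")/"));

    // Returns the digits of the first /Date(...)/ occurrence, or null if there is none.
    static String extractDigits(String text) {
        Matcher m = DATE.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractDigits("/Date(234)/Date(6578)/")); // 234
    }
}
```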

how to replace a string in Java

I have a question about using the replaceAll() function.
If a string contains a pair of parentheses "()", I replace it with "":
while(S.contains("()"))
{
S = S.replaceAll("\\(\\)", "");
}
But why does replaceAll("\\(\\)", ""); need to use \\(\\)?
Because as noted by the javadocs, the argument is a regular expression.
Parentheses in a regular expression are used for grouping. If you're going to match parentheses as part of a regular expression they must be escaped.
It's because replaceAll expects a regex, and ( and ) have a special meaning in regular expressions and need to be escaped.
An alternative is to use replace, which counter-intuitively does the same thing as replaceAll but takes a string as an input instead of a regex:
S = S.replace("()", "");
First, your code can be replaced with:
S = S.replace("()", "");
without the while loop, as long as the pairs are never nested; for input like "(())", removing the inner pair creates a new one, and only the loop catches it.
Second, the first argument to .replaceAll() is a regular expression, and parens are special tokens in regular expressions (they are grouping operators).
And also, .replaceAll() replaces all occurrences in a single pass. Starting with Java 6 you could also have written:
S = S.replaceAll("\\Q()\\E", "");
It is left as an exercise to the reader as to what \Q and \E are: http://regularexpressions.info gives the answer ;)
S = S.replaceAll("\\(\\)", ""): the argument is a regular expression.
Because the method's first argument is a regular expression, and ( and ) are special characters in regex, you need to escape them.
Because parentheses are special characters in regexps, so you need to escape them. To get a literal \ in a string in Java you need to escape it like so : \\.
So () => \(\) => \\(\\)
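A small sketch that puts the variants from this thread next to each other. The class and method names are hypothetical. It also shows what the loop from the question adds: a single pass removes the pairs that exist at that moment, while the loop keeps going when a removal exposes a new pair.

```java
class EmptyPairsDemo {

    // One pass with a literal search string: removes every "()" currently in the input.
    static String removeOnce(String s) {
        return s.replace("()", "");
    }

    // The same single pass with a regex, using escaped parentheses.
    static String removeOnceEscaped(String s) {
        return s.replaceAll("\\(\\)", "");
    }

    // The same single pass with a regex, quoting the literal with \Q...\E.
    static String removeOnceQuoted(String s) {
        return s.replaceAll("\\Q()\\E", "");
    }

    // Repeats until no "()" is left, so pairs exposed by an earlier pass are removed too.
    static String removeRepeatedly(String s) {
        while (s.contains("()")) {
            s = s.replace("()", "");
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(removeOnce("a()b()"));                     // ab
        System.out.println(removeOnce("(())"));                       // ()
        System.out.println(removeRepeatedly("(())").isEmpty());       // true
    }
}
```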

Regular Expression for matching parentheses

What is the regular expression for matching '(' in a string?
Following is the scenario :
I have a string
str = "abc(efg)";
I want to split the string at '(' using a regular expression. For that I am using
Arrays.asList(Pattern.compile("/(").split(str))
But i am getting the following exception.
java.util.regex.PatternSyntaxException: Unclosed group near index 2
/(
Escaping '(' doesn't seem to work.
Two options:
Firstly, you can escape it using a backslash -- \(
Alternatively, since it's a single character, you can put it in a character class, where it doesn't need to be escaped -- [(]
The solution is a regex pattern that matches opening and closing parentheses:
String str = "Your(String)";
// the parameter of split is a pattern that matches an opening or a closing parenthesis:
// a character class "[ ]" containing both, each escaped with "\\" -> "[\\(\\)]"
String[] parts = str.split("[\\(\\)]");
for (String part : parts) {
// prints "Your" on the first iteration and "String" on the second
System.out.println(part);
}
Writing in Java 8's style, this can be solved in this way:
Arrays.asList("Your(String)".split("[\\(\\)]"))
.forEach(System.out::println);
I hope it is clear.
You can escape any meta-character by using a backslash, so you can match ( with the pattern
\(.
Many languages come with a built-in escaping function, for example, .Net's Regex.Escape or Java's Pattern.quote
Some flavors support \Q and \E, with literal text between them.
Some flavors (VIM, for example) match ( literally, and require \( for capturing groups.
See also: Regular Expression Basic Syntax Reference
To match any special character literally, escape it with '\'.
So, for matching parentheses - /\(/
Because ( is special in regex, you should escape it \( when matching. However, depending on what language you are using, you can easily match ( with string methods like index() or other methods that enable you to find at what position the ( is in. Sometimes, there's no need to use regex.
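Applied to the string from the question, the options mentioned in these answers might look like this in Java. It is a sketch only; SplitParenDemo and its method names are made up for the example. Note that the question used a forward slash ("/("), which is not an escape character in Java regex.

```java
import java.util.Arrays;
import java.util.List;

class SplitParenDemo {

    // Escaped parenthesis: the regex \( written as the Java literal "\\(".
    static List<String> splitEscaped(String s) {
        return Arrays.asList(s.split("\\("));
    }

    // Character class: ( needs no escaping inside [ ].
    static List<String> splitCharClass(String s) {
        return Arrays.asList(s.split("[(]"));
    }

    // No regex at all: find the position with a plain string method.
    static int indexOfParen(String s) {
        return s.indexOf('(');
    }

    public static void main(String[] args) {
        String str = "abc(efg)";
        System.out.println(splitEscaped(str));   // [abc, efg)]
        System.out.println(splitCharClass(str)); // [abc, efg)]
        System.out.println(indexOfParen(str));   // 3
    }
}
```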
