Why is there a PatternSyntaxException for the following program? - java

I am getting a PatternSyntaxException for the following program. I have escaped the backslashes by using "\\", but there is still an exception saying:
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 1
\left(
^
Here is the code:
String[] paren = {"\\big(","\\Big(","\\bigg(","\\Bigg(","\\left("};
for(String x : paren){
if(line.contains(x))
line=line.replaceAll(x, "("); //error on this line
}
Thanks.

\l is an invalid escape sequence and you have unescaped (.
Note that if you want to match a literal backslash, you need to double escape it, and then escape those again because it all resides inside a string literal. That is why "\\l" is being parsed as the regex pattern \l (which is an invalid escape sequence). And "\\b" and "\\B" are parsed as the escape sequences \b and \B which are word- and non-word boundaries.
Assuming you would like to match the literal backslash, try this instead:
{"\\\\big\\(","\\\\Big\\(","\\\\bigg\\(","\\\\Bigg\\(","\\\\left\\("};
but then, your contains(...) call won't work anymore!
Or perhaps better/safer, let Pattern quote/escape your input properly:
String[] paren = {"\\big(","\\Big(","\\bigg(","\\Bigg(","\\left("};
for(String x : paren){
if(line.contains(x)) {
line = line.replaceAll(Pattern.quote(x), "(");
}
}

If your goal is to replace each of the literals "\\big(", "\\Big(", "\\bigg(", "\\Bigg(", "\\left(", then avoid replaceAll, because it treats its first argument as a regex. The strings you want to replace contain regex metacharacters such as ( and anchors such as \\b and \\B, so even if no exception were thrown you would not get the result you want.
Instead use replace (without the All suffix), which treats its arguments as literal text, so you avoid problems like the unescaped (.
So try with
String[] paren = {"\\big(","\\Big(","\\bigg(","\\Bigg(","\\left("};
for(String x : paren){
if(line.contains(x))
line=line.replace(x, "(");
}
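The two fixes suggested above (String.replace and replaceAll with Pattern.quote) can be compared side by side. This is only a minimal sketch: the class name LatexParenDemo, the method names, and the sample input line are made up for illustration, and only the delimiter array comes from the question.

```java
import java.util.regex.Pattern;

final class LatexParenDemo {
    // Delimiters from the question; the Java literal "\\big(" is the text \big(
    private static final String[] PAREN = {"\\big(", "\\Big(", "\\bigg(", "\\Bigg(", "\\left("};

    // Literal replacement: String.replace never interprets its arguments as regex.
    static String withReplace(String line) {
        for (String x : PAREN) {
            line = line.replace(x, "(");
        }
        return line;
    }

    // Regex replacement, made safe by quoting each delimiter before it is compiled.
    static String withQuotedReplaceAll(String line) {
        for (String x : PAREN) {
            line = line.replaceAll(Pattern.quote(x), "(");
        }
        return line;
    }

    public static void main(String[] args) {
        String line = "f\\left(x) + \\Big(y) + \\bigg(z)"; // hypothetical input
        System.out.println(withReplace(line));          // f(x) + (y) + (z)
        System.out.println(withQuotedReplaceAll(line)); // f(x) + (y) + (z)
    }
}
```

Both variants skip the contains(...) check, since replacing a string that is not present simply leaves the line unchanged.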

Related

How do I properly create a regex for String.matches() with escape characters?

I am trying to check if a string of length one is any of the following characters: "[", "\", "^", "_", single back-tick "`", or "]".
Right now I am trying to accomplish this with the following if statement:
if (character.matches("[[\\]^_`]")){
isValid = false;
}
When I run my program I get the following error for the if statement:
java.util.regex.PatternSyntaxException: null (in java.util.regex.Pattern)
What is the correct syntax for a regex with escape characters?
Your list has four characters that need special attention:
^ is the inversion character. It must not be the first character in a character class, or it must be escaped.
\ is the escape character. It must be escaped for direct use.
[ starts a character class, so it must be escaped.
] ends a character class, so it must be escaped.
Here is the "raw" regex:
[\[\]_`\\^]
Since you represent your regex as a Java string literal, all backslashes must be additionally escaped for the Java compiler:
if (character.matches("[\\[\\]_`\\\\^]")){
isValid = false;
}
You need to escape [, ] and \. The [ and ] must be escaped so that the pattern compiler does not treat them as character class delimiters. The backslash needs two levels of escaping: "\\" in a string literal already becomes a single backslash, so an escaped backslash in the pattern takes four consecutive backslashes in the source.
So the resulting regex should be
"[\\[\\]\\\\^_`]"
(Test it on RegexPlanet - click on the "Java" button to test).
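As a quick check of the escaped character class, here is a minimal sketch. The class name BracketClassDemo and the method isSpecial are hypothetical; the pattern is the one given just above.

```java
import java.util.regex.Pattern;

final class BracketClassDemo {
    // Regex [\[\]\\^_`] written as a Java literal: every regex backslash is doubled.
    private static final Pattern SPECIAL = Pattern.compile("[\\[\\]\\\\^_`]");

    // True only if the whole input is exactly one of: [ ] \ ^ _ `
    static boolean isSpecial(String character) {
        return SPECIAL.matcher(character).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSpecial("["));  // true
        System.out.println(isSpecial("\\")); // true (a single backslash)
        System.out.println(isSpecial("^"));  // true
        System.out.println(isSpecial("a"));  // false
    }
}
```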

Regex in java: error for str.replace("\s+", " ")

Why does Java (1.7) give me an error for the following line?
String str2 = str.replace("\s+", " ");
Error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
As far as I know "\s+" is a valid regex. Isn't it?
String.replace() will only replace literals, that's the first problem.
The second problem is that \s is not a valid escape sequence in a Java string literal, by definition.
Which means what you wanted was probably "\\s+".
But even then, .replace() won't take that as a regex. You have to use .replaceAll() instead:
str.replaceAll("\\s+", " ");
BUT there is another problem. You seem to be using it often... Therefore, use a Pattern instead:
private static final Pattern SPACES = Pattern.compile("\\s+");
// In code...
SPACES.matcher(input).replaceAll(" ");
FURTHER NOTES:
If what you want is to only replace the first occurrence, then use .replaceFirst(); String has it, and so does Matcher
When you call .replace{First,All}() on a String, a new Pattern is compiled on each and every invocation. Use a precompiled Pattern if you have to do repetitive matches!
It's a valid regular expression pattern, but \s is not a valid String literal escape sequence. Escape the \.
String str2 = str.replace("\\s+", " ");
As suggested, String#replace(CharSequence, CharSequence) doesn't consider the arguments you provide as regular expressions. So even if you got the program to compile, it wouldn't do what you seem to want it to do. Check out String#replaceAll(String, String).
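The difference between the literal replace and the regex-based replaceAll is easy to see in a small sketch. The class name WhitespaceDemo, the method collapse, and the sample input are assumptions made for illustration.

```java
import java.util.regex.Pattern;

final class WhitespaceDemo {
    // Compiled once and reused, as recommended for repeated matching.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // Collapses every run of whitespace into a single space.
    static String collapse(String input) {
        return SPACES.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        String str = "a  b\t\nc"; // hypothetical input
        // replace() looks for the literal three characters \s+, which are not present.
        System.out.println(str.replace("\\s+", " ").equals(str)); // true
        // replaceAll() treats the first argument as a regex.
        System.out.println(str.replaceAll("\\s+", " "));          // a b c
        System.out.println(collapse(str));                        // a b c
    }
}
```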

converting RegEx into my Java function [duplicate]

I'm having problems with Java regex. My regex is "\"730\"\s+{([^}]+)}" and it works on a regex-checking website, but I have trouble getting it to work in Java. This is my current code:
String patternString = '\"730\"\s+{([^}]+)}';
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(vdfContentsString);
boolean matches = matcher.matches();
Thanks for advice.
It says "Illegal escape character in character literal".
Single quotes (') delimit character literals and double quotes (") delimit strings; that is why you get the syntax error "Illegal escape character in character literal". Second, regex syntax itself uses the backslash, as in \s for whitespace. What may be confusing is that Java also uses \ for escaping inside string literals. That is why you need two backslashes (\\s in a Java string literal becomes \s in the resulting regular expression).
Then you need to take care of special characters in regular expressions: { and } delimit quantifiers ("repeat n times"); if you want them literally, escape them (\\{ and \\}).
So if you want to match a string like "730" {whatever}, use this regular expression:
"730"\s+\{([^}]+)\}
or in Java:
String patternString = "\"730\"\\s+\\{([^}]+)\\}";
Example:
String str = "\"730\" { \"installdir\" \"C:\\Program Files (x86)\\Steam\\steamapps\\common\\Counter-Strike Global Offensive\" \"HasAllLocalContent\" \"1\" \"UpToDate\" \"1\" }";
String patternString = "\"730\"\\s+\\{([^}]+)\\}";
System.out.println(str.matches(patternString)); // true
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition
Escape { and } as well because in Java Regex Pattern it has special meaning.
String patternString = "\"730\"\\s+\\{([^\\}]+)\\}";
EDIT
String#matches() tests the whole string. If you are looking for a substring of a longer string, use Matcher#find() instead and read the result from the groups captured by enclosing parts of the pattern in parentheses (...).
sample code:
String patternString = "(\"730\"\\s+\\{([^\\}]+)\\})";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(vdfContentsString);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
{ and } are metacharacters and need to be escaped with \\, hence \\{ .. \\}.
\ is the escape character, and \s, \w, \d etc. are predefined character classes; inside a Java string literal their backslash must itself be escaped, hence \\s+
Instead of [^\\}], I would suggest (.+?)}
This is working:
String patternString = "\\\"730\\\"\\s+\\{(.+?)\\}";
The above is the required Java string which gets parsed into the following regular expression: \"730\"\s+\{(.+?)\}, and then it can be used to match the input string. Tadan!
two levels of parsing!
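To pull the braces' contents out of a longer text, find() with a capturing group can be used as described above. The following is a rough sketch only: the class name VdfBlockDemo, the method body, and the sample text are hypothetical, and the pattern is the escaped form shown in the answers.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VdfBlockDemo {
    // Regex: "730"\s+\{([^}]+)\}  (quotes and backslashes escaped for the Java literal)
    private static final Pattern APP_730 = Pattern.compile("\"730\"\\s+\\{([^}]+)\\}");

    // Returns the trimmed text between the braces of the first "730" block, if any.
    static Optional<String> body(String text) {
        Matcher m = APP_730.matcher(text);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical input with another block in front of the one we want.
        String text = "\"440\" { \"a\" \"1\" } \"730\" { \"UpToDate\" \"1\" }";
        System.out.println(APP_730.matcher(text).matches()); // false: matches() needs the whole string
        System.out.println(body(text).orElse("not found"));  // "UpToDate" "1"
    }
}
```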

why does replaceAll throw an exception

I have a string from which I want to remove the brackets.
This is my string: "(name)"
and I want to get "name",
the same thing without the brackets.
I had String s = "(name)";
I wrote
s = s.replaceAll("(","");
s = s.replaceAll(")","");
and I get an exception for that:
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed group near index 1
(
How do I get rid of the brackets?
Parenthesis characters ( and ) delimit the bounds of a capturing group in a regular expression which is used as the first argument in replaceAll. The characters need to be escaped.
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
Better yet, you could simply place the parentheses in a character class to prevent the characters from being interpreted as metacharacters
s = s.replaceAll("[()]","");
s = s.replace("(", "").replace(")", "");
Regex isn't needed here.
If you wanted to use Regex (not sure why you would) you could do something like this:
s = s.replaceAll("\\(", "").replaceAll("\\)", "");
The problem was that ( and ) are meta characters so you need to escape them (assuming you want them to be interpreted as how they appear).
String#replaceAll takes regular expression as argument.
You are passing grouping metacharacters as the regular expression argument. That is why you are getting the error.
Meta-characters are used to group, divide, and perform special operations in patterns.
\ Escape the next meta-character (it becomes a normal/literal character)
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation (‘or’ statement)
() Grouping
[] Custom character class
So use
1. \\( instead of (
2. \\) instead of )
You'll need to escape the brackets like this:
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
You need two slashes since the regex processing engine would need to see a \( to process the bracket as a literal bracket (and not as part of the regex expression), and you'll need to escape the backslash so the regex engine would be able to see it as a backslash.
You need to escape the ( and the ) because they have a special meaning in regular expressions.
Do it like this:
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
s=s.replace("(","").replace(")","");
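The variants suggested in these answers can be put next to each other in a short sketch. The class name StripParensDemo and its method names are invented for illustration; the calls themselves are the ones from the answers.

```java
import java.util.regex.PatternSyntaxException;

final class StripParensDemo {
    // No regex involved: replace() works on literal text.
    static String literal(String s) {
        return s.replace("(", "").replace(")", "");
    }

    // Inside a character class, ( and ) lose their grouping meaning.
    static String charClass(String s) {
        return s.replaceAll("[()]", "");
    }

    // Escaped parentheses: \\( in Java source is the regex \( .
    static String escaped(String s) {
        return s.replaceAll("\\(", "").replaceAll("\\)", "");
    }

    // Reproduces the original failure: a bare ( is an unclosed group.
    static boolean bareParenIsRejected() {
        try {
            "(name)".replaceAll("(", "");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(literal("(name)"));     // name
        System.out.println(charClass("(name)"));   // name
        System.out.println(escaped("(name)"));     // name
        System.out.println(bareParenIsRejected()); // true
    }
}
```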

How to escape characters in a regular expression

When I use the following code I've got an error:
Matcher matcher = pattern.matcher("/Date\(\d+\)/");
The error is :
invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
I have also tried to change the value in the brackets to('/Date\(\d+\)/'); without any success.
How can i avoid this error?
You need to double-escape your \ character, like this: \\.
Otherwise your String is interpreted as if you were trying to escape (.
Same with the other round bracket and the d.
In fact it seems you are trying to initialize a Pattern here, while pattern.matcher references a text you want your Pattern to match.
Finally, note that in a Pattern, escaped characters require a double escape, as such:
\\(\\d+\\)
Also, as Rohit says, Patterns in Java do not need to be surrounded by forward slashes (/).
In fact if you initialize a Pattern like that, it will interpret your Pattern as starting and ending with literal forward slashes.
Here's a small example of what you probably want to do:
// your input text
String myText = "Date(123)";
// your Pattern initialization
Pattern p = Pattern.compile("Date\\(\\d+\\)");
// your matcher initialization
Matcher m = p.matcher(myText);
// printing the output of the match...
System.out.println(m.find());
Output:
true
Your regex is correct by itself, but in Java, the backslash character itself needs to be escaped.
Thus, this regex:
/Date\(\d+\)/
Must turn into this:
/Date\\(\\d+\\)/
One backslash is for escaping the parenthesis or d. The other one is for escaping the backslash itself.
The error message you are getting arises because Java thinks you're trying to use \( as a single escape character, like \n, or any of the other examples. However, \( is not a valid escape sequence, and so Java complains.
In addition, the logic of your code is probably incorrect. The argument to matcher should be the text to search (for example, "/Date(234)/Date(6578)/"), whereas the variable pattern should contain the pattern itself. Try this:
String textToMatch = "/Date(234)/Date(6578)/";
Pattern pattern = Pattern.compile("/Date\\(\\d+\\)/");
Matcher matcher = pattern.matcher(textToMatch);
Finally, the regex character class \d means "one single digit." If you are trying to match the literal text \d, the regex must be \\d, which is written "\\\\d" in a Java string literal. However, in that case, your regex would be a constant, and you could use textToMatch.indexOf and textToMatch.contains more easily.
To escape regex in java, you can also use Pattern.quote()
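A minimal sketch of both approaches follows: a hand-escaped pattern for /Date(digits)/ and Pattern.quote() for text that should be matched literally. The class name QuoteDemo, its method names, and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    // Hand-escaped regex /Date\(\d+\)/ ; the slashes are matched literally.
    private static final Pattern DATE = Pattern.compile("/Date\\(\\d+\\)/");

    static boolean containsDate(String text) {
        return DATE.matcher(text).find();
    }

    // Pattern.quote() turns arbitrary text into a regex that matches it literally.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDate("/Date(234)/Date(6578)/")); // true
        System.out.println(containsDate("/Date(abc)/"));            // false
        System.out.println(containsLiteral("/Date(234)/", "Date(")); // true
        System.out.println(containsLiteral("ab", "."));              // false: the dot is literal
    }
}
```

For a fixed literal such as "Date(", text.contains("Date(") gives the same answer without any regex; Pattern.quote() is mainly useful when the literal has to be embedded in a larger pattern.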
