Removing literal character in regex - java

I have the following string
\Qpipe,name=office1\E
And I am using a simplified regex library that doesn't support the \Q and \E.
I tried removing them
s.replaceAll("\\Q", "").replaceAll("\\E", "")
However, I get the error Caused by: java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 1
\E
^
Any ideas?

\ is the special escape character in both Java string literals and the regex engine. To pass a literal \ to the regex engine, you need \\\\ in the Java string. So try:
s.replaceAll("\\\\Q", "").replaceAll("\\\\E", "")
Alternatively, a simpler way is to use the replace method, which takes a literal string and not a regex:
s.replace("\\Q", "").replace("\\E", "")
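The two approaches can be compared side by side. This is a minimal sketch, assuming the input is the string from the question; the class and method names are made up for illustration.

```java
public class StripQuoteMarkers {
    // Regex-based: a literal backslash in the regex needs "\\\\" in Java source.
    static String stripWithReplaceAll(String s) {
        return s.replaceAll("\\\\Q", "").replaceAll("\\\\E", "");
    }

    // Literal-based: String.replace does no regex parsing at all.
    static String stripWithReplace(String s) {
        return s.replace("\\Q", "").replace("\\E", "");
    }

    public static void main(String[] args) {
        String s = "\\Qpipe,name=office1\\E"; // the text \Qpipe,name=office1\E
        System.out.println(stripWithReplaceAll(s));
        System.out.println(stripWithReplace(s));
    }
}
```

Both calls should give the same result; the replace version is easier to read because only the Java string escaping is involved.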

Use the Pattern.quote() function to escape special characters in a regex, for example:
s.replaceAll(Pattern.quote("\\Q"), "")

replaceAll takes a regular expression string. Instead, just use replace which takes a literal string. So myRegexString.replace("\\Q", "").replace("\\E", "").
But that still leaves you with the problem of quoting special regex characters for your simplified regex library.

String.replaceAll() takes a regular expression as parameter, so you need to escape your backslash twice:
s.replaceAll("\\\\Q", "").replaceAll("\\\\E", "");

You can also use the approach below. I used it because I was matching and replacing wrapped text, and the \Q and \E would otherwise stay in the pattern. With Pattern.LITERAL they don't.
final int flags = Pattern.LITERAL;
String regex = "My regex";
Pattern pattern = Pattern.compile(regex, flags);
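As a rough illustration of the Pattern.LITERAL flag, here is a small sketch. The helper name replaceLiteral and the sample strings are assumptions made for this example, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralFlagDemo {
    // Replaces every literal occurrence of needle; metacharacters in it are not interpreted.
    static String replaceLiteral(String input, String needle, String replacement) {
        Pattern pattern = Pattern.compile(needle, Pattern.LITERAL);
        // quoteReplacement keeps '$' and '\' in the replacement from being treated specially.
        return pattern.matcher(input).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "(a+b)" would be a group with a quantifier as a regex; here it is plain text.
        System.out.println(replaceLiteral("price (a+b) total", "(a+b)", "X"));
    }
}
```

With the flag set, no \Q or \E markers are needed in the pattern string at all.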

Related

Can't use Regex in Java because of escape sequence error, how to remove the error

I have this regex :
^(([A-Z]:)|((\\|/){1,2}\w+)\$?)((\\|/)(\w[\w ]*.*))+\.([txt|exe]+)$
but every time I assign it to a string, Eclipse reports invalid escape sequences. I have inserted a backslash, but it gives me the same error.
How do I assign the above expression to a string in Java?
Replace every \ with \\. Java has no language-level support for regular expressions, so you need "\\" in a string literal to get a single backslash past the compiler into the String. If the regular expression itself contains an escaped backslash (\\), you need "\\\\".
final String re = "^(([A-Z]:)|((\\\\|/){1,2}\\w+)\\$?)((\\\\|/)(\\w[\\w ]*.*))+\\.([txt|exe]+)$";
Try the following:
String regex = "^(([A-Z]:)|((\\\\|/){1,2}\\w+)\\$?)((\\\\|/)(\\w[\\w ]*.*))+\\.([txt|exe]+)$";
The backslash character itself needs to be escaped as well, so you would end up with four \ characters.
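To check that the doubled backslashes really produce the intended regex, one option is to compile the string and print the pattern back. This is a sketch only; the class name, the helper looksLikePath and the sample path are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class PathRegexDemo {
    // Same regex as above; every backslash of the raw regex is doubled in the literal.
    static final Pattern PATH = Pattern.compile(
        "^(([A-Z]:)|((\\\\|/){1,2}\\w+)\\$?)((\\\\|/)(\\w[\\w ]*.*))+\\.([txt|exe]+)$");

    static boolean looksLikePath(String candidate) {
        return PATH.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // Printing the pattern shows the raw regex with single backslashes.
        System.out.println(PATH.pattern());
        // The Java literal "C:\\dir\\file.txt" is the text C:\dir\file.txt
        System.out.println(looksLikePath("C:\\dir\\file.txt"));
    }
}
```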

How to escape characters in a regular expression

When I use the following code I've got an error:
Matcher matcher = pattern.matcher("/Date\(\d+\)/");
The error is :
invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
I have also tried changing the value in the brackets to ('/Date\(\d+\)/'), without any success.
How can I avoid this error?
You need to double-escape your \ character, like this: \\.
Otherwise your String is interpreted as if you were trying to escape (.
Same with the other round bracket and the d.
In fact it seems you are trying to initialize a Pattern here, while pattern.matcher references a text you want your Pattern to match.
Finally, note that in a Pattern, escaped characters require a double escape, as such:
\\(\\d+\\)
Also, as Rohit says, Patterns in Java do not need to be surrounded by forward slashes (/).
In fact if you initialize a Pattern like that, it will interpret your Pattern as starting and ending with literal forward slashes.
Here's a small example of what you probably want to do:
// your input text
String myText = "Date(123)";
// your Pattern initialization
Pattern p = Pattern.compile("Date\\(\\d+\\)");
// your matcher initialization
Matcher m = p.matcher(myText);
// printing the output of the match...
System.out.println(m.find());
Output:
true
Your regex is correct by itself, but in Java, the backslash character itself needs to be escaped.
Thus, this regex:
/Date\(\d+\)/
Must turn into this:
/Date\\(\\d+\\)/
One backslash is for escaping the parenthesis or d. The other one is for escaping the backslash itself.
The error message you are getting arises because Java thinks you're trying to use \( as a single escape character, like \n, or any of the other examples. However, \( is not a valid escape sequence, and so Java complains.
In addition, the logic of your code is probably incorrect. The argument to matcher should be the text to search (for example, "/Date(234)/Date(6578)/"), whereas the variable pattern should contain the pattern itself. Try this:
String textToMatch = "/Date(234)/Date(6578)/";
Pattern pattern = Pattern.compile("/Date\\(\\d+\\)/");
Matcher matcher = pattern.matcher(textToMatch);
Finally, the regex character class \d means "one single digit." If you are trying to match the literal text \d instead, the regex is \\d, which you would write as "\\\\d" in a Java string. However, in that case your regex would be a constant, and you could use textToMatch.indexOf or textToMatch.contains more easily.
To escape a literal string for use in a Java regex, you can also use Pattern.quote().
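A short sketch of Pattern.quote() in use, assuming you want to find a fixed piece of text that happens to contain regex metacharacters. The helper name containsLiteral is hypothetical.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Builds a pattern that matches the literal text exactly, whatever metacharacters it contains.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        // quote() wraps the text in \Q ... \E so the unbalanced "(" is harmless.
        System.out.println(Pattern.quote("Date("));
        System.out.println(containsLiteral("/Date(123)/", "Date("));
    }
}
```

Without the quoting, compiling "Date(" would throw a PatternSyntaxException because of the unclosed group.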

Android '\' special character

I have an Android application where I have to find out whether my user entered the special character '\' in a string. But I'm not having any success with the string.replaceAll() method, because Java treats the \ as escaping the closing " rather than letting it end the string. Does anyone have suggestions on how I can fix this?
Here is an example of how I tried to do this:
private void ReplaceSpecial(String text) {
    if (text.trim().length() > 0) {
        text = text.replaceAll("\", "%5C");
    }
}
It does not work because Java doesn't allow me. Any suggestions?
Try this. You have to escape the '\' character:
text = text.replaceAll("\\\\", "%5C");
Try
text = text.replaceAll("\\\\", "%5C");
replaceAll uses regex syntax, where \ is a special character, so you need to escape it. To do that you need to pass \\ to the regex engine, and to create a string representing the regex \\ you need to write it as "\\\\" (\ is also a special character in string literals, so each \ requires another escape).
To avoid this regex mess you can just use replace, which works on literals:
text = text.replace("\\", "%5C");
The first parameter to replaceAll is interpreted as a regular expression, so you actually need four backslashes:
text = text.replaceAll("\\\\", "%5C");
Four backslashes in a string literal mean two backslashes in the actual String, which in turn represent a regular expression that matches a single backslash character.
Alternatively, use replace instead of replaceAll, as recommended by Pshemo, which treats its first argument as a literal string instead of a regex.
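The two layers of escaping can be made visible with a small sketch. The class and method names below are made up for illustration; the replacement value "%5C" is the one from the question.

```java
public class BackslashEscapeDemo {
    // Regex version: the literal "\\\\" is the two-character regex \\ that matches one backslash.
    static String encodeWithReplaceAll(String text) {
        return text.replaceAll("\\\\", "%5C");
    }

    // Literal version: "\\" is a one-character string holding a single backslash.
    static String encodeWithReplace(String text) {
        return text.replace("\\", "%5C");
    }

    public static void main(String[] args) {
        String input = "a\\b"; // the three characters a, \, b
        System.out.println("\\\\".length()); // number of characters the regex engine receives
        System.out.println(encodeWithReplaceAll(input));
        System.out.println(encodeWithReplace(input));
    }
}
```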
text = text.replaceAll("\", "%5C");
Should be:
text = text.replaceAll("\\\\", "%5C");
Why?
Because the backslash is an escape character in Java strings. If you want to represent a real backslash, you should use a double \ (\\).
Now the first argument of replaceAll is a regular expression. So you need to escape this too! (Which will end up with 4 backslashes).
Alternatively you can use replace which doesn't expect a regex, so you can do:
text = text.replace("\\", "%5C");
First, since "\" is the escape character in Java, you need to use two backslashes to get one backslash. Second, since the replaceAll() method takes a regular expression as a parameter, you will need to escape THAT backslash as well. Thus you need to escape it by using
text = text.replaceAll("\\\\", "%5C");
I may be late, but here is one more option.
Add \\\\ to the following regex to allow \.
Sample regex:
private val specialCharacters = "-#%\\[\\}+'!/#$^?:;,\\(\"\\)~`.*=&\\{>\\]<_\\\\"
private val PATTERN_SPECIAL_CHARACTER = "^(?=.*[$specialCharacters]).{1,20}$"
Hope it helps.

how to replace a string in Java

I have a question about using replaceAll() function.
if a string has parentheses as a pair, replace it with "",
while(S.contains("()"))
{
S = S.replaceAll("\\(\\)", "");
}
But why does replaceAll("\\(\\)", "") need \\(\\)?
Because as noted by the javadocs, the argument is a regular expression.
Parentheses in a regular expression are used for grouping. If you're going to match parentheses as part of a regular expression, they must be escaped.
It's because replaceAll expects a regex, and ( and ) have a special meaning in regular expressions and need to be escaped.
An alternative is to use replace, which counter-intuitively does the same thing as replaceAll but takes a string as an input instead of a regex:
S = S.replace("()", "");
First, your code can be replaced with:
S = S.replace("()", "");
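without the while loop, unless you also need to remove nested pairs such as (()), where removing the inner pair creates a new ().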
Second, the first argument to .replaceAll() is a regular expression, and parens are special tokens in regular expressions (they are grouping operators).
Also, .replaceAll() replaces all occurrences, so a single pass does not need the while loop; the loop only matters for nested pairs. Starting with Java 6 you could also have written:
S = S.replaceAll("\\Q()\\E", "");
It is left as an exercise to the reader as to what \Q and \E are: http://regularexpressions.info gives the answer ;)
In S = S.replaceAll("\\(\\)", ""), the argument is a regular expression.
Because the method's first argument is a regular expression, and ( and ) are special characters in regex, you need to escape them.
Because parentheses are special characters in regexps, so you need to escape them. To get a literal \ in a string in Java you need to escape it like so : \\.
So () => \(\) => \\(\\)
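A small sketch of the difference between a single pass and the loop from the question, assuming the goal may include nested pairs. The class and method names are hypothetical.

```java
public class EmptyParensDemo {
    // One pass: removes every "()" currently in the string, using a literal match.
    static String removeOnce(String s) {
        return s.replace("()", "");
    }

    // Same single pass with a regex, where both parentheses must be escaped.
    static String removeOnceRegex(String s) {
        return s.replaceAll("\\(\\)", "");
    }

    // Repeated passes: also removes pairs that only become adjacent
    // after an inner pair has been removed, e.g. "(())".
    static String removeNested(String s) {
        while (s.contains("()")) {
            s = s.replace("()", "");
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(removeOnce("a()b(())"));
        System.out.println(removeOnceRegex("a()b(())"));
        System.out.println(removeNested("a()b(())"));
    }
}
```

For input without nesting, all three methods behave the same, so the loop is only worth keeping when nested pairs can occur.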

How to replace a plus character using Java's String.replaceAll method

What's the correct regex for a plus character (+) as the first argument (i.e. the string to replace) to Java's replaceAll method in the String class? I can't get the syntax right.
You need to escape the + for the regular expression, using \.
However, Java uses a String parameter to construct regular expressions, which uses \ for its own escape sequences. So you have to escape the \ itself:
"\\+"
When in doubt, let Java do the work for you:
myStr.replaceAll(Pattern.quote("+"), replaceStr);
You'll need to escape the + with a \ and because \ is itself a special character in Java strings you'll need to escape it with another \.
So your regex string will be defined as "\\+" in Java code.
I.e. this example:
String test = "ABCD+EFGH";
test = test.replaceAll("\\+", "-");
System.out.println(test);
Others have already stated the correct method of:
Escaping the + as \\+
Using the Pattern.quote method which escapes all the regex meta-characters.
Another method that you can use is to put the + in a character class. Many of the regex meta characters (., *, + among many others) are treated literally in the character class.
So you can also do:
orgStr.replaceAll("[+]",replaceStr);
If you want a simple string find-and-replace (i.e. you don't need regex), it may be simpler to use the StringUtils from Apache Commons, which would allow you to write:
mystr = StringUtils.replace(mystr, "+", "plus");
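Say you want to replace - with \\- (two literal backslashes followed by a hyphen). The replacement string is escaped twice, once for the Java literal and once by replaceAll itself, so use: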
text.replaceAll("-", "\\\\\\\\-");
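The second argument of replaceAll is also parsed: a backslash there escapes the next character and $ refers to a group. The sketch below, with made-up class and method names, compares hand-escaping against Matcher.quoteReplacement, which does that escaping for you.

```java
import java.util.regex.Matcher;

public class ReplacementEscapeDemo {
    // Hand-escaped: eight backslashes in source = four in the string = two in the output.
    static String manual(String text) {
        return text.replaceAll("-", "\\\\\\\\-");
    }

    // quoteReplacement escapes '\' and '$', so the desired output is written directly:
    // the literal "\\\\-" is the text \\- (two backslashes and a hyphen).
    static String quoted(String text) {
        return text.replaceAll("-", Matcher.quoteReplacement("\\\\-"));
    }

    public static void main(String[] args) {
        System.out.println(manual("a-b"));
        System.out.println(quoted("a-b"));
    }
}
```

Both methods should produce the same text; the quoted form is arguably easier to verify by eye.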
String str="Hello+Hello";
str=str.replaceAll("\\+","-");
System.out.println(str);
OR
String str="Hello+Hello";
str=str.replaceAll(Pattern.quote("+"),"_");
System.out.println(str);
How about replacing multiple ‘+’ with an undefined amount of repeats?
Example: test+test+test+1234
(+) or [+] seem to pick up a single literal character, but not repeats.
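One possible way to handle runs of plus signs is to put a "one or more" quantifier after the escaped plus. This is only a sketch; the class and method names and the sample input are assumptions for illustration.

```java
public class PlusRunsDemo {
    // "\\++" is the regex \++ : an escaped plus followed by the "one or more" quantifier.
    static String collapsePlusRuns(String s, String replacement) {
        return s.replaceAll("\\++", replacement);
    }

    // Character-class spelling of the same idea: [+] is a literal plus, the trailing + repeats it.
    static String collapseWithClass(String s, String replacement) {
        return s.replaceAll("[+]+", replacement);
    }

    public static void main(String[] args) {
        System.out.println(collapsePlusRuns("test+++test+1234", "-"));
        System.out.println(collapseWithClass("test+++test+1234", "-"));
    }
}
```

Each run of consecutive plus signs is replaced once, however long it is. The replacement passed in here should not contain \ or $, since replaceAll treats those specially.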
