I'm trying to create a regular expression matcher, but it doesn't work as expected.
String input = "// source C:\\path\\to\\folder";
System.out.println(Pattern.matches("//\\s*source\\s+[a-zA-Z]:(\\[a-zA-Z0-9_-]+)+", input));
It returns false but it should pass. What is wrong with that regex?
Backslashes. That's what is wrong.
System.out.println(Pattern.matches("//\\s*source\\s+[a-zA-Z]:(\\\\[a-zA-Z0-9_-]+)+", input));
                                                              ^^
In a regex, a literal backslash must itself be escaped with a backslash, which makes two. Add Java string-literal escaping on top of that, and you must write four backslashes in source code to match one.
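The two layers of escaping can be checked directly. The following is a minimal sketch, assuming a hypothetical demo class named BackslashLayers; only java.util.regex.Pattern is a real JDK API here.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helper are illustrative only.
class BackslashLayers {
    // Four backslashes in source code become two characters in the String.
    static final String REGEX_FOR_ONE_BACKSLASH = "\\\\";

    // The corrected pattern from the answer above.
    static boolean matchesPath(String input) {
        return Pattern.matches(
                "//\\s*source\\s+[a-zA-Z]:(\\\\[a-zA-Z0-9_-]+)+", input);
    }

    public static void main(String[] args) {
        // The compiler has already removed one level of escaping: prints 2.
        System.out.println(REGEX_FOR_ONE_BACKSLASH.length());
        // The regex engine removes the second level: matches one backslash.
        System.out.println(Pattern.matches(REGEX_FOR_ONE_BACKSLASH, "\\"));
        // The input from the question now matches.
        System.out.println(matchesPath("// source C:\\path\\to\\folder"));
    }
}
```

Printing the length of the pattern string is a quick way to see how many characters actually reach the regex engine.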
You are missing a second \\ in front of [a-zA-Z0-9_-] (the extra escape on the hyphen inside the class is optional):
String input = "// source C:\\path\\to\\folder";
System.out.println(Pattern.matches("//\\s*source\\s+[a-zA-Z]:(\\\\[a-zA-Z0-9_\\-]+)+", input));
You should use: \\\\ to match a backslash in Java regex:
String input = "// source C:\\path\\to\\folder";
boolean m = Pattern.matches("//\\s*source\\s+[a-zA-Z]:(\\\\[a-zA-Z0-9_-]+)+", input);
//=> true
You need one level of escaping (\\) for the Java string literal and another (\\) for the underlying regex engine to get a literal \.
I have functionality in my app that should replace some text in JSON (I have simplified it in the example). The replacement may contain escape sequences such as \n, \b, \t, etc., which can break the JSON string when I try to build JSON with Jackson. So I decided to use Apache's solution, StringEscapeUtils.escapeJava(), to escape all escape sequences. But Matcher.replaceAll() removes the backslashes that were added by escapeJava().
Here is the code:
public static void main(String[] args) {
String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";
String replacedJson = Pattern.compile("toReplace")
.matcher(json)
.replaceAll(StringEscapeUtils.escapeJava("replacement \n \b \t"));
System.out.println(replacedJson);
}
Expected Output:
{"test2": "Hello replacement \n \b \t \"test\" world"}
Actual Output:
{"test2": "Hello replacement n b t \"test\" world"}
Why does Matcher.replaceAll() remove the backslashes, while System.out.println(StringEscapeUtils.escapeJava("replacement \n \b \t")); prints the correct output: replacement \n \b \t?
StringEscapeUtils.escapeJava("\n") allows you to transform the single newline character \n into two characters: \ and n.
\ is a special character in pattern replacements though, from https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#replaceAll(java.lang.String):
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
To have them taken as literal characters, you need to escape them via Matcher.quoteReplacement, from https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#quoteReplacement(java.lang.String):
Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Slashes (\) and dollar signs ($) will be given no special meaning.
So in your case:
.replaceAll(Matcher.quoteReplacement(StringEscapeUtils.escapeJava("replacement \n \b \t")))
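The difference can be reproduced with the JDK alone. This is a minimal sketch, assuming a hand-written string that stands in for the output of StringEscapeUtils.escapeJava (so that no Apache dependency is needed); the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class QuoteReplacementDemo {
    // Assumption: stands in for escapeJava("replacement \n \b \t"),
    // i.e. literal backslash-letter pairs rather than control characters.
    static final String ESCAPED = "replacement \\n \\b \\t";

    // Backslashes in the replacement are consumed as escape characters.
    static String unquoted(String input) {
        return Pattern.compile("toReplace").matcher(input).replaceAll(ESCAPED);
    }

    // quoteReplacement makes the replacement fully literal.
    static String quoted(String input) {
        return Pattern.compile("toReplace").matcher(input)
                .replaceAll(Matcher.quoteReplacement(ESCAPED));
    }

    public static void main(String[] args) {
        String input = "Hello toReplace world";
        System.out.println(unquoted(input)); // Hello replacement n b t world
        System.out.println(quoted(input));   // Hello replacement \n \b \t world
    }
}
```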
If you want a literal backslash in replaceAll, you need to escape it. You can find this in the documentation here
StringEscapeUtils.escapeJava will escape a string suitable for use in Java source code - but it won't allow you to use unescaped strings in your source code.
"replacement \n \b \t"
             ^ new line
                ^ backspace
                   ^ tab
If you want literal backslashes in a regular Java string, you need:
"replacement \\n \\b \\t"
Because this Java string is used as the replacement argument of replaceAll, where a backslash is itself an escape character, you need:
"replacement \\\\n \\\\b \\\\t"
Try:
String replacedJson = Pattern.compile("toReplace")
.matcher(json)
.replaceAll("replacement \\\\n \\\\b \\\\t")
You have to escape \ as well using Matcher.quoteReplacement().
public static String replaceAll(String json, String regex, String replace) {
return Pattern.compile(regex)
.matcher(json)
.replaceAll(Matcher.quoteReplacement(StringEscapeUtils.escapeJava(replace)));
}
Why does Java (1.7) give me an error for the following line?
String str2 = str.replace("\s+", " ");
Error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
As far as I know "\s+" is a valid regex. Isn't it?
String.replace() will only replace literals, that's the first problem.
The second problem is that \s is not a valid escape sequence in a Java string literal, by definition.
Which means what you wanted was probably "\\s+".
But even then, .replace() won't take that as a regex. You have to use .replaceAll() instead:
s.replaceAll("\\s+", "");
BUT there is another problem. You seem to be using it often... Therefore, use a Pattern instead:
private static final Pattern SPACES = Pattern.compile("\\s+");
// In code...
SPACES.matcher(input).replaceAll("");
FURTHER NOTES:
If you only want to replace the first occurrence, use .replaceFirst(); String has it, and so does Matcher.
When you .replace{First,All}() on a String, a new Pattern is recompiled for each and every invocation. Use a Pattern if you have to do repetitive matches!
It's a valid regular expression pattern, but \s is not a valid String literal escape sequence. Escape the \.
String str2 = str.replace("\\s+", " ");
As suggested, String#replace(CharSequence, CharSequence) doesn't consider the arguments you provide as regular expressions. So even if you got the program to compile, it wouldn't do what you seem to want it to do. Check out String#replaceAll(String, String).
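The difference between the literal and the regex variants can be seen side by side. A minimal sketch, assuming a hypothetical class WhitespaceDemo and a single space as the replacement (as in the question):

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class WhitespaceDemo {
    // Compiled once and reused for repeated matching.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // replace() searches for the literal three characters \s+ and finds none.
    static String literalReplace(String s) {
        return s.replace("\\s+", " ");
    }

    // replaceAll() treats its first argument as a regex.
    static String regexReplace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Same result as regexReplace, without recompiling the pattern per call.
    static String precompiled(String s) {
        return SPACES.matcher(s).replaceAll(" ");
    }

    // Only the first run of whitespace is collapsed.
    static String firstOnly(String s) {
        return SPACES.matcher(s).replaceFirst(" ");
    }

    public static void main(String[] args) {
        String input = "a  b\t\tc";
        System.out.println(literalReplace(input).equals(input)); // true
        System.out.println(regexReplace(input));                 // a b c
        System.out.println(precompiled(input));                  // a b c
    }
}
```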
I have an Android application where I have to find out whether the user entered the special character '\' in a string. But I'm not having any success with the string.replaceAll() method, because Java treats the \ as escaping the closing ", so the string literal never ends. Does anyone have suggestions for how I can fix this?
Here is an example of how I tried to do this:
private void ReplaceSpecial(String text) {
if (text.trim().length() > 0) {
text = text.replaceAll("\", "%5C");
}
}
It does not work because Java won't compile it. Any suggestions?
Try this. You have to escape the '\' character:
text = text.replaceAll("\\\\", "%5C");
Try
text = text.replaceAll("\\\\", "%5C");
replaceAll uses regex syntax, where \ is a special character, so you need to escape it. To do that you need to pass \\ to the regex engine, but to create a string representing the regex \\ you need to write it as "\\\\" (\ is also a special character in a String literal and requires another level of escaping for each \).
To avoid this regex mess you can just use replace, which works on literals:
text = text.replace("\\", "%5C");
The first parameter to replaceAll is interpreted as a regular expression, so you actually need four backslashes:
text = text.replaceAll("\\\\", "%5C");
Four backslashes in a string literal mean two backslashes in the actual String, which in turn represent a regular expression that matches a single backslash character.
Alternatively, use replace instead of replaceAll, as recommended by Pshemo, which treats its first argument as a literal string instead of a regex.
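The approaches mentioned above can be compared directly. This is a minimal sketch, assuming a hypothetical class BackslashToPercent; the Pattern.quote variant is an addition not mentioned in the answers, shown only as another way to avoid hand-escaping the regex.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class BackslashToPercent {
    // Regex version: four backslashes in source match one in the input.
    static String viaRegex(String text) {
        return text.replaceAll("\\\\", "%5C");
    }

    // Literal version: only Java string escaping applies.
    static String viaLiteral(String text) {
        return text.replace("\\", "%5C");
    }

    // Assumption: Pattern.quote is used here to build the regex from a literal.
    static String viaQuote(String text) {
        return text.replaceAll(Pattern.quote("\\"), "%5C");
    }

    public static void main(String[] args) {
        String input = "C:\\temp\\file"; // C:\temp\file at runtime
        System.out.println(viaRegex(input));   // C:%5Ctemp%5Cfile
        System.out.println(viaLiteral(input)); // C:%5Ctemp%5Cfile
        System.out.println(viaQuote(input));   // C:%5Ctemp%5Cfile
    }
}
```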
text = text.replaceAll("\", "%5C");
Should be:
text = text.replaceAll("\\\\", "%5C");
Why?
Because the backslash is an escape character in Java string literals, you have to write a double backslash (\\) to represent a real one.
On top of that, the first argument of replaceAll is a regular expression, so you need to escape the backslash there too, which leaves you with 4 backslashes.
Alternatively you can use replace which doesn't expect a regex, so you can do:
text = text.replace("\\", "%5C");
First, since "\" is the escape character in Java, you need to use two backslashes to get one backslash. Second, since the replaceAll() method takes a regular expression as a parameter, you will need to escape THAT backslash as well. Thus you need to escape it by using
text = text.replaceAll("\\\\", "%5C");
I may be late to this, but here is another option.
Add \\\\ to the regex to match a literal \.
Sample regex:
private val specialCharacters = "-#%\\[\\}+'!/#$^?:;,\\(\"\\)~`.*=&\\{>\\]<_\\\\"
private val PATTERN_SPECIAL_CHARACTER = "^(?=.*[$specialCharacters]).{1,20}$"
Hope it helps.
I want to replace a special character " with \" in string.
I tried str = str.replaceAll("\"","\\\");
But this doesn't work.
The closing quotes are missing in the 2nd parameter. Change to:
str = str.replaceAll("\"","\\\\\"");
Also see this example.
String.replaceAll() API:
Replaces each substring of this string that matches the given regular
expression with the given replacement.
An invocation of this method of the form str.replaceAll(regex, repl)
yields exactly the same result as the expression
Pattern.compile(regex).matcher(str).replaceAll(repl)
Note that backslashes (\) and dollar signs ($) in the replacement
string may cause the results to be different than if it were being
treated as a literal replacement string; see Matcher.replaceAll. Use
Matcher.quoteReplacement(java.lang.String) to suppress the special
meaning of these characters, if desired.
By the way, this is a duplicate question.
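The quoted API note suggests three equivalent ways to insert a backslash before each double quote. A minimal sketch, assuming a hypothetical class QuoteEscaper:

```java
import java.util.regex.Matcher;

// Hypothetical demo class; names are illustrative only.
class QuoteEscaper {
    // Replacement is \\" at runtime: an escaped backslash, then a quote.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\"", "\\\\\"");
    }

    // replace() takes both arguments literally, so \" is enough.
    static String viaReplace(String s) {
        return s.replace("\"", "\\\"");
    }

    // quoteReplacement adds the replacement-level escaping automatically.
    static String viaQuoteReplacement(String s) {
        return s.replaceAll("\"", Matcher.quoteReplacement("\\\""));
    }

    public static void main(String[] args) {
        String input = "say \"hi\"";
        System.out.println(viaReplaceAll(input));       // say \"hi\"
        System.out.println(viaReplace(input));          // say \"hi\"
        System.out.println(viaQuoteReplacement(input)); // say \"hi\"
    }
}
```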
You have to escape the \ by doubling it (\\) for the regex, and then double each of those again in the Java string literal:
Code example:
String tt = "\\\\terte\\";
System.out.println(tt);
System.out.println(tt.replaceAll("\\\\", "|"));
This gives the following output:
\\terte\
||terte|
I have a string:
HLN (Formerly Headline News)
I want to remove everything inside the parens and the parens themselves, leaving only:
HLN
I've tried to do this with a regex, but my difficulty is with this pattern:
"(.+?)"
When I use it, it always gives me a PatternSyntaxException. How can I fix my regex?
Because parentheses are special characters in regexps you need to escape them to match them explicitly.
For example:
"\\(.+?\\)"
String foo = "(x)()foo(x)()";
String cleanFoo = foo.replaceAll("\\([^\\(]*\\)", "");
// cleanFoo value will be "foo"
The above removes empty and non-empty parentheses from either side of the string.
plain regex:
\([^\(]*\)
You can test here: http://www.regexplanet.com/simple/index.html
My code is based on previous answers
You could use the following regular expression to find parentheticals:
\([^)]*\)
The \( matches a left parenthesis, the [^)]* matches any number of characters other than a right parenthesis, and the \) matches a right parenthesis.
If you're including this in a Java string, you must escape the \ characters like the following:
String regex = "\\([^)]*\\)";
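Applied to the string from the question, this could look as follows. It is a minimal sketch, assuming a hypothetical class ParenStripper; the leading \\s* is an addition of this sketch so that the space before the parenthetical is removed as well.

```java
// Hypothetical demo class; names are illustrative only.
class ParenStripper {
    // Regex from the answer above, exactly as given.
    static String stripParens(String s) {
        return s.replaceAll("\\([^)]*\\)", "");
    }

    // Assumption: \\s* is added here to also drop whitespace before "(".
    static String stripParensAndSpace(String s) {
        return s.replaceAll("\\s*\\([^)]*\\)", "");
    }

    public static void main(String[] args) {
        String input = "HLN (Formerly Headline News)";
        // Leaves a trailing space: "HLN "
        System.out.println("[" + stripParens(input) + "]");
        // Leaves only "HLN"
        System.out.println("[" + stripParensAndSpace(input) + "]");
    }
}
```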
String foo = "bar (baz)";
String boz = foo.replaceAll("\\(.+\\)", ""); // or replaceFirst
boz is now "bar "