String url = "d://test////hello\\\\hello";
String separator = File.separator;
url = url.replaceAll("\\*", separator);
url = url.replaceAll("/+", separator);
I want to normalize the separators in this url, but an error occurs when I call replaceAll("/+", separator). I also tried escaping "/" as "\\/", but it still doesn't work.
This is the Exception from console:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:686)
at java.util.regex.Matcher.appendReplacement(Matcher.java:703)
at java.util.regex.Matcher.replaceAll(Matcher.java:813)
at java.lang.String.replaceAll(String.java:2189)
Now it works
String separator = null;
if(File.separator.equals("/")) {
separator = "/";
url = url.replaceAll("/+", separator);
url = url.replaceAll("\\\\+", separator);
} else {
separator = Matcher.quoteReplacement(File.separator);
url = url.replaceAll("/+", separator);
url = url.replaceAll("\\\\+", separator);
}
:) it works in javascript
var i = "d:\\ad////df";
alert(i.replace(/\/+/g, '\\'));
Your platform is Windows, right? So File.separator will be a backslash, right?
The explanation is that the 2nd argument of String.replaceAll is not a simple String. Rather it is a replacement pattern ...
The javadoc says:
"Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired. "
So your replacement String that consists of a single backslash is an invalid literal replacement string. You need to quote the separator String ... like the javadoc says.
(It is a little surprising that you get that particular exception. I can imagine how it could happen, but I'd have thought that they'd deal with bad escapes more elegantly. Mind you, if this was reported as a "bug", Oracle would probably not fix it. A fix would break backwards compatibility.)
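To illustrate the point about the replacement string, here is a minimal sketch. The class name SeparatorDemo and the helper normalize are made up for this example, and the separator is passed in as a parameter (an assumption, instead of reading File.separator) so the Windows case can be reproduced on any platform:

```java
import java.util.regex.Matcher;

public class SeparatorDemo {
    // Hypothetical helper: collapses runs of '/' or '\' into one separator.
    public static String normalize(String path, String separator) {
        // quoteReplacement makes '\' and '$' literal in the replacement string.
        String safe = Matcher.quoteReplacement(separator);
        // Java literal "[/\\\\]+" is the regex [/\\]+ : one or more slashes or backslashes.
        return path.replaceAll("[/\\\\]+", safe);
    }

    public static void main(String[] args) {
        String url = "d://test////hello\\\\hello";
        System.out.println(normalize(url, "\\")); // d:\test\hello\hello
        System.out.println(normalize(url, "/"));  // d:/test/hello/hello
    }
}
```

Without the quoteReplacement call, passing a lone backslash as the replacement makes replaceAll throw, because the backslash is read as the start of an escape with nothing after it.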
Try:
url = url.replaceAll("\\\\+", separator);
You need four backslashes: escape once for the Java string literal and once for the regex metacharacter. That is, the regex needs two backslashes (\\) to match one literal backslash, and in a Java string literal each of those two must itself be escaped with another backslash.
Also, the quantifier * means zero or more, you need to use +.
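A small sketch of the two escaping levels, in case it helps. The class and method names are invented for the example; only String.replaceAll is real API:

```java
public class BackslashRegexDemo {
    // Java literal "\\\\+" is the regex \\+ : one or more literal backslashes.
    public static String collapseBackslashes(String s) {
        return s.replaceAll("\\\\+", "/");
    }

    // Java literal "\\*" is the regex \* : a literal asterisk, not a backslash.
    public static String replaceAsterisks(String s) {
        return s.replaceAll("\\*", "/");
    }

    public static void main(String[] args) {
        String path = "hello\\\\\\hello"; // the text hello\\\hello (three backslashes)
        System.out.println(collapseBackslashes(path)); // hello/hello
        System.out.println(replaceAsterisks(path));    // unchanged, there is no asterisk
        System.out.println(replaceAsterisks("a*b"));   // a/b
    }
}
```

So "\\*" in the question never touched the backslashes at all; it was looking for asterisks.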
Context: GoogleBooks API returning unexpected thumbnail url
OK, so I found the reason for the problem I had in that question.
What I found was that the URL returned from the Google Books API was something like this:
http:/\/books.google.com\/books\/content?id=0DwKEBD5ZBUC&printsec=frontcover&img=1&zoom=5&source=gbs_api
Going to that URL would return an error, but if I replaced the "\/"s with "/" it would return the proper URL.
Is there something like a Java/Kotlin regex that would change this http:/\/books.google.com\/ to this http://books.google.com/?
(I know a bit of regex in Python, but I'm clueless in Java/Kotlin.)
Thank you
You can use triple-quoted string literals (that act as raw string literals where backslashes are treated as literal chars and not part of string escape sequences) + kotlin.text.replace:
val text = """http:/\/books.google.com\/books\/content?id=0DwKEBD5ZBUC&printsec=frontcover&img=1&zoom=5&source=gbs_api"""
print(text.replace("""\/""", "/"))
Output:
http://books.google.com/books/content?id=0DwKEBD5ZBUC&printsec=frontcover&img=1&zoom=5&source=gbs_api
See the Kotlin demo.
NOTE: you will need to double the backslashes in the regular string literal:
print(text.replace("\\/", "/"))
If you need to use this "backslash + slash" pattern in a regex you will need 2 backslashes in the triple-quoted string literal and 4 backslashes in a regular string literal:
print(text.replace("""\\/""".toRegex(), "/"))
print(text.replace("\\\\/".toRegex(), "/"))
NOTE: There is no need to escape / forward slash in a Kotlin regex declaration as it is not a special regex metacharacter and Kotlin regexps are defined with string literals, not regex literals, and thus do not need regex delimiters (/ is often used as a regex delimiter char in environments that support this notation).
You could match the protocol, and then replace the backslash followed by a forward slash by a forward slash only
https?:\\?/\\?/\S+
Pattern in Java
String regex = "https?:\\\\?/\\\\?/\\S+";
Java demo | regex demo
For example in Java:
String regex = "https?:\\\\?/\\\\?/\\S+";
String string = "http:/\\/books.google.com\\/books\\/content?id=0DwKEBD5ZBUC&printsec=frontcover&img=1&zoom=5&source=gbs_api";
if(string.matches(regex)) {
System.out.println(string.replace("\\/", "/"));
}
Output
http://books.google.com/books/content?id=0DwKEBD5ZBUC&printsec=frontcover&img=1&zoom=5&source=gbs_api
I had the same problem, and my url was:
String url="https:\\/\\/www.dailymotion.com\\/cdn\\/H264-320x240\\/video\\/x83iqpl.mp4?sec=zaJEh8Q2ahOorzbKJTOI7b5FX3QT8OXSbnjpCAnNyUWNHl1kqXq0D9F8iLMFJ0ocg120B-dMbEE5kDQJN4hYIA";
I solved it with this code:
url = url.replace("\\/", "/");
I have a variable:
String content = "<xxx.xx.name>xxx.xxx.com:111</xxx.xx.name>";
String destination = "\\$\\{VAR\\}";
String source = "xxx.xxx.com:111";
content = content.replaceAll(source, destination);
Result:
result = {IllegalArgumentException#781} Method threw 'java.lang.IllegalArgumentException' exception.
detailMessage = "Illegal group reference"
cause = {IllegalArgumentException#781} "java.lang.IllegalArgumentException: Illegal group reference"
stackTrace = {StackTraceElement[5]#783}
suppressedExceptions = {Collections$UnmodifiableRandomAccessList#773} size = 0
But if I do:
content = content.replaceAll(source,"\\$\\{VAR\\}");
all is working fine. How can I mimic or fix the replaceAll?
From the documentation of String.replaceAll(String, String):
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string;
Emphasis on the "may": depending on your Java version this may fail (though I could not reproduce your problem with Java 8, 11, 12, 15, or even the early-access Java 16).
You can use Matcher.quoteReplacement(String) to escape your \ and $ chars in the replacement string, as described later on in the javadoc:
Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.
So change your code to this (assuming you want to replace the contents with ${VAR} and not with \${VAR\}):
String content = "<xxx.xx.name>xxx.xxx.com:111</xxx.xx.name>";
String destination = Matcher.quoteReplacement("${VAR}");
String source = "xxx.xxx.com:111";
content = content.replaceAll(source, destination);
Which results in:
<xxx.xx.name>${VAR}</xxx.xx.name>
DEMO
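For comparison, a rough sketch of what happens with and without quoting. The class DollarBraceDemo and its methods are hypothetical names for this example. It also quotes the search text with Pattern.quote, on the assumption that the dots in the source string are meant literally:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarBraceDemo {
    // Treats both the search text and the replacement as plain literals.
    public static String substitute(String content, String target, String replacement) {
        return content.replaceAll(Pattern.quote(target),
                                  Matcher.quoteReplacement(replacement));
    }

    // An unquoted "${VAR}" is read as a reference to a group named VAR,
    // which the pattern "a" does not define, so replaceAll throws.
    public static boolean unquotedFails() {
        try {
            "a".replaceAll("a", "${VAR}");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String content = "<xxx.xx.name>xxx.xxx.com:111</xxx.xx.name>";
        System.out.println(substitute(content, "xxx.xxx.com:111", "${VAR}"));
        System.out.println(unquotedFails()); // true
    }
}
```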
Does Java have a built-in way to escape arbitrary text so that it can be included in a regular expression? For example, if my users enter "$5", I'd like to match that exactly rather than a "5" after the end of input.
Since Java 1.5, yes:
Pattern.quote("$5");
The difference between Pattern.quote and Matcher.quoteReplacement was not clear to me until I saw the following example:
s.replaceFirst(Pattern.quote("text to replace"),
Matcher.quoteReplacement("replacement text"));
It may be too late to respond, but you can also use Pattern.LITERAL, which makes the pattern treat all special characters as literals when it is compiled:
Pattern.compile(textToFormat, Pattern.LITERAL);
I think what you're after is \Q$5\E. Also see Pattern.quote(s), introduced in Java 5.
See Pattern javadoc for details.
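The options mentioned above can be compared side by side. This is only a sketch; the class LiteralMatchDemo and its helper methods are made-up names, while Pattern.quote and Pattern.LITERAL are the real API:

```java
import java.util.regex.Pattern;

public class LiteralMatchDemo {
    // Interprets the input as a regex: "$5" means end-of-input followed by 5.
    public static boolean findRaw(String input, String text) {
        return Pattern.compile(input).matcher(text).find();
    }

    // Wraps the input in \Q...\E so every character is matched literally.
    public static boolean findQuoted(String input, String text) {
        return Pattern.compile(Pattern.quote(input)).matcher(text).find();
    }

    // Same effect through the LITERAL flag instead of quoting.
    public static boolean findLiteral(String input, String text) {
        return Pattern.compile(input, Pattern.LITERAL).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.quote("$5"));             // \Q$5\E
        System.out.println(findRaw("$5", "costs $5"));       // false
        System.out.println(findQuoted("$5", "costs $5"));    // true
        System.out.println(findLiteral("$5", "costs $5"));   // true
    }
}
```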
First off, if you use replaceAll(), you DON'T use Matcher.quoteReplacement(), and the text to be substituted in includes a $1, then it won't put a 1 at the end. It will look at the search regex for the first matching group and sub THAT in. That's what $1, $2 or $3 means in the replacement text: matching groups from the search pattern.
I frequently plug long strings of text into .properties files, then generate email subjects and bodies from those. Indeed, this appears to be the default way to do i18n in Spring Framework. I put XML tags, as placeholders, into the strings and I use replaceAll() to replace the XML tags with the values at runtime.
I ran into an issue where a user input a dollars-and-cents figure, with a dollar sign. replaceAll() choked on it, with the following showing up in a stack trace:
java.lang.IndexOutOfBoundsException: No group 3
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:748)
at java.util.regex.Matcher.replaceAll(Matcher.java:823)
at java.lang.String.replaceAll(String.java:2201)
In this case, the user had entered "$3" somewhere in their input and replaceAll() went looking in the search regex for the third matching group, didn't find one, and puked.
Given:
// "msg" is a string from a .properties file, containing "<userInput />" among other tags
// "userInput" is a String containing the user's input
replacing
msg = msg.replaceAll("<userInput \\/>", userInput);
with
msg = msg.replaceAll("<userInput \\/>", Matcher.quoteReplacement(userInput));
solved the problem. The user could put in any kind of characters, including dollar signs, without issue. It behaved exactly the way you would expect.
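Here is a self-contained sketch of that before and after. The class UserInputDemo, the method names, and the sample message are invented for illustration; the catch is deliberately broad because the exact exception type is an implementation detail that I would not rely on:

```java
import java.util.regex.Matcher;

public class UserInputDemo {
    // Safe version: the user's text is inserted literally.
    public static String fill(String msg, String userInput) {
        return msg.replaceAll("<userInput />", Matcher.quoteReplacement(userInput));
    }

    // Unsafe version: "$3" in the input is read as a reference to group 3.
    public static boolean unquotedThrows(String msg, String userInput) {
        try {
            msg.replaceAll("<userInput />", userInput);
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String msg = "Amount entered: <userInput />";
        System.out.println(unquotedThrows(msg, "$3.50")); // true
        System.out.println(fill(msg, "$3.50"));           // Amount entered: $3.50
    }
}
```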
To build a protected pattern, you can put a backslash in front of every character except letters and digits. After that, you can insert your own special symbols into the protected pattern, so that it works as a real pattern of your own design rather than as plain quoted text, and none of the user's characters are treated as special.
public class Test {
public static void main(String[] args) {
String str = "y z (111)";
String p1 = "x x (111)";
String p2 = ".* .* \\(111\\)";
p1 = escapeRE(p1);
p1 = p1.replace("x", ".*");
System.out.println( p1 + "-->" + str.matches(p1) );
//.*\ .*\ \(111\)-->true
System.out.println( p2 + "-->" + str.matches(p2) );
//.* .* \(111\)-->true
}
public static String escapeRE(String str) {
//Pattern escaper = Pattern.compile("([^a-zA-Z0-9])");
//return escaper.matcher(str).replaceAll("\\\\$1");
return str.replaceAll("([^a-zA-Z0-9])", "\\\\$1");
}
}
Pattern.quote("blabla") works nicely.
Pattern.quote() works nicely. It encloses the text in "\Q" and "\E", and it also takes care of any "\Q" and "\E" that already occur inside the text.
However, if you need real regular-expression escaping (or custom escaping), you can use this code:
String someText = "Some/s/wText*/,**";
System.out.println(someText.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0"));
This prints: Some/s/wText\*/\,\*\*
Code for example and tests:
String someText = "Some\\E/s/wText*/,**";
System.out.println("Pattern.quote: "+ Pattern.quote(someText));
System.out.println("Full escape: "+someText.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0"));
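A quick way to check either approach is to confirm that the escaped pattern still matches the original text. The sketch below does that. EscapeComparisonDemo, escapeMeta and roundTrips are hypothetical names, and the character class is my assumption of the intended set, with the backslash and \s written once each at the regex level:

```java
import java.util.regex.Pattern;

public class EscapeComparisonDemo {
    // Hypothetical helper: puts a backslash before common regex
    // metacharacters and before whitespace.
    public static String escapeMeta(String text) {
        return text.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0");
    }

    // True if both escaping styles produce a pattern matching the text itself.
    public static boolean roundTrips(String text) {
        return Pattern.matches(Pattern.quote(text), text)
            && Pattern.matches(escapeMeta(text), text);
    }

    public static void main(String[] args) {
        String someText = "Some\\E/s/wText*/,**";
        System.out.println("Pattern.quote: " + Pattern.quote(someText));
        System.out.println("Full escape: " + escapeMeta(someText));
        System.out.println("Round trip: " + roundTrips(someText)); // true
    }
}
```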
The ^ (negation) symbol at the start of a character class is used to match anything that is not in the class.
I have the following String
String valueExpression = "value1\\$\\$bla";
I would like it to be parsed to:
value1$$bla
When I try to do:
valueExpression.replaceAll("\\\\$", "\\$");
I get the same string back, and when I try to do:
valueExpression.replaceAll("\\$", "$");
I get an IndexOutOfBounds error.
How can I do this replacement with a regex?
The string is dynamic so I can't change the content of the string valueExpression to something static.
Thanks
valueExpression.replaceAll("\\\\[$]", "\\$"); should achieve what you are looking for.
Simplest approach seems to be
valueExpression.replace("\\$", "$")
which is similar to
valueExpression.replaceAll(Pattern.quote("\\$"), Matcher.quoteReplacement("$"))
which means that it automatically escapes all regex metacharacters in both parts (target and replacement), letting you use simple literals.
BTW, let's not forget that String is immutable, so methods like replace can't change its state (the characters it stores); instead they create a new String with the characters replaced.
So you want to use
valueExpression = valueExpression.replace("\\$", "$");
Example:
String valueExpression = "value1\\$\\$bla";
System.out.println(valueExpression.replace("\\$", "$"));
output: value1$$bla
You want a regex containing \\\$ (a double backslash to match a literal backslash, and a backslash to escape the $). To write that as a string literal in Java, you escape each backslash with another backslash, so you would write it as "\\\\\\$".
I.e.
valueExpression.replaceAll("\\\\\\$", "\\$");
Either use direct string replacement:
valueExpression.replace("\\$", "$");
or you need to escape both the regex and the group reference in the replacement string:
valueExpression.replaceAll("\\\\\\$", "\\$");
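The variants from the answers above can be checked against each other with a short sketch like this one. UnescapeDollarDemo and its method names are invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapeDollarDemo {
    // Literal replacement, no regex involved.
    public static String viaReplace(String s) {
        return s.replace("\\$", "$");
    }

    // Regex \\\$ (backslash, then literal dollar); replacement \$ is a literal dollar.
    public static String viaReplaceAll(String s) {
        return s.replaceAll("\\\\\\$", "\\$");
    }

    // Regex API with both sides quoted, equivalent to viaReplace.
    public static String viaQuoting(String s) {
        return s.replaceAll(Pattern.quote("\\$"), Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        String valueExpression = "value1\\$\\$bla"; // the text value1\$\$bla
        System.out.println(viaReplace(valueExpression));    // value1$$bla
        System.out.println(viaReplaceAll(valueExpression)); // value1$$bla
        System.out.println(viaQuoting(valueExpression));    // value1$$bla
    }
}
```

Each method returns a new String, so the result has to be assigned back if the variable should change.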