Special characters are interfering with Java method [duplicate]


This question already has answers here:
String.replaceAll() is not working for some strings
(6 answers)
Closed 2 years ago.
I am using the string.replaceFirst() method in order to replace the first instance of <text> with another string. I used the indexOf method to search for both brackets, and then the replaceFirst method. It works perfectly if text is replaced with any string with an alphanumeric character at the end, but fails to replace when I do something like <some string$>. For reference, the method is
public static String substituteWord(String original, String word) {
    int index1 = original.indexOf("<");
    int index2 = original.indexOf(">");
    storyLine = original.replaceFirst(original.substring(index1,index2+1), word);
    return original;
}
The code doesn't look broken, but why does using a dollar sign make this method fail?

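Strictly speaking, the first argument to replaceFirst() and replaceAll() is a regular expression. In a regular expression an unescaped dollar sign is an end-of-line anchor rather than a literal character, so a pattern such as <some string$> never matches the text it was cut from. A dollar sign is special in the replacement string as well: there $x means 'group x that was matched against (captured by) the regular expression'.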
So the solution is to wrap the first argument in Pattern.quote() and the second argument in Matcher.quoteReplacement() to avoid this special behaviour:
String strToReplace = original.substring(index1,index2+1);
storyLine = original.replaceFirst(Pattern.quote(strToReplace), Matcher.quoteReplacement(word));
As an example of when you would want the special behaviour with the dollar sign, consider this example:
str = str.replaceAll("<b>([^<]*)</b>", "<i>$1</i>");
This would take a piece of bold text from some HTML and replace it with 'whatever was inside the bold tags, but in italics instead'. The parentheses () in the regular expression mean 'capture this substring' as group 1, and then the $1 means 'replace with whatever was captured as group 1'.
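Putting the quoting advice together, here is a minimal, self-contained sketch of the method from the question. The class name SubstituteWordDemo, the local variable names, and the "no placeholder" early return are my own assumptions for illustration, not part of the original code; unlike the original, it returns the replaced string directly instead of assigning it to storyLine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubstituteWordDemo {
    // Replaces the first <...> placeholder in 'original' with 'word',
    // treating both the placeholder and the replacement as literal text.
    static String substituteWord(String original, String word) {
        int open = original.indexOf('<');
        int close = original.indexOf('>', open + 1);
        if (open < 0 || close < 0) {
            return original; // assumption: leave the text unchanged if there is no placeholder
        }
        String placeholder = original.substring(open, close + 1);
        // Pattern.quote makes the placeholder a literal pattern;
        // Matcher.quoteReplacement escapes '$' and '\' in the replacement.
        return original.replaceFirst(Pattern.quote(placeholder),
                                     Matcher.quoteReplacement(word));
    }

    public static void main(String[] args) {
        // Both the placeholder and the replacement contain a dollar sign.
        System.out.println(substituteWord("I have <amount$> left", "$5"));
        // prints: I have $5 left
    }
}
```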

Related

Java: is "$1" a placeholder? [duplicate]

This question already has answers here:
JAVA - replaceAll in a regex with $1
(1 answer)
What does RegExp.$1 do
(6 answers)
Closed 1 year ago.
I was given a Java exercise:
Break up camelCase writing into words, for example the input "camelCaseTest" should give the output "camel Case Test".
I found this solution online, but I don't understand all of it
public static String camelCaseBetter(String input) {
    input = input.replaceAll("([A-Z])", " $1");
    return input;
}
What does the $1 do? I think it just takes the String that is to be replaced (A-Z) and replaces it with itself (in this case the method also appends a space to break up the words)
I couldn't find a good explanation for $1, so I hope somebody here can explain it or share a link to the right resource which can explain it.

From the documentation of the String class:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll.
From Matcher.replaceAll
The replacement string may contain references to captured subsequences as in the appendReplacement method.
Then the appendReplacement method:
The replacement string may contain references to subsequences captured during the previous match: Each occurrence of ${name} or $g will be replaced by the result of evaluating the corresponding group(name) or group(g) respectively. For $g, the first number after the $ is always treated as part of the group reference. Subsequent numbers are incorporated into g if they would form a legal group reference. Only the numerals '0' through '9' are considered as potential components of the group reference. If the second group matched the string "foo", for example, then passing the replacement string "$2bar" would cause "foobar" to be appended to the string buffer. A dollar sign ($) may be included as a literal in the replacement string by preceding it with a backslash (\$).
So, $1 references the first capturing group (whatever matches the pattern within the first parentheses of the regular expression).
([A-Z]) matches any single uppercase letter and places it in the first capturing group. The replacement " $1" then substitutes a space followed by the matched uppercase letter.
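A small runnable sketch of the behaviour, in case it helps. The class and method names are made up for this example. It also shows that $0 (the whole match) gives the same result here without needing the parentheses.

```java
class CamelCaseDemo {
    // Uses capturing group 1: the single uppercase letter inside the parentheses.
    static String splitWithGroup(String input) {
        return input.replaceAll("([A-Z])", " $1");
    }

    // $0 refers to the entire match, so no explicit group is required.
    static String splitWithWholeMatch(String input) {
        return input.replaceAll("[A-Z]", " $0");
    }

    public static void main(String[] args) {
        System.out.println(splitWithGroup("camelCaseTest"));      // prints: camel Case Test
        System.out.println(splitWithWholeMatch("camelCaseTest")); // prints: camel Case Test
    }
}
```

Note that an input starting with an uppercase letter gets a leading space with this pattern.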

How to replace a substring of a string [duplicate]

This question already has answers here:
String replace method is not replacing characters
(5 answers)
Closed 6 years ago.
Assuming I have a String string like this:
"abcd=0; efgh=1"
and I want to replace "abcd" with "dddd". I tried this:
string.replaceAll("abcd","dddd");
It does not work. Any suggestions?
EDIT:
To be more specific, I am working in Java and trying to parse an HTML document, specifically the content between <script> tags. I have already found a way to get this content into a string:
if (tag instanceof ScriptTag) {
    if (((ScriptTag) tag).getStringText().contains("DataVideo")) {
        String tagText = ((ScriptTag) tag).getStringText();
    }
}
Now I need a way to replace one substring with another.

You need to use the return value of the replaceAll() method. replaceAll() does not change the characters in the current string; it returns a new string with the replacement applied.
String objects are immutable: their values cannot be changed after they are created.
You may use replace() instead of replaceAll() if you don't need a regex.
String str = "abcd=0; efgh=1";
String replacedStr = str.replaceAll("abcd", "dddd");
System.out.println(str);
System.out.println(replacedStr);
outputs
abcd=0; efgh=1
dddd=0; efgh=1

Two things you should note:
Strings in Java are immutable, so you need to store the return value of the replace method call in another String.
You don't really need a regex here; a simple call to String#replace(CharSequence, CharSequence) will do the job.
So just use this code:
String replaced = string.replace("abcd", "dddd");

You need to create the variable to assign the new value to, like this:
String str = string.replaceAll("abcd","dddd");

Since you mention regex, I assume this is Java. The replaceAll() method returns a new String with the substrings replaced, so try this:
String teste = "abcd=0; efgh=1";
String teste2 = teste.replaceAll("abcd", "dddd");
System.out.println(teste2);
Output:
dddd=0; efgh=1

Note that backslashes (\) and dollar signs ($) in the replacement
string may cause the results to be different than if it were being
treated as a literal replacement string; see
Matcher.replaceAll.
Use
Matcher.quoteReplacement(java.lang.String)
to suppress the special meaning of these characters, if desired.
from javadoc.

You are probably not assigning it after doing the replacement or replacing the wrong thing.
Try :
String haystack = "abcd=0; efgh=1";
String result = haystack.replaceAll("abcd","dddd");

Replace all with a string having regex wild chars [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
java String.replaceAll without regex
I have a string and I need to replace some parts of it.
The replacement text contains regex wild chars though. Example:
String target = "Something * to do in ('AAA', 'BBB')";
String replacement = "Hello";
String originalText = "ABCDEFHGIJKLMN" + target + "ABCDEFHGIJKLMN";
System.out.println(originalText.replaceAll(target, replacement));
I get:
ABCDEFHGIJKLMNSomething * to do in ('AAA', 'BBB')ABCDEFHGIJKLMN
Why doesn't the replacement occur?

Because *, ( and ) are all meta-characters in regular expressions. Hence all of them need to be escaped. It looks like Java has a convenient method for this:
java.util.regex.Pattern.quote(target)
However, the better option might be to skip the regex-based replaceAll function altogether and simply use replace. Then you do not need to escape anything.
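Both options can be sketched side by side using the strings from the question. The class and method names below are hypothetical, chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceDemo {
    // Option 1: keep replaceAll, but quote the target so '*', '(' and ')' are literal.
    static String viaQuote(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target),
                               Matcher.quoteReplacement(replacement));
    }

    // Option 2: String.replace treats both arguments as plain text.
    static String viaReplace(String text, String target, String replacement) {
        return text.replace(target, replacement);
    }

    public static void main(String[] args) {
        String target = "Something * to do in ('AAA', 'BBB')";
        String replacement = "Hello";
        String originalText = "ABCDEFHGIJKLMN" + target + "ABCDEFHGIJKLMN";

        System.out.println(viaQuote(originalText, target, replacement));
        // prints: ABCDEFHGIJKLMNHelloABCDEFHGIJKLMN
        System.out.println(viaReplace(originalText, target, replacement));
        // prints: ABCDEFHGIJKLMNHelloABCDEFHGIJKLMN
    }
}
```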

String.replaceAll() takes a regular expression and so it's trying to expand these metacharacters.
One approach is to escape these chars (e.g. \*).
Another would be to do the replacement yourself by using String.indexOf() and finding the start of the contained string. indexOf() doesn't take a regexp but rather a normal string.

replaceAll type method that preserves special characters

Currently, the replaceAll method of the String class, along with Matcher.replaceAll methods evaluate their arguments as regular expressions.
The problem I am having is that the replacement string I am passing to either of these methods contains a dollar sign (which of course has special meaning in a regular expression). An easy work-around to this would be to pass my replacement string to 'Matcher.quoteReplacement' as this produces a string with literal characters, and then pass this sanitized string to replaceAll.
Unfortunately, I can't do the above as I need to preserve the special characters as the resultant string is later used in operations where a reg ex is expected, and if I have escaped all the special characters this will break that contract.
Can someone please suggest a way I might achieve what I want to do? Many thanks.
EDIT: For clearer explanation, please find code example below:
String key = "USD";
String value = "$";
String content = "The figure is in USD";
String contentAfterReplacement;
contentAfterReplacement = content.replaceAll(key, value); //will throw an exception as it will evaluate the $ in 'value' variable as special regex character
contentAfterReplacement = content.replaceAll(key, Matcher.quoteReplacement(value)); //Can't do this as contentAfterReplacement is passed on and later parsed as a regex (Ie, it can't have special characters escaped).

Why not use the String#replace method instead of replaceAll? replaceAll treats its first argument as a regex and gives $ and \ special meaning in the replacement, whereas replace treats both arguments as literal text.
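A quick sketch with the values from the question; the class and method names are invented for illustration. It also shows that Matcher.quoteReplacement only escapes the replacement template: the string that comes back from replaceAll contains a plain $ with no backslash, so both approaches give the same result.

```java
import java.util.regex.Matcher;

class DollarReplacementDemo {
    // Literal replacement: neither argument is interpreted as a regex.
    static String withReplace(String content, String key, String value) {
        return content.replace(key, value);
    }

    // Regex replacement: 'key' is still a regex here; only 'value' is made literal.
    static String withQuoteReplacement(String content, String key, String value) {
        return content.replaceAll(key, Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String key = "USD";
        String value = "$";
        String content = "The figure is in USD";

        System.out.println(withReplace(content, key, value));
        // prints: The figure is in $
        System.out.println(withQuoteReplacement(content, key, value));
        // prints: The figure is in $
    }
}
```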

Why the second argument is not being taken as regex?

I came across an interesting question on java regex
Is there a regular expression way to replace a set of characters with another set (like shell tr command)?
So I tried the following:
String a = "abc";
a = a.replaceAll("[a-z]", "[A-Z]");
Now if I print a, the output is
[A-Z][A-Z][A-Z]
It looks like the first argument is being treated as a regex, but the second argument is not.
So is there a problem with this code, or is something else the reason?

This is the way replaceAll works.
See API:
public String replaceAll(String regex, String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.

The answer to the linked question is a quite clear »No«, so this should come as no surprise.
As you can see from the documentation the second argument is indeed a regular string that is used as replacement:
Parameters:
regex – the regular expression to which this string is to be matched
replacement – the string to be substituted for each match

According to the API, the second argument is a plain String that gets substituted for each match.

If you want to turn lower case into upper case, there is a toUpperCase function available in the String class. As for functionality equivalent to the tr utility, I think there is no built-in support in Java (up to Java 7).
The replacement string is usually taken literally, except for the sequence $n, where n is the number of a capturing group in the regex. That sequence inserts the text captured by that group into the replacement.

I think of a regex as a way to express a condition (i.e. does a given string match this expression?). With that in mind, what you are asking would mean "please replace what matches in my string with ... another condition", which doesn't make much sense.
Trying to understand what you are looking for, it seems to me that you want some automatic mapping between classes of characters (e.g. [a-z] -> [A-Z]). As far as I know this does not exist and you would have to write it yourself (except for the aforementioned toUpperCase()).
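For what "write it yourself" could look like, here is a minimal sketch of a tr-style character mapping. The class TranslateDemo and the method translate are hypothetical names, and unlike the real tr this version assumes both character sets are spelled out in full and have equal length (it does not expand ranges such as a-z).

```java
class TranslateDemo {
    // Maps each character found in 'from' to the character at the same index in 'to';
    // characters that do not occur in 'from' are copied unchanged.
    static String translate(String input, String from, String to) {
        if (from.length() != to.length()) {
            throw new IllegalArgumentException("from and to must have the same length");
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int idx = from.indexOf(c);
            out.append(idx >= 0 ? to.charAt(idx) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("abc", "abc", "ABC"));        // prints: ABC
        System.out.println(translate("hello world", "lo", "01"));  // prints: he001 w1r0d
    }
}
```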

public String replaceAll(String regex, String replacement)
The first argument is a regular expression; every substring that matches that pattern is replaced by the second argument. If you want to convert lowercase to uppercase, use the
toUpperCase()
method

You should look into jtr. Example of usage:
String hello = "abccdefgdhcij";
CharacterReplacer characterReplacer;
try {
    characterReplacer = new CharacterReplacer("a-j", "Helo, Wrd!");
    hello = characterReplacer.doReplacement(hello);
} catch(CharacterParseException e) {
}
System.out.println(hello);
Output:
Hello, World!
