How to search for a special character in java string? - java

I am having some problem with searching for a special character "(".
I got a java.util.regex.PatternSyntaxException exception has occurred.
It might have something to do with "(" being treated as special character.
I am not very good with pattern expression. Can someone help me properly search for the escape character?
// I need to split the string at the "("
String myString = "Room Temperature (C)";
String splitList[] = myString.split ("("); // i got an exception
// I tried this but got compile error
String splitList[] = myString.split ("\(");

Try one of these:
string.split("\\(");
string.split(Pattern.quote("("));
Since a string split takes a regular expression as an argument, you need to escape characters properly. See Jon Skeet's answer on this here:

The reason you got an exception the first time is because split() takes a regular expression as argument, and ( has a special meaning there, as you suggest. To avoid this, you need to escape it using a \, like you tried.
What you missed, is that you also need to escape your backslashes with an extra \ in Java, meaning you need a total of two:
String splitList[] = myString.split ("\\(");

You need to escape the character via backslashes: string.split("\\(");

( is one of regex special characters. To escape it you can use e.g.
split(Pattern.quote("(")),
split("\\Q(\\E"),
split("\\("),
split("[(]").

Related

Java: Replace all ' in a string with \'

I need to escape all quotes (') in a string, so it becomes \'
I've tried using replaceAll, but it doesn't do anything. For some reason I can't get the regex to work.
I'm trying with
String s = "You'll be totally awesome, I'm really terrible";
String shouldBecome = "You\'ll be totally awesome, I\'m really terrible";
s = s.replaceAll("'","\\'"); // Doesn't do anything
s = s.replaceAll("\'","\\'"); // Doesn't do anything
s = s.replaceAll("\\'","\\'"); // Doesn't do anything
I'm really stuck here, hope somebody can help me here.
Thanks,
Iwan
You have to first escape the backslash because it's a literal (yielding \\), and then escape it again because of the regular expression (yielding \\\\). So, Try:
s.replaceAll("'", "\\\\'");
output:
You\'ll be totally awesome, I\'m really terrible
Use replace()
s = s.replace("'", "\\'");
output:
You\'ll be totally awesome, I\'m really terrible
Let's take a tour of String#repalceAll(String regex, String replacement)
You will see that:
An invocation of this method of the form str.replaceAll(regex, repl) yields exactly the same result as the expression
Pattern.compile(regex).matcher(str).replaceAll(repl)
So lets take a look at Matcher.html#replaceAll(java.lang.String) documentation
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
You can see that in replacement we have special character $ which can be used as reference to captured group like
System.out.println("aHellob,aWorldb".replaceAll("a(\\w+?)b", "$1"));
// result Hello,World
But sometimes we don't want $ to be such special because we want to use it as simple dollar character, so we need a way to escape it.
And here comes \, because since it is used to escape metacharacters in regex, Strings and probably in other places it is good convention to use it here to escape $.
So now \ is also metacharacter in replacing part, so if you want to make it simple \ literal in replacement you need to escape it somehow. And guess what? You escape it the same way as you escape it in regex or String. You just need to place another \ before one you escaping.
So if you want to create \ in replacement part you need to add another \ before it. But remember that to write \ literal in String you need to write it as "\\" so to create two \\ in replacement you need to write it as "\\\\".
So try
s = s.replaceAll("'", "\\\\'");
Or even better
to reduce explicit escaping in replacement part (and also in regex part - forgot to mentioned that earlier) just use replace instead replaceAll which adds regex escaping for us
s = s.replace("'", "\\'");
This doesn't say how to "fix" the problem - that's already been done in other answers; it exists to draw out the details and applicable documentation references.
When using String.replaceAll or any of the applicable Matcher replacers, pay attention to the replacement string and how it is handled:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
As pointed out by isnot2bad in a comment, Matcher.quoteReplacement may be useful here:
Returns a literal replacement String for the specified String. .. The String produced will match the sequence of characters in s treated as a literal sequence. Slashes (\) and dollar signs ($) will be given no special meaning.
You could also try using something like StringEscapeUtils to make your life even easier: http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
s = StringEscapeUtils.escapeJava(s);
You can use apache's commons-text library (instead of commons-lang):
Example code:
org.apache.commons.text.StringEscapeUtils.escapeJava(escapedString);
Dependency:
compile 'org.apache.commons:commons-text:1.8'
OR
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.8</version>
</dependency>

"\\\\".replaceAll("\\\\", "\\") throws java.lang.StringIndexOutOfBoundsException

The following code fragment in Java:
"\\\\".replaceAll("\\\\", "\\");
throws the exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 1 (NO_SOURCE_FILE:0)
The javadoc on replaceAll does include a caveat on the use of backslashes and recommends using Matcher.replaceAll or Matcher.quoteReplacement. Does anybody have a snippet on how to replace all occurrences of two backslashes in a string with a single backslash ?
clarification
The actual literal shown above is only an example, the actually string can have many occurrences of two consecutive backslashes in different places.
You can simply do it with String#replace(): -
"\\\\".replace("\\\\", "\\")
String#replaceAll takes a regex as parameter. So, you would have to escape the backslash twice. Once for Java and then for Regex. So, the actual replacement using replaceAll would look like: -
"\\\\".replaceAll("\\\\\\\\", "\\\\")
But you don't really need a replaceAll here.
Try this instead:
"\\\\".replaceAll("\\{2}", "\\")
The first parameter to replaceAll() is a regular expression, and {2} indicates that exactly two occurrences of the char must be matched.
If you want to use Matcher.replaeAll() then you want something like this:
Pattern.compile("\\\\\\\\").matcher(input).replaceAll("\\\\");
If you have backslash in replacement string, it will be treated as escape character and the method will try to read the next character.That is why , it is throwing StringIndexOutOfBoundsException.

Regex to Match: !$%^&*()_+|~-=`{}[]:";'<>?,./

I know this a duplicate of a question with almost the identical name, however, I can't get it to work in Android what so ever!
I am trying this: Regex to Match Symbols:
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\\";'<>?,.\/]");
However, this isn't working. Does anyone know the correct method of applying this pattern?
P.S. Complete noob at Regex. :D
From here originally - Regex to Match Symbols: !$%^&*()_+|~-=`{}[]:";'<>?,./
Error Message: Syntax error on token(s), misplaced construct(s)
UPDATE: Added extra backslashes...fixed a lot of em, now gets error from ; onwards. Using Eclipse.
I think your problem is the "
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\[\]:";'<>?,.\/]");
^
it is ending your string, so you should escape it. Also you need to remove the backslash before the slash, it is no special character.
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,./]");
OK, once more, you wanted to match the backslash, not to escape the slash, then we end up here:
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");
now it is the same answer, than jdb's, so +1 to him for being quicker.
How about that?
Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");
In a character class, only [ and ] have special meaning, so you need to escape them. Plus in Java, you need to escape with an extra backslash. That's the problem specifically with Java. So, you need to use \\[ and \\]. And yes, you need to escape " with single backslash, in a string literal.
Apart from that, a hyphen when used somewhere in the middle, has also a special meaning. If you want to match a hyphen, you need to use it at the ends.
Rest of the characters, don't need to be escaped. They are just ordinary characters.
So, your pattern should be like this: -
Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,./]");
And if you want to match backslash (\) also, then use this: -
Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");

Java split on ^ (caret?) not working, is this a special character?

In Java, I am trying to split on the ^ character, but it is failing to recognize it. Escaping \^ throws code error.
Is this a special character or do I need to do something else to get it to recognize it?
String splitChr = "^";
String[] fmgStrng = aryToSplit.split(splitChr);
The ^ is a special character in Java regex - it means "match the beginning" of an input.
You will need to escape it with "\\^". The double slash is needed to escape the \, otherwise Java's compiler will think you're attempting to use a special \^ sequence in a string, similar to \n for newlines.
\^ is not a special escape sequence though, so you will get compiler errors.
In short, use "\\^".
The ^ matches the start of string. You need to escape it, but in this case you need to escape it so that the regular expression parser understands which means escaping the escape, so:
String splitChr = "\\^";
...
should get you what you want.
String.split() accepts a regex. The ^ sign is a special symbol denoting the beginning of the input sequence. You need to escape it to make it work. You were right trying to escape it with \ but it's a special character to escape things in Java strings so you need to escape the escape character with another \. It will give you:
\\^
use "\\^". Use this example as a guide:
String aryToSplit = "word1^word2";
String splitChr = "\\^";
String[] fmgStrng = aryToSplit.split(splitChr);
System.out.println(fmgStrng[0]+","+fmgStrng[1]);
It should print "word1,word2", effectively splitting the string using "\\^". The first slash is used to escape the second slash. If there were no double slash, Java would think ^ was an escape character, like the newline "\n"
None of the above answers makes no sense. Here is the right explanation.
As we all know, ^ doesn't need to be escaped in Java String.
As ^ is special charectar in RegulalExpression , it expects you to pass in \^
How do we make string \^ in java? Like this String splitstr = "\\^";

How to replace a special character with single slash

I have a question about strings in Java. Let's say, I have a string like so:
String str = "The . startup trace ?state is info?";
As the string contains the special character like "?" I need the string to be replaced with "\?" as per my requirement. How do I replace special characters with "\"? I tried the following way.
str.replace("?","\?");
But it gives a compilation error. Then I tried the following:
str.replace("?","\\?");
When I do this it replaces the special characters with "\\". But when I print the string, it prints with single slash. I thought it is taking single slash only but when I debugged I found that the variable is taking "\\".
Can anyone suggest how to replace the special characters with single slash ("\")?
On escape sequences
A declaration like:
String s = "\\";
defines a string containing a single backslash. That is, s.length() == 1.
This is because \ is a Java escape character for String and char literals. Here are some other examples:
"\n" is a String of length 1 containing the newline character
"\t" is a String of length 1 containing the tab character
"\"" is a String of length 1 containing the double quote character
"\/" contains an invalid escape sequence, and therefore is not a valid String literal
it causes compilation error
Naturally you can combine escape sequences with normal unescaped characters in a String literal:
System.out.println("\"Hey\\\nHow\tare you?");
The above prints (tab spacing may vary):
"Hey\
How are you?
References
JLS 3.10.6 Escape Sequences for Character and String Literals
See also
Is the char literal '\"' the same as '"' ?(backslash-doublequote vs only-doublequote)
Back to the problem
Your problem definition is very vague, but the following snippet works as it should:
System.out.println("How are you? Really??? Awesome!".replace("?", "\\?"));
The above snippet replaces ? with \?, and thus prints:
How are you\? Really\?\?\? Awesome!
If instead you want to replace a char with another char, then there's also an overload for that:
System.out.println("How are you? Really??? Awesome!".replace('?', '\\'));
The above snippet replaces ? with \, and thus prints:
How are you\ Really\\\ Awesome!
String API links
replace(CharSequence target, CharSequence replacement)
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
replace(char oldChar, char newChar)
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar.
On how regex complicates things
If you're using replaceAll or any other regex-based methods, then things becomes somewhat more complicated. It can be greatly simplified if you understand some basic rules.
Regex patterns in Java is given as String values
Metacharacters (such as ? and .) have special meanings, and may need to be escaped by preceding with a backslash to be matched literally
The backslash is also a special character in replacement String values
The above factors can lead to the need for numerous backslashes in patterns and replacement strings in a Java source code.
It doesn't look like you need regex for this problem, but here's a simple example to show what it can do:
System.out.println(
"Who you gonna call? GHOSTBUSTERS!!!"
.replaceAll("[?!]+", "<$0>")
);
The above prints:
Who you gonna call<?> GHOSTBUSTERS<!!!>
The pattern [?!]+ matches one-or-more (+) of any characters in the character class [...] definition (which contains a ? and ! in this case). The replacement string <$0> essentially puts the entire match $0 within angled brackets.
Related questions
Having trouble with Splitting text. - discusses common mistakes like split(".") and split("|")
Regular expressions references
regular-expressions.info
Character class and Repetition with Star and Plus
java.util.regex.Pattern and Matcher
In case you want to replace ? with \?, there are 2 possibilities: replace and replaceAll (for regular expressions):
str.replace("?", "\\?")
str.replaceAll("\\?","\\\\?");
The result is "The . startup trace \?state is info\?"
If you want to replace ? with \, just remove the ? character from the second argument.
But when I print the string, it prints
with single slash.
Good. That's exactly what you want, isn't it?
There are two simple rules:
A backslash inside a String literal has to be specified as two to satisfy the compiler, i.e. "\". Otherwise it is taken as a special-character escape.
A backslash in a regular expresion has to be specified as two to satisfy regex, otherwise it is taken as a regex escape. Because of (1) this means you have to write 2x2=4 of them:"\\\\" (and because of the forum software I actually had to write 8!).
String str="\\";
str=str.replace(str,"\\\\");
System.out.println("New String="+str);
Out put:- New String=\
In java "\\" treat as "\". So, the above code replace a "\" single slash into "\\".

Categories