import java.util.StringTokenizer;
class MySplit
{
public static void main(String S[])
{
String settings = "12312$12121";
StringTokenizer splitedArray = new StringTokenizer(settings,"$");
String splitedArray1[] = settings.split("$");
System.out.println(splitedArray1[0]);
while(splitedArray.hasMoreElements())
System.out.println(splitedArray.nextToken().toString());
}
}
In above example if i am splitting string using $, then it is not working fine and if i am splitting with other symbol then it is working fine.
Why it is, if it support only regex expression then why it is working fine for :, ,, ; etc symbols.
$ has a special meaning in regex, and since String#split takes a regex as an argument, the $ is not interpreted as the string "$", but as the special meta character $. One sexy solution is:
settings.split(Pattern.quote("$"))
Pattern#quote:
Returns a literal pattern String for the specified String.
... The other solution would be escaping $, by adding \\:
settings.split("\\$")
Important note: It's extremely important to check that you actually got element(s) in the resulted array.
When you do splitedArray1[0], you could get ArrayIndexOutOfBoundsException if there's no $ symbol. I would add:
if (splitedArray1.length == 0) {
// return or do whatever you want
// except accessing the array
}
If you take a look at the Java docs you could see that the split method take a regex as parameter, so you have to write a regular expression not a simple character.
In regex $ has a specific meaning, so you have to escape it this way:
settings.split("\\$");
The problem is that the split(String str) method expects str to be a valid regular expression. The characters you have mentioned are special characters in regular expression syntax and thus perform a special operation.
To make the regular expression engine take them literally, you would need to escape them like so:
.split("\\$")
Thus given this:
String str = "This is 1st string.$This is the second string";
for(String string : str.split("\\$"))
System.out.println(string);
You end up with this:
This is 1st string.
This is the second strin
Dollar symbol $ is a special character in Java regex. You have to escape it so as to get it working like this:
settings.split("\\$");
From the String.split docs:
Splits this string around matches of the given regular expression.
This method works as if by invoking the two-argument split method with
the given expression and a limit argument of zero. Trailing empty
strings are therefore not included in the resulting array.
On a side note:
Have a look at the Pattern class which will give you an idea as to which all characters you need to escape.
Because $ is a special character used in Regular Expressions which indicate the beginning of an expression.
You should escape it using the escape sequence \$ and in case of Java it should be \$
Hope that helps.
Cheers
Related
How to validate the given string over the regular expression (XSD Pattern):
xsd pattern:'([a-zA-Z0-9.,;:'+-/()?*[]{}\`´~
]|[!"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*'
I need to validate the string with above pattern whether it matches or not.
I have tried the below code but getting unsupported escape characters error while compiling
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PatternMatching {
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9\.,;:'\+\-/\(\)?\*\[\]\{\}\\`´~ ]|[!"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
public static boolean validateUserName(String userName){
Matcher mtch = usrNamePtrn.matcher(userName);
if(mtch.matches()){
return true;
}
return false;
}
public static void main(String a[]){
System.out.println("Is a valid username?"+validateUserName("stephen & john"));
}
}
how to do the above task, in addition to that if the doesn't match with the pattern then that characters need to be displayed.and I am using java 1.6 any suggestions is appreciated
First, the regular expression itself has three mistakes.
Mistake 1:
A backslash is a special character which is used to escape whatever character follows it. Therefore, the sequence
\`
is either identical to a single back-quote, or, depending on the regular expression engine, is an illegal escape sequence. Either way, if the intent was to match a backslash along with all the other characters, it should be written as:
\\`
Mistake 2:
Inside the […] character grouping, a ] must be escaped so it doesn’t signify the end of the grouping. So, [] needs to be written as [\].
Mistake 3:
Inside the […] character grouping, a - indicates a character range, like a-z. The regular expression [+-/] does not mean “plus or hyphen or slash”; it means “any of the characters between plus and slash, inclusive.” Technically, this mistake doesn’t affect the outcome in this particular case, because +-/ is equivalent to those three literal characters plus the comma and period, which both happen to occur earlier in the character grouping anyway. But, in the interest of saying what you mean, the - should be escaped:
+\-/
Second is the matter of turning the regular expression into a Java string.
The backslash and the double-quote are special characters in Java. Obviously, " denotes the start and end of a String literal, so if you want a " inside a String, you must escape it:
\"
This is not related to regular expressions; this just tells the compiler that the String contains a double-quote character. It will be compiled into a single " and that is what the regular expression engine will see.
Finally, there is the matter of backslashes. It just so happens that, while regular expressions use a backslash to escape characters as described above, Java also uses backslashes to escape characters in strings. This means that if you want a literal backslash in a Java String, it must be written in the code as two backslashes:
String s = "\\"; // a String of length 1
Recall from above that we need a regular expression with consecutive backslash characters:
\\`
A Java string containing those three characters would look like this:
String s = "\\\\`"; // a String of length 3
A regular expression allows a backslash almost anywhere; for instance, \% is the same as %. However, Java only allows specific characters to be preceded by a single backslash. \+ is not one of those permitted sequences.
+, (, ), {, and } are not special characters inside a […] grouping, so there is no need to escape them anyway.
So, your code needs to be changed from this:
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9\.,;:'\+\-/\(\)?\*\[\]\{\}\\`´~ ]|[!"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
to this:
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9.,;:'+\\-/()?*\\[\\]{}\\\\`´~ ]|[!\"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
This is because " is a special character in Java.
You'll have to substitute " with an escape character i.e. \" and \ with \\ as follows:
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9.,;:'+-/()?*[]{}\\`´~ ]|[!\"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
Note the change in the pattern below where " and \ have been replaced by \" and \\:
Also, note that this will only fix the Compile Issues. You need to re-check your Regex to see if it works fine.
if (url.contains("|##|")) {
Log.e("url data", "" + url);
final String s[] = url.split("\\|##|");
}
I have a URL with the separator "|##|"
I tried to separate this but didn't find solution.
Use Pattern.quote, it'll do the work for you:
Returns a literal pattern String for the specified String.
final String s[] = url.split(Pattern.quote("|##|"));
Now "|##|" is treated as the string literal "|##|" and not the regex "|##|". The problem is that you're not escaping the second pipe, it has a special meaning in regex.
An alternative solution (as suggested by #kocko), is escaping* the special characters manually:
final String s[] = url.split("\\|##\\|");
* Escaping a special character is done by \, but in Java \ is represented as \\
You have to escape the second |, as it is a regex operator:
final String s[] = url.split("\\|##\\|");
You should try to understand the concept as well - String.split(String regex) interprets the parameter as a regular expression, and since pipe character "|" is a logical OR in regular expression, you would be getting result as an array of each alphabet is your word.
Even if you had used url.split("|"); you would have got same result.
Now why the String.contains(CharSequence s) passed the |##| in the start because it interprets the parameter as CharSequence and not a regular expression.
Bottom line: Check the API that how the particular method interprets the passed input. Like we have seen, in case of split() it interprets as regular expression while in case of contains() it interprets as character sequence.
You can check the regular expression constructs over here - http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
I want to split a string when following of the symbols encounter "+,-,*,/,="
I am using split function but this function can take only one argument.Moreover it is not working on "+".
I am using following code:-
Stringname.split("Symbol");
Thanks.
String.split takes a regular expression as argument.
This means you can alternate whatever symbol or text abstraction in one parameter in order to split your String.
See documentation here.
Here's an example in your case:
String toSplit = "a+b-c*d/e=f";
String[] splitted = toSplit.split("[-+*/=]");
for (String split: splitted) {
System.out.println(split);
}
Output:
a
b
c
d
e
f
Notes:
Reserved characters for Patterns must be double-escaped with \\. Edit: Not needed here.
The [] brackets in the pattern indicate a character class.
More on Patterns here.
You can use a regular expression:
String[] tokens = input.split("[+*/=-]");
Note: - should be placed in first or last position to make sure it is not considered as a range separator.
You need Regular Expression. Addionaly you need the regex OR operator:
String[]tokens = Stringname.split("\\+|\\-|\\*|\\/|\\=");
For that, you need to use an appropriate regex statement. Most of the symbols you listed are reserved in regex, so you'll have to escape them with \.
A very baseline expression would be \+|\-|\\|\*|\=. Relatively easy to understand, each symbol you want is escaped with \, and each symbol is separated by the | (or) symbol. If, for example, you wanted to add ^ as well, all you would need to do is append |\^ to that statement.
For testing and quick expressions, I like to use www.regexpal.com
I have a question about using replaceAll() function.
if a string has parentheses as a pair, replace it with "",
while(S.contains("()"))
{
S = S.replaceAll("\\(\\)", "");
}
but why in replaceAll("\\(\\)", "");need to use \\(\\)?
Because as noted by the javadocs, the argument is a regular expression.
Parenthesis in a regular expression are used for grouping. If you're going to match parenthesis as part of a regular expression they must be escaped.
It's because replaceAll expects a regex and ( and ) have a special meaning in a regex expressions and need to be escaped.
An alternative is to use replace, which counter-intuitively does the same thing as replaceAll but takes a string as an input instead of a regex:
S = S.replace("()", "");
First, your code can be replaced with:
S = S.replace("()", "");
without the while loop.
Second, the first argument to .replaceAll() is a regular expression, and parens are special tokens in regular expressions (they are grouping operators).
And also, .replaceAll() replaces all occurrences, so you didn't even need the while loop here. Starting with Java 6 you could also have written:
S = S.replaceAll("\\Q()\\E", "");
It is let as an exercise to the reader as to what \Q and \E are: http://regularexpressions.info gives the answer ;)
S = S.replaceAll("\(\)", "") = the argument is a regular expression.
Because the method's first argument is a regex expression, and () are special characters in regex, so you need to escape them.
Because parentheses are special characters in regexps, so you need to escape them. To get a literal \ in a string in Java you need to escape it like so : \\.
So () => \(\) => \\(\\)
I want to split a String in Java on * using the split method. Here is the code:
String str = "abc*def";
String temp[] = str.split("*");
System.out.println(temp[0]);
But this program gives me the following error:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Dangling meta character '*' near index 0 *
I tweaked the code a little bit, using '\\*' as the delimiter, which works perfectly. Can anyone explain this behavior (or suggest an alternate solution)?
I don't want to use StringTokenizer.
The split() method actually accepts a regular expression. The * character has special meaning within a regular expression, and cannot appear on its own. In order to tell the regular expression to use the actual * character, you need to escape it with the \ character.
Thus, your code becomes:
String str = "abc*def";
String temp[] = str.split("\\*");
System.out.println(temp[0]); // Prints "abc"
Note the \\: you need to escape the slash for Java as well.
If you want to avoid this issue in the future, please read up on regular expressions, so you'll have a good idea of both what types of expressions you can use, as well as which characters you'll need to escape.
String.split() expects a regular expression. In a regular expression * has a special meaning (0 or more of the character class before it) so it has to be escaped. \* accomplishes it. Since you are using a Java string \\ is the escape sequence for \ so your regular expression just becomes \* which behaves correctly.
Split accepts a regular expression to split on, not a string. Regular expressions have * as a reserved character, so you need to escape it with a backslash.
In java specifically, backslashes in strings are also special characters. They are used for newlines (\n), tabs (\t), and many other less common characters.
So because you are writing java, and writing a regex, you need to escape the * character twice. And thus '\*'.