This question already has answers here:
Splitting a Java String by the pipe symbol using split("|")
(7 answers)
Closed 7 years ago.
I have a file with content
1|yes|
2|yes|
3|yes|
4|yes|
5|yes|
6|yes|
7|yes|
8|yes|
9|yes|
10|yes|
11|yes|
12|yes|
13|yes|
14|yes|
15|yes|
I use Java's String[] tokens = split("|"); to split each line, but for "10|yes|" it returns [1,0,|,y,e,s,|]. Instead of splitting on "|", it seems to split between every character. Does anyone know why? Thanks!
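split accepts a regular expression, and | has a special meaning in regular expressions: it expresses alternation. An unescaped | on its own is an alternation between two empty patterns, which matches the empty string at every position, so the input is split between every character. To actually split on |, you have to escape it in the regex with a backslash. Since you specify the regex using a string literal, and backslashes are special in string literals, you have to escape that with another backslash: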
String[] tokens = str.split("\\|");
In the general case, if you want to use the contents of a string literally, you can use Pattern.quote to automatically escape any special characters. You don't really need it here, but it's useful for end-user-entered values:
String[] tokens = str.split(Pattern.quote(stringToSplitOnLiterally));
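To make the two options concrete, here is a minimal sketch applied to one of the lines from the question. The class and method names (PipeSplitDemo, splitEscaped, splitLiteral) are made up for illustration; only String.split and Pattern.quote come from the answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Splits on a literal pipe by escaping it in the regex.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Splits on an arbitrary delimiter, treated literally via Pattern.quote.
    static String[] splitLiteral(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String line = "10|yes|";
        System.out.println(Arrays.toString(splitEscaped(line)));         // [10, yes]
        System.out.println(Arrays.toString(splitLiteral(line, "|")));    // [10, yes]
        System.out.println(Arrays.toString(splitLiteral("a.b.c", "."))); // [a, b, c]
    }
}
```

The trailing pipe does not add an element because split with no limit argument drops trailing empty strings.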
This question already has answers here:
How do I express ":" but not preceded by "\" in a Java regular expression?
(2 answers)
Closed 4 years ago.
I need to split a string under the following conditions:
Split on / but not when it appears as \/.
Split on = but not when it appears as \=.
Basically, I am looking for two regular expressions that split as described above and skip a delimiter when it is preceded by the escape character.
You may try using lookarounds here:
String input = "Hello/World";
String[] parts = input.split("(?<!\\\\)[/=]");
The above single regex covers both splitting cases. It uses a negative lookbehind, which asserts that the character which immediately precedes the / or = is not a backslash.
You could use a negative lookbehind (?<!\\\\) to assert what is on the left is not a backslash.
Then match 1+ times a forward slash or an equals sign [/=]+ using a character class:
String regex = "(?<!\\\\)[/=]+";
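As a rough illustration of the lookbehind, the sketch below runs the single-character variant on an input that mixes escaped and unescaped delimiters. The class name, method name, and sample input are assumptions chosen for the example.

```java
import java.util.Arrays;

class EscapedDelimiterSplit {
    // Splits on '/' or '=' unless the delimiter is immediately
    // preceded by a backslash.
    static String[] splitUnescaped(String input) {
        return input.split("(?<!\\\\)[/=]");
    }

    public static void main(String[] args) {
        // The Java literal "a\\/b/c=d\\=e" is the text a\/b/c=d\=e
        String input = "a\\/b/c=d\\=e";
        System.out.println(Arrays.toString(splitUnescaped(input)));
        // prints [a\/b, c, d\=e]
    }
}
```

The escaping backslashes stay in the resulting parts; removing them would be a separate step.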
This question already has answers here:
Why does String.split need pipe delimiter to be escaped?
(3 answers)
Closed 8 years ago.
I am trying to split a string as follows
String string = "mike|ricki";
If I do the following string.split("|") I would expect an array of 2 elements, "mike" and "ricki". Instead I am getting the following
[, m, i, k, e, |, r, i, c, k, i]
Am I doing something fundamentally wrong here?
Yes. The pipe character | is a special character in regular expressions, so you must escape it with a backslash. The regex is \|, but in a Java string literal the backslash is itself the escape character, so you have to double it and use \\|:
String[] names = string.split("\\|");
System.out.println(Arrays.toString(names));
If you read the String.split() Java documentation, it says that the method takes a regular expression as its argument.
The pipe character | is a special character in regular expressions, so if you want to use it as a literal you have to escape it as \\|
So your code has to be:
String[] splitted = string.split("\\|");
String.split takes a regular expression. The pipe character has a special meaning in regex so it's not matching as you were expecting.
Try string.split("\\|") instead.
The backslash tells regex to treat the pipe as a literal character.
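Two other spellings give the same result, in case the double backslash is hard to read: a character class, inside which | is an ordinary character, and a reusable compiled Pattern. This is only a sketch; the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeAlternatives {
    // A reusable pattern object that matches a literal pipe.
    private static final Pattern PIPE = Pattern.compile("\\|");

    // Inside [...], the pipe has no special meaning and needs no backslash.
    static String[] splitWithCharClass(String s) {
        return s.split("[|]");
    }

    // Pattern.split(CharSequence) behaves like String.split with the same regex.
    static String[] splitWithCompiledPattern(String s) {
        return PIPE.split(s);
    }

    public static void main(String[] args) {
        String string = "mike|ricki";
        System.out.println(Arrays.toString(splitWithCharClass(string)));       // [mike, ricki]
        System.out.println(Arrays.toString(splitWithCompiledPattern(string))); // [mike, ricki]
    }
}
```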
This question already has answers here:
How to split a comma separated String while ignoring escaped commas?
(6 answers)
Closed 9 years ago.
I'm looking for a regular expression to match , but ignore \, in Java's regex engine. This comes close:
[^\\],
However, it matches the previous character (in addition to the comma), which won't work.
Perhaps the regular expression approach is the wrong one altogether. I was intending to use String.split() to parse a simple CSV file (can't use an external library) with escaped commas.
You need a negative look-behind assertion here:
String[] arr = str.split("(?<![^\\\\]\\\\),");
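Note that you need four backslashes for each literal backslash in the pattern. The regex engine needs \\ to match a single backslash, and each of those two backslashes must in turn be escaped in the Java string literal. The lookbehind rejects a comma that follows a backslash which itself follows a non-backslash character.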
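Here is a small sketch of how that split could be used on one CSV line, followed by a pass that turns \, back into a plain comma. The class name, the parseLine helper, and the sample line are assumptions for illustration; the unescaping step is an addition and not part of the answer.

```java
import java.util.Arrays;

class EscapedCommaCsv {
    // Splits on commas using the lookbehind from the answer,
    // then turns each escaped comma (\,) back into a plain comma.
    static String[] parseLine(String line) {
        String[] fields = line.split("(?<![^\\\\]\\\\),");
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].replace("\\,", ",");
        }
        return fields;
    }

    public static void main(String[] args) {
        // The Java literal "Smith\\, John,42,NY" is the text Smith\, John,42,NY
        String[] fields = parseLine("Smith\\, John,42,NY");
        System.out.println(fields.length);           // 3
        System.out.println(Arrays.toString(fields)); // [Smith, John, 42, NY]
    }
}
```

One edge case to be aware of: because the lookbehind needs two characters before the comma, an escaped comma at the very start of the input (the text \,x) is still split. If that matters, the simpler pattern "(?<!\\\\)," protects it, at the cost of also treating a comma after an escaped backslash as escaped.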
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
java String.replaceAll without regex
I have a string and I need to replace some parts of it.
The text to be replaced contains regex metacharacters, though. Example:
String target = "Something * to do in ('AAA', 'BBB')";
String replacement = "Hello";
String originalText = "ABCDEFHGIJKLMN" + target + "ABCDEFHGIJKLMN";
System.out.println(originalText.replaceAll(target, replacement));
I get:
ABCDEFHGIJKLMNSomething * to do in ('AAA', 'BBB')ABCDEFHGIJKLMN
Why doesn't the replacement occur?
Because *, ( and ) are all meta-characters in regular expressions. Hence all of them need to be escaped. It looks like Java has a convenient method for this:
java.util.regex.Pattern.quote(target)
However, the better option might be to skip the regex-based replaceAll and simply use replace, which treats the target as literal text and still replaces every occurrence. Then you do not need to escape anything.
String.replaceAll() takes a regular expression, so it interprets these metacharacters instead of matching them literally.
One approach is to escape these characters (e.g. \*).
Another would be to do the replacement yourself by using String.indexOf() and finding the start of the contained string. indexOf() doesn't take a regexp but rather a normal string.
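A minimal sketch of the two literal-replacement routes, using the strings from the question. The class and method names are hypothetical. Matcher.quoteReplacement is an addition not mentioned in the answers; it keeps $ and \ in the replacement text from being interpreted by replaceAll.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceDemo {
    // String.replace treats the target as plain text and replaces every occurrence.
    static String replaceLiteral(String text, String target, String replacement) {
        return text.replace(target, replacement);
    }

    // Regex-based replaceAll, with both arguments quoted so they act literally.
    static String replaceQuoted(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String target = "Something * to do in ('AAA', 'BBB')";
        String replacement = "Hello";
        String originalText = "ABCDEFHGIJKLMN" + target + "ABCDEFHGIJKLMN";

        System.out.println(replaceLiteral(originalText, target, replacement));
        // prints ABCDEFHGIJKLMNHelloABCDEFHGIJKLMN
        System.out.println(replaceQuoted(originalText, target, replacement));
        // prints ABCDEFHGIJKLMNHelloABCDEFHGIJKLMN
    }
}
```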
This question already has answers here:
How do I split a string in Java?
(39 answers)
Closed 7 years ago.
I would like to parse an entire file based on all the possible delimiters, like commas, colons, semicolons, periods, spaces, hyphens, etc.
Suppose I have a hypothetical line "Hi,X How-how are:any you?". I should get an output array with the items Hi, X, How, how, are, any and you.
How do I specify all these delimiters in the String.split method?
Thanks in advance.
String.split takes a regular expression. In this case, you want non-word characters (regex \W) to be the delimiters, so it's simply:
String input = "Hi,X How-how are:any you?";
String[] parts = input.split("[\\W]");
If you wanted to be more explicit, you could use the exact characters in the expression:
String[] parts = input.split("[,\\s\\-:\\?]");
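Both patterns above split on a single delimiter character, so two delimiters in a row (for example a comma followed by a space) leave an empty string in the result. A possible variation, sketched here with made-up class and method names, adds the + quantifier so that a run of delimiters counts as one:

```java
import java.util.Arrays;

class MultiDelimiterSplit {
    // Splits on runs of one or more non-word characters, so adjacent
    // delimiters such as ", " do not produce empty strings in the middle.
    static String[] words(String input) {
        return input.split("\\W+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("Hi,X How-how are:any you?")));
        // prints [Hi, X, How, how, are, any, you]
        System.out.println(Arrays.toString(words("Hi, X")));
        // prints [Hi, X]
        System.out.println(Arrays.toString("Hi, X".split("[\\W]")));
        // prints [Hi, , X]
    }
}
```

A delimiter at the very start of the input still produces one empty first element, so lines that may begin with punctuation or whitespace need that case handled separately.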