Splitting a String in Java - java

I have a string as follows
String s = "3|4||5 9|4 0|0 4 8|...";
and I want to split it based on the "|" appearances. As such, the split should return
["3","4","5 9","4 0","0 4 8",...]
But, in Java,
s.split("|") = [, 3, |, 4, ...]
In other words, it is splitting by the "" character, it seems. What is wrong?

The | character has special meaning in regular expressions, so you must escape it with a backslash. Then you must escape the backslash itself for Java. Try:
s.split("\\|")
The Javadocs for the Pattern class have lots of details about special characters in regular expressions. See the "Logical operators" section on that page for what | does.
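A minimal sketch of the fix applied to the string from the question (the trailing "..." is left off, and the class and method names are only illustrative). Note that the doubled || in the input produces an empty string in the result, which the expected output in the question does not show.

```java
import java.util.Arrays;

public class PipeSplitDemo {
    // Splits on a literal pipe: "\\|" in Java source is the regex \|
    static String[] splitOnPipe(String s) {
        return s.split("\\|");
    }

    public static void main(String[] args) {
        String s = "3|4||5 9|4 0|0 4 8";
        // prints [3, 4, , 5 9, 4 0, 0 4 8]
        System.out.println(Arrays.toString(splitOnPipe(s)));
    }
}
```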

Note that public String[] split(String regex) takes a regex.
Since | is a metacharacter, it works when you escape it.
String[] results = result.split("\\|");
or (I personally recommend this)
String[] results = result.split(Pattern.quote("|"));
If you use Pattern.quote, | will be treated as the normal character | and not as the regex metacharacter |.
Oracle's documentation explains why the double backslash (\\) is needed.

You can try like below
s.split("[|]")
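For comparison, here is a small sketch showing that the three suggestions so far (backslash escape, Pattern.quote, and a character class) give the same result. The class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class EquivalentPipePatterns {
    // Escape the metacharacter with a backslash.
    static String[] withBackslash(String s) {
        return s.split("\\|");
    }

    // Let Pattern.quote build a literal pattern.
    static String[] withQuote(String s) {
        return s.split(Pattern.quote("|"));
    }

    // Put the metacharacter inside a character class.
    static String[] withCharClass(String s) {
        return s.split("[|]");
    }

    public static void main(String[] args) {
        String s = "a|b|c";
        // each line prints [a, b, c]
        System.out.println(Arrays.toString(withBackslash(s)));
        System.out.println(Arrays.toString(withQuote(s)));
        System.out.println(Arrays.toString(withCharClass(s)));
    }
}
```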

Related

Java String splitting difference between "|" and "\|" [duplicate]

I have a String "Hello-new-world". When I use the split() method with different regex values, it behaves differently.
String str = "Hello-new-world";
String[] strbuf=str.split("-");
for(int i=0;i<strbuf.length;i++)
System.out.print(strbuf[i]+" ");
The output I get is:
hello
new
world
Whereas if I change my string to "Hello|new|world", I get an altogether different result. The new output becomes:
h
e
l
l
o
|
n
e
w
|
w
o
r
l
d
Can someone please explain the possible reason for this?
Presumably you're splitting on "|" in the second case - and | has a special meaning within regular expressions. If you want to split on the actual pipe character, you should escape it:
String[] bits = whole.split(Pattern.quote("|"));
The split method takes a regular expression as input. The pipe is a special character in regular expressions, so if you want to use it literally you need to escape it. There are multiple solutions:
You need to escape the "pipe" character:
str.split("\\|");
Or you can use the helper Pattern.quote:
str.split(Pattern.quote("|"));
Or put it between square brackets:
str.split("[|]");
The pipe is a special regex symbol which means OR. If you want to split by pipe, then escape it in your regex:
String[] strbuf = str.split("\\|");
OR
String[] strbuf = str.split("[|]");
str.split("|");
means something different. String#split uses regex, and | is a metacharacter, so that pattern means: split on the empty string or on the empty string. That is why your string gets split at every character.
There are a few ways of doing what you expect (use these as the pattern to split on):
"\\|"
Which means to escape the metacharacter.
"[|]"
Puts the metacharacter in a character class.
"\\Q|\\E"
Puts the metacharacter inside a regex quote (\Q...\E).
An unescaped | is parsed as a regex meaning "empty string or empty string," so use
str.split("\\|");
| has the special meaning OR in regex.
The pipe has a special meaning in regular expressions, so if you want to use it literally you need to escape it.
str.split("\\|");
It is a metacharacter. Escape it with a backslash, like this: "\\|"
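To see the two behaviours side by side, here is a small sketch (class and method names are made up for illustration). The result shown for the unescaped call is what Java 8 and later produce; the output quoted in the first question on this page suggests older versions also put an empty string at the front.

```java
import java.util.Arrays;

public class UnescapedPipeDemo {
    // "|" as a regex is "empty string OR empty string", so it matches between all characters.
    static String[] splitUnescaped(String s) {
        return s.split("|");
    }

    // "\\|" matches only the literal pipe character.
    static String[] splitEscaped(String s) {
        return s.split("\\|");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitUnescaped("a|b"))); // [a, |, b] on Java 8+
        System.out.println(Arrays.toString(splitEscaped("a|b")));   // [a, b]
    }
}
```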

Java split String at | [duplicate]

How can I split my String at this character: |
If I simply write:
String[] parts = match.split("|");
the String is split after every single character.
Please, escape the character:
String[] parts = match.split("\\|");
Use:
String[] parts = match.split("\\|");
The pipe symbol is a special character for regular expressions; you need to escape it with a backslash if you want to use the literal pipe symbol character. And because the backslash is a special character in Java strings, you need to escape that too with another backslash. Hence, the double backslash before the pipe symbol.
String.split() receives a regular expression, where | has a special meaning. If you want to split by | you have to escape it using a backslash \:
String.split("\\|")
The double backslash is needed here to escape the backslash from the point of view of Java, and then to escape the | from the point of view of the regex.
The recommended method is to use:
String[] parts = match.split(Pattern.quote("|"));
this
public void test() {
String match = "A|B";
String[] parts = match.split(Pattern.quote("|"));
System.out.println(Arrays.toString(parts));
}
prints
[A, B]
You need to escape this character, so you can use this:
String[] parts = match.split("\\|");
If match does not contain |, parts will contain only a single element, i.e. match itself; otherwise it will contain the split pieces of match.
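A quick sketch of that last point, with a made-up class name: when the delimiter does not occur, split returns a one-element array holding the original string.

```java
import java.util.Arrays;

public class NoDelimiterDemo {
    // Splits on a literal pipe.
    static String[] parts(String match) {
        return match.split("\\|");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("A|B"))); // [A, B]
        System.out.println(Arrays.toString(parts("AB")));  // [AB]
    }
}
```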

Java Error in String Split

I want my program to accept search strings, for example:
blue & berry (to find both of the words)
bed | sleep | pillow (to find the first one or the second one etc.)
When I receive these strings in my program, I use
String.split() with "&" or "|" as the separator.
String[] splited = input.split("|");
It works fine in the first case, but in the second case it separates each letter in the words, for example: b e d. Can I do something so that it is separated into words by this symbol, not split letter by letter?
split() is taking a regexp as argument. | means or, so you are splitting on "empty string or empty string", so it's splitting after every letter. If you want to split on "|" symbol, you have to escape it:
String[] splited = input.split("\\|");
String.split(String) uses regular expressions and | is a special character in regex. Use \| to refer to a literal |, and \\ to escape the \ for Java. Resulting in \\|.
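For the search-string case in the question, a possible sketch is below. The helper name splitTerms and the idea of swallowing the spaces around the operator with \s* are my own assumptions, not something the asker specified; Pattern.quote makes the same helper safe for both & and |.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SearchTermSplitter {
    // Hypothetical helper: splits on the operator and drops the whitespace around it.
    static String[] splitTerms(String input, char operator) {
        String delimiter = "\\s*" + Pattern.quote(String.valueOf(operator)) + "\\s*";
        return input.trim().split(delimiter);
    }

    public static void main(String[] args) {
        // prints [bed, sleep, pillow]
        System.out.println(Arrays.toString(splitTerms("bed | sleep | pillow", '|')));
        // prints [blue, berry]
        System.out.println(Arrays.toString(splitTerms("blue & berry", '&')));
    }
}
```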

RegEx special char "|" escaping in Java

I am trying to split a string like: abc|aa||
When I use the regular string.split I am required to provide a regular expression.
I tried to do the following :
string.split("|")
string.split("\|")
string.split("/|")
string.split("\Q|\E")
None of them work.
Does anyone know how to make it work?
I don't know how you tried, but
public static void main(String[] args) {
String a= "abc|aa||";
String split = Pattern.quote("|");
System.out.println(split);
System.out.println(Arrays.toString(a.split(split)));
}
prints out
\Q|\E
[abc, aa]
effectively splitting on |. The \Q ... \E is a regex quote. Anything inside it will be matched as a literal pattern.
string.split("\|"); // won't compile, because \| is not a valid escape sequence in a Java string
string.split("/|"); // will compile, but splits on / or the empty string, so between each character
string.split("|"); // will compile, but splits on the empty string, so between each character
// true alternative to the quoted solution above
string.split("\\|"); // the doubled backslash becomes a single \ in the regex, which escapes the |
Using a double backslash is required because the backslash is also a special character, so you need to escape the escape character, i.e. \\|
| is a special character, hence you need to escape it using backslashes. Try using
string.split("\\|")
| is a special character for the regular expression, thus it must be escaped e.g. \|
The backslash \ is a special character in Java, thus it must also be escaped
As a result, you must do the following to achieve the desired effect.
string.split("\\|")
All of the following patterns split it correctly: "\\Q|\\E", "\\|", and "[|]". Of course, the latter two are preferable.
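One thing the outputs above hint at: for "abc|aa||" the result is [abc, aa], because split(regex) drops trailing empty strings. If the empty fields matter, the two-argument overload split(regex, limit) with a negative limit keeps them. A small sketch, with a made-up class name:

```java
import java.util.Arrays;

public class TrailingEmptyDemo {
    // Default behaviour: trailing empty strings are removed.
    static String[] dropTrailing(String s) {
        return s.split("\\|");
    }

    // A negative limit keeps trailing empty strings.
    static String[] keepTrailing(String s) {
        return s.split("\\|", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(dropTrailing("abc|aa||"))); // [abc, aa]
        System.out.println(Arrays.toString(keepTrailing("abc|aa||"))); // [abc, aa, , ]
    }
}
```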

Java regular expression

I want to replace any one of these chars:
% \ , [ ] # & # ! ^
... with empty string ("").
I used this code:
String line = "[ybi-173]";
Pattern cleanPattern = Pattern.compile("%|\\|,|[|]|#|&|#|!|^");
Matcher matcher = cleanPattern.matcher(line);
line = matcher.replaceAll("");
But it doesn't work.
What do I miss in this regular expression?
Some of the characters are special characters that are being interpreted differently. You can either escape them all with backslashes, or better yet put them in a character class (no need to escape the non-CC characters, eases readability):
Pattern cleanPattern = Pattern.compile("[%\\\\,\\[\\]#&#!^]");
There are several reasons why your solution doesn't work.
Several of the characters you wish to match have special meanings in regular expressions, including ^, [, and ]. These must be escaped with a \ character, but, to make matters worse, the \ itself must be escaped so that the Java compiler will pass the \ through to the regular expression constructor. So, to sum up step one, if you wish to match a ] character, the Java string must look like "\\]".
But, furthermore, this is a case for character classes [], rather than the alternation operator |. If you want to match "any of the characters a, b, c", that looks like [abc]. Your character class would be [%\,[]#&#!^], but, because of the Java string escaping rules and the special meaning of certain characters, your regex will be [%\\\\,\\[\\]#&#!\\^].
You'd define your pattern as a character group enclosed in [ and ] and escape special chars, e.g.
String n = "%\\,[]#&#!^".replaceAll("[%\\\\,\\[\\]#&#!^]", "");
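Applied to the line from the question, a character-class version could look like the sketch below. The class and method names are illustrative only; the pattern is the one suggested in the answers above.

```java
import java.util.regex.Pattern;

public class CleanChars {
    // Character class matching any one of: % \ , [ ] # & ! ^
    private static final Pattern CLEAN = Pattern.compile("[%\\\\,\\[\\]#&#!^]");

    // Removes every character matched by the class.
    static String clean(String line) {
        return CLEAN.matcher(line).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(clean("[ybi-173]")); // prints ybi-173
    }
}
```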
