Java String splitting difference between "|" and "\|" [duplicate] - java

I have a String "Hello-new-world", and when I use the split() method with different regex values, it behaves differently.
String str = "Hello-new-world";
String[] strbuf=str.split("-");
for(int i=0;i<strbuf.length;i++)
System.out.print(strbuf[i]+" ");
The output I get is:
Hello
new
world
Whereas if I change my string to "Hello|new|world", I get an altogether different answer. The new output becomes:
H
e
l
l
o
|
n
e
w
|
w
o
r
l
d
Can someone please explain the reason for this?

Presumably you're splitting on "|" in the second case - and | has a special meaning within regular expressions. If you want to split on the actual pipe character, you should escape it:
String[] bits = whole.split(Pattern.quote("|"));

The split method takes a regular expression as input. The pipe is a special character in regular expressions, so if you want to split on it you need to escape it. There are multiple solutions:
You can escape the pipe character:
str.split("\\|");
Or you can use the quote helper:
str.split(Pattern.quote("|"));
Or put it between square brackets (a character class):
str.split("[|]");

The pipe is a special regex symbol that means OR. If you want to split by pipe, escape it in your regex:
String[] strbuf = str.split("\\|");
or
String[] strbuf = str.split("[|]");

str.split("|");
means something different. String#split uses regex, and | is a metacharacter, so that pattern means: split on the empty string or on the empty string. That is why your string gets split at every character.
There are a few ways of doing what you expect (use one of these as the pattern to split on):
"\\|"
This escapes the metacharacter.
"[|]"
This puts the metacharacter in a character class.
"\\Q|\\E"
This puts the metacharacter inside a \Q...\E quoted section.
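A minimal sketch of the alternatives above, run against the string from the question. The class name PipeSplitDemo and its method names are made up for illustration; only String.split and Pattern.quote come from the standard library.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not from the original posts.
class PipeSplitDemo {
    static final String INPUT = "Hello|new|world";

    // Backslash-escaped pipe.
    static String[] escaped() { return INPUT.split("\\|"); }

    // Pipe inside a character class.
    static String[] charClass() { return INPUT.split("[|]"); }

    // Pipe inside a \Q...\E quoted section.
    static String[] quotedSection() { return INPUT.split("\\Q|\\E"); }

    // Pattern.quote builds a literal pattern for us.
    static String[] patternQuote() { return INPUT.split(Pattern.quote("|")); }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(escaped()));       // [Hello, new, world]
        System.out.println(Arrays.toString(charClass()));     // [Hello, new, world]
        System.out.println(Arrays.toString(quotedSection())); // [Hello, new, world]
        System.out.println(Arrays.toString(patternQuote()));  // [Hello, new, world]
    }
}
```

All four variants should give the same three tokens, so the choice is mostly about readability. Pattern.quote is handy when the delimiter is not known in advance.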

An unescaped | is parsed as a regex meaning "empty string or empty string," so use
str.split("\\|");
since | has the special meaning OR in regex.

The pipe has a special meaning in regular expressions, so if you want to split on it you need to escape it.
str.split("\\|");

It is a metacharacter. Escape it with a backslash, like this: "\\|"

Related

Java split String at | [duplicate]

How can I split my String after this character: |
If I simply write:
String[] parts = match.split("|");
the String is split after every single character.
Please, escape the character:
String[] parts = match.split("\\|");
Use:
String[] parts = match.split("\\|");
The pipe symbol is a special character for regular expressions; you need to escape it with a backslash if you want to use the literal pipe symbol character. And because the backslash is a special character in Java strings, you need to escape that too with another backslash. Hence, the double backslash before the pipe symbol.
String.split() receives a regular expression, in which | has a special meaning. If you want to split by | you have to escape it using a backslash \:
String.split("\\|")
The double backslash is needed here to escape the backslash from the point of view of Java, and then to escape the | from the point of view of regex.
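To make the two levels of escaping concrete, here is a small sketch. The class EscapeLevelsDemo is hypothetical. It shows that the Java literal "\\|" is only two characters long by the time the regex engine sees it.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class EscapeLevelsDemo {
    // In source code this is four characters between the quotes,
    // but the resulting String holds just a backslash and a pipe.
    static final String REGEX = "\\|";

    static int regexLength() { return REGEX.length(); }

    static String[] split(String input) { return input.split(REGEX); }

    public static void main(String[] args) {
        System.out.println(REGEX);                         // \|
        System.out.println(regexLength());                 // 2
        System.out.println(Arrays.toString(split("A|B"))); // [A, B]
    }
}
```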
The recommended method is to use:
String[] parts = match.split(Pattern.quote("|"));
This:
public void test() {
    String match = "A|B";
    String[] parts = match.split(Pattern.quote("|"));
    System.out.println(Arrays.toString(parts));
}
prints:
[A, B]
You need to escape this character, so you can use:
String[] parts = match.split("\\|");
If match does not contain |, parts will contain a single element, namely match itself; otherwise it will contain the pieces of the split string.
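A quick sketch of the no-delimiter case described above. The class NoDelimiterDemo is made up for this example.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class NoDelimiterDemo {
    static String[] split(String match) {
        return match.split("\\|");
    }

    public static void main(String[] args) {
        // No pipe in the input: the array holds the whole string.
        System.out.println(Arrays.toString(split("AB")));  // [AB]
        // One pipe in the input: two pieces.
        System.out.println(Arrays.toString(split("A|B"))); // [A, B]
    }
}
```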

String split by dot - Java

I have the following code:
public static void main(String[] args) {
String str = "21.12.2015";
String delim = "\\.";
String[] st = str.split(delim);
System.out.println(st[0]+"."+st[1]+"."+st[2]); // 1
System.out.println(st[0]+delim+st[1]+delim+st[2]); // 2
}
Now, line 1 prints the expected output, 21.12.2015. But why does line 2 not give the same output as line 1? Why does it print 21\.12\.2015?
EDIT:
Actually, in my requirement the delimiter changes dynamically for each string (- or / or .). So I am trying to assign the delimiter to a variable, split by it, and finally print the parts in a pattern (say dd.mm.yy or dd-mm-yy, etc.). For other delimiters it's fine, but for dot it comes out as dd\.mm\.yy. How can I achieve the expected result?
This handles all delim values:
String str = "21.12.2015";
String delim = "."; // or "-" or "?" or ...
String[] st = str.split(java.util.regex.Pattern.quote(delim));
When you call split, delim is used as a regex pattern and is interpreted accordingly.
But when you use delim in System.out.println, it is used as a plain string. That is the difference.
When you create the delim variable, you escape the backslash. The real value of the delim variable is \..
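Just create the delim variable without the backslash, and quote it only when splitting (for example with Pattern.quote(delim)), because a bare "." passed to split is a regex that matches every character: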
String delim = ".";
It is because delim = "\\."; the "\\." form is required only while splitting.
You are using the split method of the String class, which uses a regular expression to split the string.
Because the dot is itself a regex metacharacter, it has to be escaped as \\. for split to break the string at each literal dot.
In the second part you are simply printing the string, where the backslash introduces an escape sequence (like \n for a new line).
The double backslash is the escape sequence for a single literal backslash, and that's why you get "\." in the result.
For a better understanding, try deleting one of the backslashes in the delim variable: the Java compiler will report an error, since "\." is not a valid escape sequence.
\\. is a regex String that matches . literally. You need it while splitting (since split() expects a regex String).
While printing, you need to use . directly instead of "\\." because println() doesn't take a regex.
The split method uses a regex for splitting, so you need to provide \\. there. That is not the case when you are printing; there you just need to use '.' directly.
In Java \\. will be printed as \. as \\ is considered as a single backslash.
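One way to handle the dynamically changing delimiter from the question is to keep the plain delimiter in the variable and quote it only at the split call. This is a sketch under that assumption. DelimiterDemo and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class DelimiterDemo {
    // Split on the delimiter taken literally, whatever character it is.
    static String[] parts(String text, String delim) {
        return text.split(Pattern.quote(delim));
    }

    // Split and then rejoin using the same plain delimiter for output.
    static String roundTrip(String text, String delim) {
        return String.join(delim, parts(text, delim));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("21.12.2015", "."))); // [21, 12, 2015]
        System.out.println(roundTrip("21.12.2015", "."));              // 21.12.2015
        System.out.println(roundTrip("21-12-2015", "-"));              // 21-12-2015
        System.out.println(roundTrip("21/12/2015", "/"));              // 21/12/2015
    }
}
```

The delimiter variable never contains a backslash here, so printing it gives the character you expect.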

Java Error in String Split

I want my program to accept search strings, for example:
blue & berry (to find both of the words)
bed | sleep | pillow (to find the first one or the second one, etc.)
When I receive these strings in my program, I use String.split() with "&" or "|" as the separator:
String[] splited = input.split("|");
It works fine in the first case, but in the second case it separates each letter in the words, for example: b e d. Can I do something so that the input is separated into words by this symbol, not split letter by letter?
split() takes a regexp as its argument. | means or, so you are splitting on "empty string or empty string", which splits after every letter. If you want to split on the "|" symbol, you have to escape it:
String[] splited = input.split("\\|");
String.split(String) uses regular expressions and | is a special character in regex. Use \| to refer to a literal |, and \\ to escape the \ for Java. Resulting in \\|.
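For the search strings in this question, a sketch along these lines might work. It quotes the operator so that both "&" and "|" are treated literally, and trims the surrounding spaces. SearchTermsDemo and terms are hypothetical names, and dropping empty terms is an assumption about what the program wants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class SearchTermsDemo {
    // Split the input on a literal operator and trim each term.
    static List<String> terms(String input, String operator) {
        List<String> result = new ArrayList<>();
        for (String piece : input.split(Pattern.quote(operator))) {
            String term = piece.trim();
            if (!term.isEmpty()) { // assumption: empty terms are not wanted
                result.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(terms("blue & berry", "&"));         // [blue, berry]
        System.out.println(terms("bed | sleep | pillow", "|")); // [bed, sleep, pillow]
    }
}
```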

Splitting a String in Java

I have a string as follows
String s = "3|4||5 9|4 0|0 4 8|..."
and I want to split it based on the "|" appearances. As such, the split should return
["3","4","5 9","4 0","0 4 8",...]
But, in Java,
s.split("|") = [, 3, |, 4, ...]
In other words, it is splitting by the "" character, it seems. What is wrong?
The | character has special meaning in regular expressions, so you must escape it with a backslash. Then you must escape the backslash itself for Java. Try:
s.split("\\|")
The Javadocs for the Pattern class have lots of details about special characters in regular expressions. See the "Logical operators" section in that page for what | does.
Note that public String[] split(String regex) takes a regex.
Since | is a metacharacter, it works when you escape the special character:
String[] results = result.split("\\|");
or (I personally recommend this):
String[] results = result.split(Pattern.quote("|"));
If you use Pattern.quote, | will be treated as the normal character | and not as the regex metacharacter |.
Oracle's documentation explains why \\ is needed.
You can try like below
s.split("[|]")
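One detail worth checking with the string from the question: two adjacent pipes produce an empty token between them. The sketch below uses the question's string without its elided tail. EmptyTokenDemo is a hypothetical name, and filtering the empty token out is just one possible choice.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class EmptyTokenDemo {
    static final String INPUT = "3|4||5 9|4 0|0 4 8";

    // Plain split: "||" yields an empty string between "4" and "5 9".
    static String[] all() {
        return INPUT.split("\\|");
    }

    // Same split with empty tokens filtered out.
    static String[] nonEmpty() {
        return Arrays.stream(INPUT.split("\\|"))
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(all()));      // [3, 4, , 5 9, 4 0, 0 4 8]
        System.out.println(Arrays.toString(nonEmpty())); // [3, 4, 5 9, 4 0, 0 4 8]
    }
}
```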

Splitting a string in java on more than one symbol

I want to split a string wherever any of the following symbols occurs: +, -, *, /, =
I am using the split function, but it takes only one argument. Moreover, it does not work with "+".
I am using the following code:
Stringname.split("Symbol");
Thanks.
String.split takes a regular expression as argument.
This means you can combine any number of alternatives (symbols or text patterns) in a single argument in order to split your String.
See documentation here.
Here's an example in your case:
String toSplit = "a+b-c*d/e=f";
String[] splitted = toSplit.split("[-+*/=]");
for (String split: splitted) {
System.out.println(split);
}
Output:
a
b
c
d
e
f
Notes:
Reserved characters in Patterns must normally be escaped with a double backslash (\\) in Java source. That is not needed here, because these symbols are literal inside a character class.
The [] brackets in the pattern indicate a character class.
More on Patterns here.
You can use a regular expression:
String[] tokens = input.split("[+*/=-]");
Note: - should be placed in first or last position to make sure it is not considered as a range separator.
You need a regular expression. Additionally, you need the regex OR operator:
String[]tokens = Stringname.split("\\+|\\-|\\*|\\/|\\=");
For that, you need to use an appropriate regex. Some of the symbols you listed (+ and *) are reserved in regex, so you'll have to escape them with \.
A very basic expression would be \+|-|/|\*|=. It is relatively easy to understand: the reserved symbols are escaped with \, and the alternatives are separated by the | (or) symbol. If, for example, you wanted to add ^ as well, all you would need to do is append |\^ to the expression. In a Java string literal each backslash must itself be doubled, giving "\\+|-|/|\\*|=".
For testing and quick expressions, I like to use www.regexpal.com
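The two approaches from the answers above, a character class and an alternation, can be compared in a short sketch. It also shows why a bare "+" fails: it is a quantifier with nothing to repeat, so the pattern does not compile. MultiSymbolDemo and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, for illustration only.
class MultiSymbolDemo {
    // Character class: '-' goes first so it is not read as a range.
    static String[] byCharClass(String s) {
        return s.split("[-+*/=]");
    }

    // Alternation: only + and * need escaping outside a character class.
    static String[] byAlternation(String s) {
        return s.split("\\+|-|\\*|/|=");
    }

    // Returns true if split("+") is rejected as an invalid pattern.
    static boolean barePlusIsRejected() {
        try {
            "a+b".split("+");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String input = "a+b-c*d/e=f";
        System.out.println(Arrays.toString(byCharClass(input)));   // [a, b, c, d, e, f]
        System.out.println(Arrays.toString(byAlternation(input))); // [a, b, c, d, e, f]
        System.out.println(barePlusIsRejected());                  // true
    }
}
```

The character class is usually the shorter form when every delimiter is a single character.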
