This question already has answers here:
How to split a comma separated String while ignoring escaped commas?
(6 answers)
Closed 9 years ago.
I'm looking for a regular expression to match , but ignore \, in Java's regex engine. This comes close:
[^\\],
However, it matches the previous character (in addition to the comma), which won't work.
Perhaps the regular expression approach is the wrong one altogether. I was intending to use String.split() to parse a simple CSV file (can't use an external library) with escaped commas.
You need a negative look-behind assertion here:
String[] arr = str.split("(?<!\\\\),");
Note that you need four backslashes there. To match one literal backslash, the regex itself needs \\, and each of those two backslashes must then be escaped again in the Java string literal.
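As a rough sketch of how a negative look-behind split behaves, assuming a line in which \, marks an escaped comma (the class and method names are made up for illustration, and escaped backslashes such as \\, are deliberately not handled):

```java
import java.util.Arrays;

public class EscapedCommaSplit {
    // Regex (?<!\\), matches a comma that is not immediately preceded by a backslash.
    // In Java source the single regex backslash is written as four backslashes.
    static String[] splitOnUnescapedCommas(String line) {
        return line.split("(?<!\\\\),");
    }

    public static void main(String[] args) {
        // The literal "a\\,b,c,d" is the text a\,b,c,d at runtime.
        String[] fields = splitOnUnescapedCommas("a\\,b,c,d");
        System.out.println(Arrays.toString(fields)); // [a\,b, c, d]
    }
}
```

The escaped comma stays inside the first field, backslash included; removing the backslash afterwards would be a separate step.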
This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 4 years ago.
I want to replace &sp; in the string below with Z.
Input text : ABCD&sp;EF&p;GHIJ&bsp;KL
Output text : ABCDZEFZGHIJZKL
Can anyone tell me how to replace every instance of &\D+; using a Java regular expression?
I am using /(&\D+;)?/ but it doesn't work.
Use String#replaceAll.
You should also make the + quantifier reluctant (lazy) by appending ?, so that each match stops at the first ; instead of the last one:
String str = "ABCD&sp;EF&p;GHIJ&bsp;KL";
String regex = "&\\D+?;";
System.out.println(str.replaceAll(regex, "Z"));
This should work.
Match the initial &, then all characters that are not the trailing ;, then that trailing ;, like so: &[^;]+; If not matching digits (as suggested by your example with \D) is a requirement, add the digits to the negated character set: [^;0-9] To replace all occurrences, regex flavors with a global flag need g; in Java, String#replaceAll already replaces every match. The site regexr.com is a handy tool to create regexes.
Edit: Sorry, I initially read your question wrong.
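To make the difference concrete, here is a small sketch that runs the negated-character-class pattern on the input from the question and, for comparison, the greedy &\D+; pattern. The class and method names are invented for this example.

```java
public class EntityReplace {
    // '&', then one or more characters that are neither ';' nor a digit, then ';'.
    static String replaceEntities(String text) {
        return text.replaceAll("&[^;0-9]+;", "Z");
    }

    // Greedy variant: \D+ runs on to the LAST ';' it can reach.
    static String replaceGreedy(String text) {
        return text.replaceAll("&\\D+;", "Z");
    }

    public static void main(String[] args) {
        String input = "ABCD&sp;EF&p;GHIJ&bsp;KL";
        System.out.println(replaceEntities(input)); // ABCDZEFZGHIJZKL
        System.out.println(replaceGreedy(input));   // ABCDZKL
    }
}
```

The greedy version swallows everything from the first & to the last ;, which is the "matching too much" behavior described in the question.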
This question already has answers here:
Version difference? Regex Escape in Java
(2 answers)
Closed 1 year ago.
Are escape sequences and whitespace characters the same thing? I'm not sure what else to write here but Stackoverflow said the first sentence is not enough so I'm typing this second sentence for no reason at all but that so this post will go through.
Java defines only a few escape sequences for string and character literals. Before Java 15, \s was not one of them (Java 15 added \s as an escape for a single space). In regular expressions, \s is a predefined character class that matches any whitespace character.
Check the following sections from the Java Tutorial:
Escape Sequences
Predefined Character Classes
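A short sketch of the distinction, with an illustrative class name: \t and \n below are string-literal escape sequences resolved by the compiler, while \s is handed to the regex engine, so its backslash has to be doubled in the source.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class WhitespaceClassDemo {
    // "\\s+" in Java source is the regex \s+ : one or more whitespace characters.
    static String[] splitOnWhitespace(String text) {
        return text.split("\\s+");
    }

    public static void main(String[] args) {
        // \t and \n are turned into a tab and a newline at compile time.
        String text = "one\ttwo\nthree four";
        System.out.println(Arrays.toString(splitOnWhitespace(text))); // [one, two, three, four]

        // The regex class \s matches a tab as well as a space.
        System.out.println(Pattern.matches("\\s", "\t")); // true
    }
}
```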
This question already has answers here:
Splitting a Java String by the pipe symbol using split("|")
(7 answers)
Closed 7 years ago.
I have a file with content
1|yes|
2|yes|
3|yes|
4|yes|
5|yes|
6|yes|
7|yes|
8|yes|
9|yes|
10|yes|
11|yes|
12|yes|
13|yes|
14|yes|
15|yes|
I use Java's String[] tokens = split("|"); to split each line, but for "10|yes|", for example, it returns [1,0,|,y,e,s,|]. It seems that instead of splitting on "|", it splits at every character. Does anyone have any idea why? Thanks!
split accepts a regular expression, and | has a special meaning there: it expresses alternation. An unescaped | on its own is an alternation between two empty patterns, which matches the empty string at every position, so the input is split between every character. To actually split on |, you have to escape it in the regex with a backslash. Since you specify the regex using a string literal, and backslashes are special in string literals, you have to escape that with another backslash:
String[] tokens = str.split("\\|");
In the general case, if you want to use the contents of a string literally, you can use Pattern.quote to automatically escape any special characters. You don't really need it here, but it's useful for end-user-entered values:
String[] tokens = str.split(Pattern.quote(stringToSplitOnLiterally));
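Both approaches can be tried side by side in a small sketch like the following; the class and method names are placeholders, and the sample line is taken from the question.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PipeSplit {
    // Escape the pipe by hand: the regex is \| and the Java literal is "\\|".
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Let Pattern.quote turn an arbitrary delimiter into a literal pattern.
    static String[] splitQuoted(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEscaped("10|yes|")));     // [10, yes]
        System.out.println(Arrays.toString(splitQuoted("10|yes|", "|"))); // [10, yes]
    }
}
```

Note that split with no limit argument drops trailing empty strings, which is why the final | does not produce an extra empty token.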
This question already has answers here:
illegal string body character after dollar sign
(5 answers)
Closed 8 years ago.
I am using Spock to test a Java app. It seems "$" is a special character in Groovy: a Java string delimited by "$" can't be split properly in Groovy. Is there any workaround for this problem?
update
The "split" happens in Java code that I can't edit. It turns out that the Java code has the same problem as: Why can't I split a string with the dollar sign?
$ is a special character in Groovy strings only if you use GStrings (for example, double-quoted strings), where it starts an interpolation. Separately, it is a special character in the string you give to String#split, because that string is interpreted as a regular expression, and in a regular expression $ means "end of input" (or end of line, depending on flags).
If you're using String#split, to make it split on a literal $, you have to escape it with a backslash. To make the regex engine see a backslash, you have to escape the backslash in a string literal with another backslash.
Example:
'testing$one$two$three'.split('\\$').each {
    println it
}
Output:
testing
one
two
three
Better yet, as suggested by Dónal, use tokenize:
Example:
'testing$one$two$three'.tokenize('$').each {
    println it
}
(Same output)
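Since the split in question happens in Java code, here is the same idea sketched in Java. The class and method names are illustrative only, and the second call shows what an unescaped $ does.

```java
import java.util.Arrays;

public class DollarSplit {
    // The regex \$ matches a literal dollar sign; in Java source it is "\\$".
    static String[] splitOnDollar(String text) {
        return text.split("\\$");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDollar("testing$one$two$three")));
        // [testing, one, two, three]

        // Unescaped, $ only matches the end of the input, so nothing is split off.
        System.out.println(Arrays.toString("testing$one".split("$")));
        // [testing$one]
    }
}
```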
This question already has answers here:
How do I split a string in Java?
(39 answers)
Closed 7 years ago.
I would like to parse an entire file based on all the possible delimiters, such as commas, colons, semicolons, periods, spaces, hyphens, etc.
Suppose I have a hypothetical line "Hi,X How-how are:any you?". I should get an output array with the items Hi, X, How, how, are, any and you.
How do I specify all these delimiter in String.split method?
Thanks in advance.
String.split takes a regular expression. In this case you want to split on non-word characters (regex \W), and adding + treats a run of adjacent delimiters, such as a comma followed by a space, as a single split point:
String input = "Hi,X How-how are:any you?";
String[] parts = input.split("\\W+");
If you wanted to be more explicit, you could use the exact characters in the expression:
String[] parts = input.split("[,\\s\\-:\\?]");
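For reference, a runnable sketch using the sample line from the question. It uses \W+ so that a run of adjacent delimiters counts as one split point; the class and method names are made up for this example.

```java
import java.util.Arrays;

public class DelimiterSplit {
    // Split on runs of non-word characters (anything outside [a-zA-Z_0-9]).
    static String[] words(String line) {
        return line.split("\\W+");
    }

    public static void main(String[] args) {
        String input = "Hi,X How-how are:any you?";
        System.out.println(Arrays.toString(words(input)));
        // [Hi, X, How, how, are, any, you]
    }
}
```

The trailing ? produces no extra token because split drops trailing empty strings. A line that starts with a delimiter would still yield an empty first element, so a caller may want to filter those out.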