Escape Sequence vs. Whitespace Character (\s) [duplicate] - java

This question already has answers here:
Version difference? Regex Escape in Java
(2 answers)
Closed 1 year ago.
Are escape sequences and whitespace characters the same thing?

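Java defines a small set of escape sequences for string and character literals. Before Java 15, \s was not one of them, so writing "\s" in a string literal was a compile error; since Java 15, \s in a literal stands for a single space. In regular expressions, \s is something different: a predefined character class that matches any whitespace character. Inside a Java string literal that regex is written "\\s".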
Check the following sections from the Java Tutorial:
Escape Sequences
Predefined Character Classes
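A minimal sketch of the distinction, assuming a hypothetical class name WhitespaceDemo and sample input chosen for illustration: "\t" and "\n" are string escape sequences that each produce one character, while "\\s+" hands the regex engine the character class \s.

```java
import java.util.regex.Pattern;

public class WhitespaceDemo {
    // Regex \s: the backslash is doubled in the Java literal so that the
    // regex engine receives the two characters '\' and 's'.
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits the input on runs of whitespace (spaces, tabs, newlines, ...).
    static String[] splitOnWhitespace(String input) {
        return WHITESPACE.split(input.trim());
    }

    public static void main(String[] args) {
        // "\t" and "\n" are string escape sequences: one character each.
        String text = "a\tb\nc  d";
        for (String token : splitOnWhitespace(text)) {
            System.out.println(token);
        }
    }
}
```

Run as written, this prints a, b, c and d on separate lines.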

Related

Can someone explain regex this regex expression [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
Hi there, I'm new to Java and was going through some information on regex, and I couldn't understand the following expression:
"^[a-zA-Z\-]+$"
Could someone be kind enough to explain each and every character in this expression?
Thank you.
^ $ # Check if the entire string matches,
[ ]+ # with one or more of the following characters:
a-z # Any lowercase (ASCII) letter
A-Z # Any uppercase (ASCII) letter
\- # Or an "-" (the `\` is used to escape it)
Or in short: this regex checks if a given string consists solely of (ASCII) letters and/or -, and is non-empty.
Try it online.
[a-zA-Z] means any character from a through z or A through Z, inclusive.
The "\" inside the square brackets is an escape character, so the "-" after it is taken literally instead of as a range operator.
The "+" at the end means that the preceding character class must occur one or more times.
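To see the explanation in action, here is a small sketch. The class name LettersAndHyphens and the sample inputs are made up for illustration. Note that in a Java string literal the backslash itself has to be escaped, so the regex \- is written "\\-".

```java
import java.util.regex.Pattern;

public class LettersAndHyphens {
    // Regex seen by the engine: ^[a-zA-Z\-]+$
    static final Pattern LETTERS_OR_HYPHEN = Pattern.compile("^[a-zA-Z\\-]+$");

    // True if s is non-empty and consists only of ASCII letters and '-'.
    static boolean isValid(String s) {
        return LETTERS_OR_HYPHEN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Jean-Luc")); // true
        System.out.println(isValid("abc123"));   // false: digits are not in the class
        System.out.println(isValid(""));         // false: + requires at least one character
    }
}
```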

Regular expression to match a word starts and ends with a combination of special characters [duplicate]

This question already has answers here:
Java - Best way to grab ALL Strings between two Strings? (regex?)
(3 answers)
Closed 4 years ago.
I am trying to write a regular expression that gives me words which start with <!= and end with =>. For example, given the sentence what is your <!=name=>, the result should be name because it matches my pattern.
I have read that ^ means "starts with" and $ means "ends with", but I am not able to match a combination of special characters.
As noted in the comments, you can use <!=(\w+)=>. The exclamation mark and the equals sign are not word characters, so you can match them literally and capture the word characters between them. See: https://regex101.com/r/qDrobh/4
To allow several words separated by spaces, you can use: <!=((?:\w+| )*)=>
See: https://regex101.com/r/qDrobh/5
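In Java, the first pattern could be applied roughly as follows. This is only a sketch: the class name DelimitedWords and the helper extract are hypothetical, and capture group 1 holds the word between the delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimitedWords {
    // Regex seen by the engine: <!=(\w+)=>
    static final Pattern DELIMITED = Pattern.compile("<!=(\\w+)=>");

    // Collects every word that appears between "<!=" and "=>".
    static List<String> extract(String sentence) {
        List<String> words = new ArrayList<>();
        Matcher m = DELIMITED.matcher(sentence);
        while (m.find()) {
            words.add(m.group(1)); // group 1 is the part inside the parentheses
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extract("what is your <!=name=>")); // [name]
    }
}
```

Matcher.find() scans for the pattern anywhere in the input, which is why no ^ or $ anchors are needed here.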

how to separate a java string that is separated by "$"? [duplicate]

This question already has answers here:
illegal string body character after dollar sign
(5 answers)
Closed 8 years ago.
I am using Spock to test a Java app. It seems "$" is a special character in Groovy, and a Java string that is delimited by "$" can't be split properly in Groovy. Is there a workaround for this problem?
Update:
The "split" happens in Java code that I can't edit. It turns out that the Java code has the same problem as: Why can't I split a string with the dollar sign?
In Groovy, $ is special inside interpolating strings such as double-quoted GStrings, where it starts an interpolation; single-quoted strings treat it literally. Separately, $ is special in the string you pass to String#split, because that string is interpreted as a regular expression, and in a regular expression $ means "end of input" (or end of line, depending on flags).
If you're using String#split, to make it split on a literal $, you have to escape it with a backslash. To make the regex engine see a backslash, you have to escape the backslash in a string literal with another backslash.
Example:
'testing$one$two$three'.split('\\$').each {
println it
}
Output:
testing
one
two
three
Better yet, as suggested by Dónal, use tokenize:
Example:
'testing$one$two$three'.tokenize('$').each {
println it
}
(Same output)
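Since the update says the split happens in Java code, here is a sketch of the same fix on the Java side. The class name DollarSplit is made up. Pattern.quote is a standard library call that makes the delimiter literal, which has the same effect as writing "\\$" by hand.

```java
import java.util.regex.Pattern;

public class DollarSplit {
    // Splits on a literal '$'. Pattern.quote("$") builds a regex that
    // matches the dollar sign itself rather than "end of input".
    static String[] splitOnDollar(String input) {
        return input.split(Pattern.quote("$"));
    }

    public static void main(String[] args) {
        for (String part : splitOnDollar("testing$one$two$three")) {
            System.out.println(part);
        }
    }
}
```

This prints the same four lines as the Groovy examples above: testing, one, two, three.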

Check if string contains CJK (chinese) characters [duplicate]

This question already has answers here:
Use regular expression to match ANY Chinese character in utf-8 encoding
(7 answers)
Closed 9 years ago.
I need to check if a string contains Chinese characters.
After searching, I found that I have to use a regex with this pattern: \u31C0-\u31EF,
but I can't get the regex to work.
Has anyone dealt with this situation? Is the regex correct?
As discussed here, in Java 7 (i.e. regex compiler meets requirement RL1.2 Properties from UTS#18 Unicode Regular Expressions), you can use the following regex to match a Chinese (well, CJK) character:
\p{script=Han}
which Java also accepts in the abbreviated forms
\p{sc=Han}
\p{IsHan}
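A short sketch of how such a check might look in Java 7 or later. The class name HanCheck and the sample strings are illustrative assumptions; the sample characters are written as Unicode escapes so the source file stays ASCII.

```java
import java.util.regex.Pattern;

public class HanCheck {
    // Matches a single character whose Unicode script is Han.
    static final Pattern HAN = Pattern.compile("\\p{script=Han}");

    // True if the string contains at least one Han (CJK ideograph) character.
    static boolean containsHan(String s) {
        return HAN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsHan("hello \u4E16\u754C")); // true: U+4E16 and U+754C are Han
        System.out.println(containsHan("hello"));              // false
    }
}
```

Using find() instead of matches() answers "contains at least one", which is what the question asks for.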

Splitting on "," but not "\," [duplicate]

This question already has answers here:
How to split a comma separated String while ignoring escaped commas?
(6 answers)
Closed 9 years ago.
I'm looking for a regular expression to match , but ignore \, in Java's regex engine. This comes close:
[^\\],
However, it matches the previous character (in addition to the comma), which won't work.
Perhaps the regular expression approach is the wrong one altogether. I was intending to use String.split() to parse a simple CSV file (can't use an external library) with escaped commas.
You need a negative look-behind assertion here:
String[] arr = str.split("(?<![^\\\\]\\\\),");
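Note the four backslashes for each literal backslash. The regex engine needs \\ to match one literal backslash, and each of those two backslashes must in turn be escaped in the Java string literal, which gives \\\\. The lookbehind (?<![^\\]\\) rejects a comma that is preceded by a non-backslash character followed by a backslash, so \, is not treated as a separator.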
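A runnable sketch of the split above, with a hypothetical class name EscapedCommaSplit and a made-up input line. The split leaves the backslash in the field; removing it would be a separate step.

```java
public class EscapedCommaSplit {
    // Regex seen by the engine: (?<![^\\]\\),
    // A comma is a separator unless it follows "<non-backslash><backslash>".
    static String[] splitUnescaped(String line) {
        return line.split("(?<![^\\\\]\\\\),");
    }

    public static void main(String[] args) {
        // The Java literal "a,b\\,c,d" is the text: a,b\,c,d
        for (String field : splitUnescaped("a,b\\,c,d")) {
            System.out.println(field);
        }
    }
}
```

For this input the fields are a, then b\,c, then d.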
