Java Regular expression exclude special characters [duplicate] - java

This question already has answers here:
Regular expression for excluding special characters [closed]
(11 answers)
Closed 9 years ago.
I have to validate a String to check whether it contains a special character. The string may contain any numbers and words (including Unicode letters, e.g. à, â, ô) but should not accept any special characters (e.g. !, #, %, ^).
Sorry for my bad English.

This character class
[\p{L}\p{N}\p{Space}]
matches any character that Unicode classifies as a letter (\p{L}) or a number (\p{N}), plus whitespace (\p{Space}, which by default covers only the ASCII whitespace characters). Note that \p{No} would cover only the "other number" category, such as superscripts and fractions, not the ordinary decimal digits. To check that a whole string consists only of such characters, you would write the following:
input.matches("[\\p{L}\\p{N}\\p{Space}]+")
For future reference, all of this information comes from the documentation of the java.util.regex.Pattern class. Refer to that page for anything else you need to know about Java regular expressions.
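As a rough illustration of this approach, the sketch below wraps the check in a small helper. The class and method names (SpecialCharCheck, isClean) are invented for this example, and it assumes that all Unicode numbers (\p{N}) should be accepted and that an empty string should be rejected.

```java
import java.util.regex.Pattern;

class SpecialCharCheck {
    // Letters (\p{L}), numbers (\p{N}) and ASCII whitespace (\p{Space}); nothing else.
    private static final Pattern ALLOWED =
            Pattern.compile("[\\p{L}\\p{N}\\p{Space}]+");

    // Hypothetical helper: true when the whole input consists only of allowed characters.
    static boolean isClean(String input) {
        return ALLOWED.matcher(input).matches();
    }

    public static void main(String[] args) {
        // \u00e0 is the letter "a with grave accent", written as an escape to avoid encoding issues.
        System.out.println(isClean("\u00e0 la carte 42")); // true
        System.out.println(isClean("hello#world"));        // false
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which String.matches would otherwise do.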

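You can try [\\p{L}\\s]+ as an example. Used with find(), it returns the first run of letters and whitespace and stops at the first special character. Note that this class does not accept digits.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("[\\p{L}\\s]+");
Matcher m = p.matcher("hissd#");
if (m.find()) {
    System.out.println(m.group(0)); // prints "hissd"
}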

Related

Replacing Regular expression matches in Java [duplicate]

This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 4 years ago.
I want to replace &sp; in the string below with Z.
Input text : ABCD&sp;EF&p;GHIJ&bsp;KL
Output text : ABCDZEFZGHIZKL
Can anyone tell me how to replace every instance of &\D+; using a Java regular expression?
I am using /(&\D+;)?/ but it doesn't work.
Use String#replaceAll.
You should also add the ? modifier to + so that the quantifier becomes lazy and stops at the first ; it finds:
String str = "ABCD&sp;EF&p;GHIJ&bsp;KL";
String regex = "&\\D+?;";
System.out.println(str.replaceAll(regex, "Z")); // prints ABCDZEFZGHIJZKL
This should work.
Match the initial &, then all characters that are not the trailing ;, then that trailing ;, like so: &[^;]+; If not matching digits (as suggested by your example with \D) is a requirement, add the digits to the negated character set: [^;0-9]. Java has no global flag g (that belongs to other regex flavors such as JavaScript); String.replaceAll already replaces every occurrence. The site regexr.com is a handy tool for building regexes.
Edit: Sorry, I initially read your question wrong.
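The negated character class suggestion can be sketched in Java roughly as follows. The class and method names (EntityReplace, replaceEntities) are made up for illustration, and the sketch assumes the replacement text contains no $ or \ characters, which replaceAll would treat specially.

```java
class EntityReplace {
    // Replaces every "&...;" run that contains no ';' and no digit with the replacement.
    static String replaceEntities(String text, String replacement) {
        return text.replaceAll("&[^;0-9]+;", replacement);
    }

    public static void main(String[] args) {
        String input = "ABCD&sp;EF&p;GHIJ&bsp;KL";
        System.out.println(replaceEntities(input, "Z")); // ABCDZEFZGHIJZKL
    }
}
```

Because [^;0-9] can never consume a semicolon, the match always ends at the nearest ; without needing a lazy quantifier.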

Regular expression to match a word starts and ends with a combination of special characters [duplicate]

This question already has answers here:
Java - Best way to grab ALL Strings between two Strings? (regex?)
(3 answers)
Closed 4 years ago.
I am trying to write a regular expression that gives me the words which start with <!= and end with =>. For example, given the sentence what is your <!=name=>, the result should be name because it matches my pattern.
I have read that ^ is used for "starts with" and $ for "ends with", but I am not able to match a combination of special characters.
As noted in the comment, you can use <!=(\w+)=>. Because the exclamation mark and the equals sign are not part of the word-character class, you can simply match those characters literally and capture the word characters between them. See: https://regex101.com/r/qDrobh/4
For multiple words you can use: <!=((?:\w+| )*)=>
See: https://regex101.com/r/qDrobh/5
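In Java, the single-word pattern could be applied along these lines. This is only a sketch: the class and method names (MarkerExtract, extract) are hypothetical, and it collects every match in the input rather than just the first one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerExtract {
    // Group 1 captures the word characters between "<!=" and "=>".
    private static final Pattern MARKED = Pattern.compile("<!=(\\w+)=>");

    // Hypothetical helper: returns every captured word, in order of appearance.
    static List<String> extract(String sentence) {
        List<String> words = new ArrayList<>();
        Matcher m = MARKED.matcher(sentence);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extract("what is your <!=name=>")); // [name]
    }
}
```

No ^ or $ anchors are needed here, since the delimiters can appear anywhere in the sentence and find() scans for them.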

Escape Sequence vs. Whitespace Character (\s) [duplicate]

This question already has answers here:
Version difference? Regex Escape in Java
(2 answers)
Closed 1 year ago.
Are escape sequences and whitespace characters the same thing? I'm not sure what else to write here but Stackoverflow said the first sentence is not enough so I'm typing this second sentence for no reason at all but that so this post will go through.
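Java defines only a few escape sequences for string literals (such as \t and \n). Before Java 15, \s was not one of them; Java 15 added \s as an escape sequence for a single space. In regular expressions, \s is a predefined character class that matches any whitespace character, and inside a Java string literal it has to be written as "\\s".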
Check the following sections from the Java Tutorial:
Escape Sequences
Predefined Character Classes
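The difference can be seen in a small sketch like the one below, where "\t" is a string-literal escape handled by the compiler and "\\s" is a regex character class handled by the regex engine. The class and method names (EscapeVsRegex, isSingleWhitespace) are invented for this example.

```java
class EscapeVsRegex {
    // "\\s" in source becomes the two characters \s, which the regex engine
    // reads as the predefined whitespace class.
    static boolean isSingleWhitespace(String s) {
        return s.matches("\\s");
    }

    public static void main(String[] args) {
        String tab = "\t"; // escape sequence: the compiler turns this into one tab character
        System.out.println(tab.length());             // 1
        System.out.println(isSingleWhitespace(tab));  // true
        System.out.println(isSingleWhitespace("\n")); // true
        System.out.println(isSingleWhitespace("x"));  // false
    }
}
```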

Check if string contains CJK (chinese) characters [duplicate]

This question already has answers here:
Use regular expression to match ANY Chinese character in utf-8 encoding
(7 answers)
Closed 9 years ago.
I need to check if a string contains Chinese characters.
After searching, I found that I have to use a regex on the range \u31C0-\u31EF,
but I can't get the regex to work.
Has anyone dealt with this situation? Is the regex correct?
As discussed here, in Java 7 and later (where the regex compiler meets requirement RL1.2 Properties of UTS#18 Unicode Regular Expressions), you can use the following regex to match a Chinese (well, CJK) character:
\p{script=Han}
which can be abbreviated to
\p{IsHan}
Java's short form for a script takes the Is prefix; java.util.regex.Pattern does not accept a bare \p{Han}.
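A minimal sketch of such a check is shown below. It uses the \p{IsHan} spelling documented for java.util.regex.Pattern (the documented long form \p{script=Han} should behave the same), requires Java 7 or later, and the class and method names (HanCheck, containsHan) are hypothetical.

```java
import java.util.regex.Pattern;

class HanCheck {
    // Matches a single character whose Unicode script is Han.
    private static final Pattern HAN = Pattern.compile("\\p{IsHan}");

    // Hypothetical helper: true if at least one Han character occurs anywhere in the text.
    static boolean containsHan(String text) {
        return HAN.matcher(text).find();
    }

    public static void main(String[] args) {
        // \u4e2d\u6587 are two Han characters, written as escapes to avoid encoding issues.
        System.out.println(containsHan("abc \u4e2d\u6587")); // true
        System.out.println(containsHan("abc"));              // false
    }
}
```

Using find() rather than matches() answers the "contains" question, because matches() would require the entire string to be a single Han character.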

Splitting on "," but not "\," [duplicate]

This question already has answers here:
How to split a comma separated String while ignoring escaped commas?
(6 answers)
Closed 9 years ago.
I'm looking for a regular expression to match , but ignore \, in Java's regex engine. This comes close:
[^\\],
However, it matches the previous character (in addition to the comma), which won't work.
Perhaps the regular expression approach is the wrong one altogether. I was intending to use String.split() to parse a simple CSV file (can't use an external library) with escaped commas.
You need a negative look-behind assertion here:
String[] arr = str.split("(?<![^\\\\]\\\\),");
Note that you need four backslashes there to match a single literal backslash: the regex engine needs \\ to match one backslash, and each of those two backslashes must itself be escaped in the Java string literal.
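To see the look-behind in action, here is a small sketch that applies the same regex to one line. The class and method names (EscapedCommaSplit, splitFields) are made up for this example, and the escaped commas are left in the fields as \, rather than being unescaped.

```java
import java.util.Arrays;

class EscapedCommaSplit {
    // Splits on commas unless the comma is preceded by a backslash
    // that itself follows a non-backslash character.
    static String[] splitFields(String line) {
        return line.split("(?<![^\\\\]\\\\),");
    }

    public static void main(String[] args) {
        // The Java literal "a,b\\,c,d" is the text a,b\,c,d
        String[] fields = splitFields("a,b\\,c,d");
        System.out.println(Arrays.toString(fields)); // [a, b\,c, d]
    }
}
```

Because the look-behind is zero-width, the character before each comma stays in its field instead of being consumed by the split.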
