Why regular expression in Java cannot recognize \s as space character? - java

I have read on a lot of web pages (for example: http://www.wellho.net/regex/java.html) that \s represents any whitespace character. But when I use \s in Java, it is not accepted as a valid expression.
Does anyone know the reason?

Backslashes inside string literals need to be escaped (doubled) in order to work.
For example, the following works fine:
public class testprog {
public static void main(String args[]) {
String s = "Hello there";
System.out.println (s.matches(".*\\s.*"));
}
}
outputting:
true
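If you use a string like "\s", a compiler for Java 14 or earlier reports an error along the lines of:
Invalid escape sequence - valid ones are \b \t \n \f \r \" \' \\
because \s is not a valid escape sequence there (for strings, I mean, not regexes). Since Java 15, \s in a string literal is accepted and means a single space, so it still does not reach the regex engine as the \s character class.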
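To make the two layers of escaping concrete, here is a minimal sketch. The class name WhitespaceEscapeDemo and the helper containsWhitespace are made up for illustration. The source literal "\\s" becomes the two-character string \s at run time, and that is what the regex engine reads as "any whitespace character".

```java
import java.util.regex.Pattern;

class WhitespaceEscapeDemo {
    // "\\s" in source code is the two-character string \s at run time.
    static final String WHITESPACE_REGEX = "\\s";

    // Returns true if the text contains at least one whitespace character.
    static boolean containsWhitespace(String text) {
        return Pattern.compile(WHITESPACE_REGEX).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(WHITESPACE_REGEX.length());          // 2
        System.out.println(containsWhitespace("Hello there"));  // true
        System.out.println(containsWhitespace("Hello\tthere")); // true
        System.out.println(containsWhitespace("HelloThere"));   // false
    }
}
```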

Related

Check whether the string contains backslash or not?

In Java, \' denotes a single quotation mark (single quote) character, and \" denotes a double quotation mark (double quote) character.
So, String s = "I\'m a human."; works well.
However, String s = "I'm a human."; compiles without errors, too.
Likewise, char c = '\"'; works, but char c = '"'; also works.
But I need to detect whether a string contains a backslash or not:
"abcd'" does not contain a backslash
"abcd\'" contains a backslash.
How can I distinguish the two?
You can't. They're called escape sequences for a reason. For example, once \n is put in a String, there is no literal \ left to match. It's gone. All that's left is a newline.
Remember that \ is used to escape a character. It does not itself remain part of the String.
However, you can check for a literal \ by doing a simple contains like
String s = "abcd\\";
System.out.println(s.contains("\\"));
"abcd\" is not a valid string in Java.
Here Java treats \" as the escape sequence for the double-quote character ("), so the literal is never terminated. If you want to put a backslash in a string, you need to escape it with another backslash (\\).
The string "abcd\'" does not contain a backslash character. It contains the escape sequence \', which stands for a single quote.
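The point that the escape is resolved at compile time can be checked directly. The following is a small sketch; the class name EscapeSequenceDemo and the helper hasBackslash are hypothetical names chosen for this example.

```java
class EscapeSequenceDemo {
    // Returns true only if the run-time string holds a real backslash character.
    static boolean hasBackslash(String s) {
        return s.indexOf('\\') >= 0;
    }

    public static void main(String[] args) {
        String plain = "abcd'";        // 5 characters
        String escaped = "abcd\'";     // also 5 characters: \' compiles to '
        String withSlash = "abcd\\'";  // 6 characters: a b c d \ '

        System.out.println(plain.equals(escaped));    // true
        System.out.println(hasBackslash(escaped));    // false
        System.out.println(withSlash.length());       // 6
        System.out.println(hasBackslash(withSlash));  // true
    }
}
```

Because "abcd'" and "abcd\'" are the same string once compiled, no run-time check can tell which spelling was used in the source.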
Escape characters (also called escape sequences or escape codes) in
general are used to signal an alternative interpretation of a series
of characters. In Java, a character preceded by a backslash (\) is an
escape sequence and has special meaning to the java compiler.
When an escape sequence is encountered in a print statement, the
compiler interprets it accordingly. For example, if you want to put
quotes within quotes you must use the escape sequence, \", on the
interior quotes. To print the sentence: She said "Hello!" to me. you
should write:
System.out.println("She said \"Hello!\" to me.");
// Java program that finds a character
// in a string.
public class BackslashIndex {
public static void main (String[] args)
{
// The string to search; "gee\\k" holds the five
// characters g e e \ k.
String str = "gee\\k";
// Returns the index of the first occurrence of the character.
int firstIndex = str.indexOf('\\');
System.out.println("First occurrence of char '\\'" +
" is found at : " + firstIndex);
}
}
if(string.contains("\\")){
//TODO do your code here
}
\ introduces an escape sequence in Java.
If you want a backslash in the string, you have to write it as "abcd\\".
For your example it would be:
boolean containsBs = "abcd\\".contains("\\");
When you are using Strings, you do not need to escape single quotation marks with the escape character (backslash). Likewise, when using a char, you do not need to escape the double quotation mark.
Strings use double quotation marks while chars use single quotation marks. You need the escape character for a double quote in a String and for a single quote in a char.
String ex="I'm an example";
String ex2="My name is \"example\"";
char c='"';
char c2='\'';
To find out whether a String contains a backslash:
String ex="abcd";
String ex2="abcd\\";
ex.contains("\\"); //false
ex2.contains("\\"); //true
In "\\", the first backslash is the escape and the second is the character itself.

string validation over regular expressions in java

How can I validate a given string against this regular expression (an XSD pattern)?
xsd pattern:'([a-zA-Z0-9.,;:'+-/()?*[]{}\`´~ ]|[!"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*'
I need to check whether the string matches the above pattern.
I have tried the code below, but I get an unsupported escape characters error when compiling:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PatternMatching {
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9\.,;:'\+\-/\(\)?\*\[\]\{\}\\`´~ ]|[!"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
public static boolean validateUserName(String userName){
Matcher mtch = usrNamePtrn.matcher(userName);
if(mtch.matches()){
return true;
}
return false;
}
public static void main(String a[]){
System.out.println("Is a valid username?"+validateUserName("stephen & john"));
}
}
How can I do this? In addition, if the string doesn't match the pattern, the offending characters need to be displayed. I am using Java 1.6. Any suggestions are appreciated.
First, the regular expression itself has three mistakes.
Mistake 1:
A backslash is a special character which is used to escape whatever character follows it. Therefore, the sequence
\`
is either identical to a single back-quote, or, depending on the regular expression engine, is an illegal escape sequence. Either way, if the intent was to match a backslash along with all the other characters, it should be written as:
\\`
Mistake 2:
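Inside the […] character grouping, a ] must be escaped so it doesn’t signify the end of the grouping. In Java, an unescaped [ inside a grouping also opens a nested character class, so it should be escaped as well. So, [] needs to be written as \[\].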
Mistake 3:
Inside the […] character grouping, a - indicates a character range, like a-z. The regular expression [+-/] does not mean “plus or hyphen or slash”; it means “any of the characters between plus and slash, inclusive.” Technically, this mistake doesn’t affect the outcome in this particular case, because +-/ is equivalent to those three literal characters plus the comma and period, which both happen to occur earlier in the character grouping anyway. But, in the interest of saying what you mean, the - should be escaped:
+\-/
Second is the matter of turning the regular expression into a Java string.
The backslash and the double-quote are special characters in Java. Obviously, " denotes the start and end of a String literal, so if you want a " inside a String, you must escape it:
\"
This is not related to regular expressions; this just tells the compiler that the String contains a double-quote character. It will be compiled into a single " and that is what the regular expression engine will see.
Finally, there is the matter of backslashes. It just so happens that, while regular expressions use a backslash to escape characters as described above, Java also uses backslashes to escape characters in strings. This means that if you want a literal backslash in a Java String, it must be written in the code as two backslashes:
String s = "\\"; // a String of length 1
Recall from above that we need a regular expression with consecutive backslash characters:
\\`
A Java string containing those three characters would look like this:
String s = "\\\\`"; // a String of length 3
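The length claims above are easy to confirm. This is only an illustrative sketch, and the class name BackslashLayersDemo is made up for it.

```java
class BackslashLayersDemo {
    // Four backslashes in source -> three run-time characters: \ \ `
    // The regex engine reads them as "a literal backslash, then a back-quote".
    static final String REGEX = "\\\\`";
    // Two backslashes in source -> two run-time characters: \ `
    static final String TEXT = "\\`";

    public static void main(String[] args) {
        System.out.println(REGEX.length());       // 3
        System.out.println(TEXT.length());        // 2
        System.out.println(TEXT.matches(REGEX));  // true
    }
}
```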
A regular expression allows a backslash almost anywhere; for instance, \% is the same as %. However, Java only allows specific characters to be preceded by a single backslash. \+ is not one of those permitted sequences.
+, (, ), {, and } are not special characters inside a […] grouping, so there is no need to escape them anyway.
So, your code needs to be changed from this:
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9\.,;:'\+\-/\(\)?\*\[\]\{\}\\`´~ ]|[!"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
to this:
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9.,;:'+\\-/()?*\\[\\]{}\\\\`´~ ]|[!\"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
This is because " is a special character in Java.
You'll have to replace " with its escape sequence, \", and \ with \\. Note the change in the pattern below:
private static Pattern usrNamePtrn = Pattern.compile("([a-zA-Z0-9.,;:'+-/()?*[]{}\\`´~ ]|[!\"#%&<>÷=#_$£]|[àáâäçèéêëìíîïñòóôöùúûüýßÀÁÂÄÇÈÉÊËÌÍÎÏÒÓÔÖÙÚÛÜÑ])*");
Also, note that this only fixes the compile errors. You still need to re-check the regex itself to see whether it works as intended.
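The question also asked how to display the characters that fail validation. One possible approach, sketched here and not taken from either answer, is to test each character on its own against a single-character class and collect the rejects. The class name InvalidCharReporter and the method invalidChars are hypothetical, and ALLOWED_CHAR is an abbreviated, ASCII-only stand-in for the full pattern; the remaining characters would be added to the class in the same way. The code avoids features newer than Java 1.6.

```java
import java.util.regex.Pattern;

class InvalidCharReporter {
    // Abbreviated, ASCII-only stand-in for the corrected character classes
    // (an assumption for this sketch, not the complete XSD pattern).
    static final Pattern ALLOWED_CHAR =
            Pattern.compile("[a-zA-Z0-9.,;:'+\\-/()?*\\[\\]{}\\\\`~ !\"#%&<>=_$]");

    // Returns every character of input that the single-character pattern
    // rejects, in order of appearance (duplicates are kept).
    static String invalidChars(Pattern allowedChar, String input) {
        StringBuilder rejected = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            String ch = String.valueOf(input.charAt(i));
            if (!allowedChar.matcher(ch).matches()) {
                rejected.append(ch);
            }
        }
        return rejected.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + invalidChars(ALLOWED_CHAR, "stephen & john") + "]");  // []
        System.out.println("[" + invalidChars(ALLOWED_CHAR, "stephen @ john^") + "]"); // [@^]
    }
}
```

An empty result means the whole string is valid, so the same helper can replace the boolean check. Note that "stephen & john" passes, because both the space and & are in the allowed set.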

Java Regex does not match newline

My code is as follows:
public class Test {
static String REGEX = ".*([ |\t|\r\n|\r|\n]).*";
static String st = "abcd\r\nefgh";
public static void main(String args[]){
System.out.println(st.matches(REGEX));
}
}
The code outputs false. In all other cases it matches as expected, but I can't figure out what the problem is here.
You need to remove the character class.
static String REGEX = ".*( |\t|\r\n|\r|\n).*";
You can't put \r\n inside a character class as a unit. If you do, \r and \n are treated as two separate items, so the class matches either \r or \n, one character at a time. You already know that .* won't match any line breaks, so the first .* matches "abcd" and the character class matches a single character, i.e. \r. The following character is \n, which the final .* cannot match, so the regex fails.
UPDATE:
Based on your comments, you need something like this:
.*(?:[ \r\n\t].*)+
EXPLANATION:
In plain words, it is a regex that matches a line, then 1 or more lines. Or, just a multiline text.
.* - 0 or more characters other than a newline
(?:[ \r\n\t].*)+ - a non-capturing group that matches 1 or more times a sequence of
[ \r\n\t] - either a space, or a \r or \n or \t
.* - 0 or more characters other than a newline
Original answer
You can fix your pattern 2 ways:
String REGEX = ".*(?:\r\n|[ \t\r\n]).*";
This way we match either \r\n sequence, or any character in the character class.
Or, since the character class only matches 1 character, we can add + after it to match 1 or more:
String REGEX = ".*[ \t\r\n]+.*";
Note that it is not a good idea to use single characters in alternations, it decreases performance.
Also note that capturing groups should not be overused. If you do not plan to use the contents of the group, use non-capturing groups ((?:...)), or remove them.
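The behaviour of the patterns discussed in this thread can be compared side by side. This is a minimal sketch; the class name LineBreakRegexDemo and the constant names are made up for illustration, while the patterns and the input are the ones quoted above.

```java
class LineBreakRegexDemo {
    static final String INPUT = "abcd\r\nefgh";

    // Pattern from the question: the class consumes only one of \r or \n.
    static final String CHAR_CLASS_ONLY = ".*([ |\t|\r\n|\r|\n]).*";
    // Alternation: \r\n is tried first as a two-character unit.
    static final String ALTERNATION = ".*(?:\r\n|[ \t\r\n]).*";
    // Quantified class: + lets the class consume \r and \n together.
    static final String QUANTIFIED_CLASS = ".*[ \t\r\n]+.*";

    public static void main(String[] args) {
        System.out.println(INPUT.matches(CHAR_CLASS_ONLY));   // false
        System.out.println(INPUT.matches(ALTERNATION));       // true
        System.out.println(INPUT.matches(QUANTIFIED_CLASS));  // true
    }
}
```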

how to append a backquote in java string

Can someone please show me how to append a backquote to a string in Java?
I tried with String result="`"+value+"`";
I need to append the backquote before and after the string.
What is the correct way?
You should escape " using \"
Eg: "\""+value+"\"";
Edit: the OP wants to use a backquote (`), so the answer is the following, with no need to escape.
"`"+value+"`"
When we use literal strings in Java, we use the quote (") character to indicate the beginning and ending of a string. For example, to declare a string called myString, we could write:
String myString = "this is a string";
But what if we wanted to include a quote (") character WITHIN the string. We can use the \ character to indicate that we want to include a special character, and that the next character should be treated differently. \" indicates a quote character, not the termination of a string.
public static void main (String args[])
{
System.out.println ("If you need to 'quote' in Java");
System.out.println ("you can use single \' or double \" quote");
}
This allows us to include quote characters within a string.

String cannot contain java regex metacharacter

The string should not contain Java regex metacharacters such as \d, \D, and so on.
How can we handle this programmatically? It can contain *, ., ?, and +, but not the Java regex metacharacters.
public static void main(String[] args) throws Exception {
String s="\\d\\D";
if(s.contains("\\d")||s.contains("\\D")||s.contains("\\w"))
{
System.out.println("Should Not Contain");
}
}
Something like this.
Output: Should Not Contain
Maybe you are looking for Pattern#quote. It returns a String with escaped regex special character for use as literal.
You can also use Matcher#quoteReplacement to escape special characters in a String that is intended to be used as replacement.
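As a rough illustration of both methods, the sketch below treats a user-supplied string as a literal. Pattern.quote and Matcher.quoteReplacement are the standard library methods named above; the class QuoteDemo and its two helpers are hypothetical names for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Searches for userText as a literal, not as a regex.
    static boolean containsLiteral(String haystack, String userText) {
        return Pattern.compile(Pattern.quote(userText)).matcher(haystack).find();
    }

    // Replaces every digit with the replacement text taken literally.
    static String replaceDigits(String input, String replacement) {
        return input.replaceAll("\\d", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "\\d" is the two characters \d; quoted, it no longer means "a digit".
        System.out.println(containsLiteral("abc123", "\\d"));  // false
        System.out.println(containsLiteral("a\\db", "\\d"));   // true
        // Unquoted, "$" would be read as a group reference.
        System.out.println(replaceDigits("a1b2", "$"));        // a$b$
    }
}
```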
Are you looking for escape characters in Java so you can write \d and \D in a string ("\\d" and "\\D")?
See also:
http://docs.oracle.com/javase/tutorial/java/data/characters.html
What are all the escape characters in Java?
