Java regular expression: how to include '-' - java

I am using this pattern and matching a string.
String s = "//name:value /name:value";
if (s.matches("(//?\\s*\\w+:\\w+\\s*)+")) {
// it fits
}
This works properly.
But if I want to have a string like "/name-or-address:value/name-or-address:value" which has this '-' in second part, it doesn't work.
I am using \w to match [A-Za-z0-9_], but how can I include - in that?

Use [\w-] to combine both \w and -.
Note that an unescaped - should be placed at the beginning or end of a character class; otherwise it is interpreted as defining a range of characters (for instance, [a-z] is the range of characters from a to z, whereas [az-] is the three characters a, z, and -).
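As a minimal sketch of this suggestion, the question's pattern can be kept as it is, with only the \w+ before the colon widened to [\w-]+. The class and method names below are made up for illustration.

```java
import java.util.regex.Pattern;

class HyphenInNameDemo {
    // Same pattern as in the question, except that the name part uses [\w-]+
    // so that hyphens are accepted. The hyphen is last in the class, so it is literal.
    static final Pattern NAME_VALUE = Pattern.compile("(//?\\s*[\\w-]+:\\w+\\s*)+");

    // Hypothetical helper: true if the whole string is a list of /name:value pairs.
    static boolean isNameValueList(String s) {
        return NAME_VALUE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNameValueList("//name:value /name:value"));
        System.out.println(isNameValueList("/name-or-address:value/name-or-address:value"));
        System.out.println(isNameValueList("/name or address:value"));
    }
}
```

If the value part may also contain hyphens, the \w+ after the colon would need the same change.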

I don't know if it answers your question, but why not replace \w+ with (\w|-)+ or [\w-]+?

[-\w] (Or in a string, [-\\w].)

How about
if (s.matches("/(/|\\w|-|:\\w)+")) {

Related

RegExp pattern for a String which contain 0 and 4-9(4 ,5,6,7,8,9)

I am dealing with a string. The use case is that I don't want a string that contains 0 or any digit from 4 to 9.
Example:
ABC0123 -> Not valid
XYZ002456789 -> Not valid
ABC123 -> Valid
ABC1 -> Valid
I have tried the pattern below, but without success.
String pattern = "^[0,4-9]+$";
if(str.matches(pattern)){
//do something.
}
First, remove the comma from the character class. You're not looking for commas.
Since you're disallowing, don't anchor the expression, allow the match anywhere in the string. In fact, matches anchors the expression for you, so we have to intentionally allow characters before and after the disallowed character class:
String pattern = ".*[04-9].*";
if(str.matches(pattern)){
// disallow
}
Alternatively, you can avoid the leading and trailing .* by compiling the regex with Pattern.compile and calling find on the resulting Matcher instead of using matches, since find does not anchor the pattern the way matches does.
It is much easier to match strings that contain 0 or 4-9 than to match those that don't. So you should just write a regex like this:
[4-90]
And call find, then invert the result:
if (!Pattern.compile("[4-90]").matcher(someString).find()) {
// ...
}
Another option could be to use a negated character class and add what you don't want to match. In this case you could add 0 and a range from 4-9 and if you don't want to match a carriage return or a newline you could add those as well.
^[^04-9\r\n]+$
Note that a comma inside the character class is matched literally, so [0,4-9] would also match a comma.
String pattern = "^[^04-9\\r\\n]+$";
if(str.matches(pattern)){
//do something.
}
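To compare the three approaches from the answers above, here is a small sketch that runs them on the sample inputs from the question. The class and method names are made up for illustration; the patterns are the ones given in the answers.

```java
import java.util.regex.Pattern;

class ForbiddenDigitsDemo {
    // Character class for the digits that must not appear: 0 and 4 through 9.
    private static final Pattern FORBIDDEN = Pattern.compile("[04-9]");

    // Approach 1: matches() anchors implicitly, so .* is needed on both sides.
    static boolean hasForbiddenViaMatches(String s) {
        return s.matches(".*[04-9].*");
    }

    // Approach 2: find() looks for the class anywhere in the string.
    static boolean hasForbiddenViaFind(String s) {
        return FORBIDDEN.matcher(s).find();
    }

    // Approach 3: negated class; every character must be something other than
    // 0, 4-9, CR or LF. The + means the empty string is rejected.
    static boolean isValidViaNegatedClass(String s) {
        return s.matches("^[^04-9\\r\\n]+$");
    }

    public static void main(String[] args) {
        String[] samples = { "ABC0123", "XYZ002456789", "ABC123", "ABC1" };
        for (String s : samples) {
            System.out.println(s
                + " forbidden(matches)=" + hasForbiddenViaMatches(s)
                + " forbidden(find)=" + hasForbiddenViaFind(s)
                + " valid(negated)=" + isValidViaNegatedClass(s));
        }
    }
}
```

One difference worth knowing: for the empty string, the first two approaches report no forbidden digit, while the negated-class version rejects it because of the + quantifier.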

Validating a mathematical expression in java

I am trying to validate if the string "expression" as in the code below is a formula.
String expression = request.getParameter(FORMULA);
if (!Pattern.matches("[a-zA-Z0-9+-*/()]", expression)) {
    return new AjaxMessage(AjaxMessage.ResponseStatusEnum.FAILURE, getJsonString(, "Manager.invalid.formula" , null));
}
Examples of values for expression are {a+b/2, (a+b)*2, (john-Max), etc.}, just for context (the variable names in the formula might vary, and the arithmetic expression contains only the special characters +, -, /, *, ( and )). As you can see, I tried to validate using regex (I'm new to regex), but I think it's not possible, as I don't know the length of the variable names.
Is there a way to achieve a validation using regex or any other library in java?
Thanks in advance.
The reason is that your character class contains characters with special meaning in regex (here the unescaped - between + and *). You need to escape those characters. I have just modified your regex to make it work.
Code:
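import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class FormulaCharacterCheck {
    public static void main(String[] args) {
        List<String> expressions = new ArrayList<String>();
        expressions.add("a+b/2");
        expressions.add("(a+b)*2");
        expressions.add("john-Max");
        expressions.add("etc[");
        for (String expression : expressions) {
            // Every character must come from the allowed set; * also accepts the empty string.
            if (!Pattern.matches("[a-zA-Z0-9\\+\\-\\*/\\(\\)]*", expression)) {
                System.out.println("NOT match");
            } else {
                System.out.println("MATCH");
            }
        }
    }
}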
OUTPUT:
MATCH
MATCH
MATCH
NOT match
You're using a special character in your regex; you need to escape it with \.
It should look like [a-zA-Z0-9+\\-*/()]. This only tests a single character, so you need to add a * at the end to test multiple characters.
Edit (thanks Toto): because [] tests a single character, it's called a character class (nothing to do with a Java class), and only the - is considered special here. For a regex without the brackets, you would need to escape the other special characters.
Special characters have a special meaning in regex and won't be interpreted as the literal character (for example, parentheses are used to make groups, * means 0 or more of the previous element, etc.).
About character class: https://docs.oracle.com/javase/tutorial/essential/regex/char_classes.html
More info:
http://www.regular-expressions.info/characters.html and
http://www.regular-expressions.info/refcharacters.html
I use this site to test my regexes (note that regex engines may differ!):
https://regex101.com/
As said in the comments, a mathematical expression is more than just a set of allowed characters, so if you want to validate it, you'll have to do more manual checking.
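As one hedged illustration of such "manual checking", the character test can be combined with a simple parenthesis balance check. This is only a sketch under the assumption that the allowed characters are the ones from the question; it is not a full expression parser, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class FormulaCheckDemo {
    // Character whitelist from the question; the hyphen is escaped so it is
    // not read as a range between + and *.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9+\\-*/()]+");

    // True if every ')' closes an earlier '(' and nothing is left open.
    static boolean hasBalancedParentheses(String s) {
        int depth = 0;
        for (char c : s.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // closing parenthesis without an opening one
                }
            }
        }
        return depth == 0;
    }

    // Hypothetical combined check: allowed characters only, and balanced parentheses.
    static boolean looksLikeFormula(String s) {
        return ALLOWED.matcher(s).matches() && hasBalancedParentheses(s);
    }

    public static void main(String[] args) {
        String[] samples = { "a+b/2", "(a+b)*2", "(john-Max)", "etc[", "(a+b" };
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeFormula(s));
        }
    }
}
```

This still accepts malformed input such as a++b or +*/, so a real validator would need a proper parser or an expression-evaluation library.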

Why does string.matches("[+-*/]") throw a PatternSyntaxException?

I have this code:
public static void main(String[] args) {
String et1 = "test";
String et2 = "test";
et1.matches("[-+*/]"); //works fine
et2.matches("[+-*/]"); //java.util.regex.PatternSyntaxException, why?
}
Is it because '-' is an escape character? But then why does it work fine when '-' is swapped with '+'?
It is because - is used to define a range of characters in a character class. In [+-*/], +-* is read as the range from + to *, but + (code 43) comes after * (code 42) in the ASCII table, so the range is reversed and you get an error.
To have a literal - in the middle of a character class, you must escape it. There is no problem if the - is at the beginning or at the end of the class, because there it is unambiguous.
Another situation where you don't need to escape the - is when it directly follows a character class shorthand, for example:
[\\d-abc]
(Other regex engines such as PCRE allow the same when the shorthand is placed after the hyphen, as in [abc-\d], but Java doesn't seem to allow this.)
- inside a character class (the [xxx]) is used to define a range, for example [a-z] for all lowercase letters. If you want it to actually mean "dash", it has to be in the first or last position. I generally place it first to avoid any confusion.
Alternatively you can escape it: [+\\-*/].
Just FYI, the Java regular expression meta characters are defined here:
The metacharacters supported by this API are: <([{\^-=$!|]})?*+.>
As a general rule, to save myself from regexp debugging headaches, if I want to use any of these characters as a literal then I precede them with a \ (Or \\ inside of a Java String expression).
Either:
et2.matches("[\\+\\-\\*/]");
Or:
et2.matches("[\\-\\+\\*/]");
Will work regardless of order.
I think you should use [\-\+\*/] (written "[\\-\\+\\*/]" in a Java string),
because an unescaped '-' defines a range; for example, [a-d] means a, b, c, d.
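The behaviour described in these answers can be checked with a short sketch. The helper compiles is hypothetical and simply reports whether Pattern.compile accepts a regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class HyphenPositionDemo {
    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hyphen in the middle (reversed range), first, last, and escaped.
        String[] candidates = { "[+-*/]", "[-+*/]", "[+*/-]", "[+\\-*/]" };
        for (String regex : candidates) {
            System.out.println(regex + " compiles: " + compiles(regex));
        }
        // A hyphen between * and + forms the valid range *..+, so the class
        // compiles but no longer contains the hyphen itself.
        System.out.println("-".matches("[*-+/]"));
    }
}
```

The last line is the case to watch out for: a misplaced hyphen does not always throw. When it happens to form a valid range, the pattern compiles and silently matches a different set of characters.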

regex help in java

I'm trying to compare following strings with regex:
#[xyz="1","2"'"4"] ------- valid
#[xyz] ------------- valid
#[xyz="a5","4r"'"8dsa"] -- valid
#[xyz="asd"] -- invalid
#[xyz"asd"] --- invalid
#[xyz="8s"'"4"] - invalid
The valid pattern should be:
The valid pattern is: #[xyz, then an = sign, then some characters, then a comma, then some characters, then ', then some characters, and finally ]. This means that if there are characters after xyz, they must be in the format ="XXX","XXX"'"XXX".
Or only #[xyz], with no characters after xyz.
I have tried the following regex, but it did not work:
String regex = "#[xyz=\"[a-zA-z][0-9]\",\"[a-zA-z][0-9]\"'\"[a-zA-z][0-9]\"]";
Here the quotation marks (in the part after xyz) are optional, the number of characters between the quotes is not fixed, and there could also be some characters before and after this pattern, like asdadad #[xyz] adadad.
You can use the regex:
#\[xyz(?:="[a-zA-Z0-9]+","[a-zA-Z0-9]+"'"[a-zA-Z0-9]+")?\]
Expressed as Java string it'll be:
String regex = "#\\[xyz(?:=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]";
What was wrong with your regex?
[...] defines a character class. When you want to match a literal [ or ], you need to escape it by preceding it with a \.
[a-zA-z][0-9] matches a single letter followed by a single digit. But you want one or more alphanumeric characters, so you need [a-zA-Z0-9]+.
Use this:
String regex = "#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]";
When you write [a-zA-z][0-9], it expects one letter followed by one digit. You also have to escape the first and last square brackets, because square brackets have a special meaning in regexes.
Explanation:
[a-zA-Z0-9]+ means an alphanumeric character (but not an underscore) one or more times. Note the upper-case Z: the range A-z used in the question also covers [, \, ], ^, _ and `.
(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")? means that the expression in parentheses can occur once or not at all.
Square brackets have a special meaning in regex (you used them yourself to define character classes), so you need to escape them if you want to match them literally.
String regex = "#\\[xyz=\"[a-zA-z][0-9]\",\"[a-zA-z][0-9]\"'\"[a-zA-z][0-9]\"\\]";
The next problem is that with [a-zA-z][0-9] you define "first a letter, then a digit". You need to join those classes and add a quantifier:
String regex = "#\\[xyz=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\"\\]";
there could also be some characters before and after this pattern like
asdadad #[xyz] adadad.
Regex should be:
String regex = "(.)*#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\](.)*";
The first and last (.)* allow any text before and after the pattern, as you mentioned in your edit. As @ademiban said, (=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")? can occur once or not at all. The other mistakes are well explained in the other answers; +1 to all of them.
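As an alternative sketch to wrapping the pattern in (.)*, the tag can be searched for with Matcher.find(), which does not require the whole input to match. The class and method names are made up for illustration, the character class is written with A-Z, and the sample strings are the ones from the question.

```java
import java.util.regex.Pattern;

class XyzTagDemo {
    // #[xyz] optionally followed (before the closing bracket) by ="..","..."'"..."
    private static final Pattern TAG = Pattern.compile(
        "#\\[xyz(?:=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]");

    // Hypothetical helper: true if a well-formed tag occurs anywhere in the input.
    static boolean containsTag(String s) {
        return TAG.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "#[xyz=\"1\",\"2\"'\"4\"]",
            "#[xyz]",
            "#[xyz=\"a5\",\"4r\"'\"8dsa\"]",
            "#[xyz=\"asd\"]",
            "#[xyz\"asd\"]",
            "#[xyz=\"8s\"'\"4\"]",
            "asdadad #[xyz] adadad"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + containsTag(s));
        }
    }
}
```

Note that this version still requires the quotation marks; making them optional, as the question mentions, would need "? in place of each ".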

Match "_<digit>string" with a regular expression

I have a list of strings like
xxx_2pathway
xxx_6pathway
xxx_pathway
So I have a string followed by an underscore and "pathway". There may be a digit between the underscore and "pathway". How can I match and replace everything except xxx with a regular expression in Java?
This does not work:
pathnameRaw = pathnameRaw.replace("_\\dpathway","");
Your regex is almost fine. Since the digit is optional, add a ? after \\d.
Also, the replace method does not use regex; it replaces the literal text. Use replaceAll instead.
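A minimal sketch of both points, using the sample strings from the question (the class and method names are made up for illustration):

```java
class PathwaySuffixDemo {
    // Removes "_pathway" or "_<digit>pathway" using a regex-based replacement.
    static String stripPathwaySuffix(String name) {
        return name.replaceAll("_\\d?pathway", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPathwaySuffix("xxx_2pathway"));
        System.out.println(stripPathwaySuffix("xxx_6pathway"));
        System.out.println(stripPathwaySuffix("xxx_pathway"));
        // replace() treats its first argument as literal text, so the
        // regex from the question never matches and nothing changes:
        System.out.println("xxx_2pathway".replace("_\\dpathway", ""));
    }
}
```

If the suffix should only be removed at the end of the string, appending $ to the pattern is one option.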
"_[0-9]?pathway"
