I am trying to create a Regex pattern for <String>-<String>. This is my current pattern:
(\w+\-\w+).
The first String is not allowed to be "W". However, it can still contain "W"s if it's more than one letter long.
For example:
W-80 -> invalid
W42-80 -> valid
How can this be achieved?
So your first string can be either a single character other than W, or two or more characters. A simple pattern to achieve that is:
([^W]|\w{2,})-\w+
But this pattern is not entirely correct, because it now allows any character in the first part, whereas originally only \w characters were allowed. So the correct pattern is:
([\w&&[^W]]|\w{2,})-\w+
The pattern [\w&&[^W]] means any character from the \w character class except W.
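To check the intersection pattern above against the examples from the question, a minimal sketch could look like this. The class and method names (FirstTermCheck, isValid) are made up for illustration, and it assumes the whole input has to match:

```java
import java.util.regex.Pattern;

class FirstTermCheck {
    // Either one word character other than 'W', or two or more word characters,
    // followed by '-' and a second run of word characters.
    static final Pattern TERM = Pattern.compile("([\\w&&[^W]]|\\w{2,})-\\w+");

    // Hypothetical helper: matches() requires the whole input to match.
    static boolean isValid(String s) {
        return TERM.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"W-80", "W42-80", "A-80", "?-80"}) {
            System.out.println(s + " -> " + (isValid(s) ? "valid" : "invalid"));
        }
    }
}
```

Note that "?-80" is rejected here, while the simpler ([^W]|\w{2,})-\w+ would accept it.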
Just restrict the last char to "any word char except 'W'".
There are a couple of ways to do this:
Negative look-behind (easy to read):
^\w+(?<!W)-\w+$
Negated intersection (trainwreck to read):
^\w*[\w&&[^W]]-\w+$
——
The question has shifted. Here’s a new take:
^.+(?<!^W)-\w+
This allows anything as the first term except just "W".
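The two look-behind variants in this answer behave differently, which a small test makes visible. This is only a sketch: the class and constant names are invented, and matches() is used so the whole input must match:

```java
import java.util.regex.Pattern;

class LookBehindVariants {
    // Earlier variant: the LAST character of the first term must not be 'W'.
    static final Pattern LAST_CHAR_NOT_W = Pattern.compile("^\\w+(?<!W)-\\w+$");

    // Later variant: the first term may be anything except exactly "W".
    static final Pattern NOT_JUST_W = Pattern.compile("^.+(?<!^W)-\\w+");

    static boolean lastCharNotW(String s) {
        return LAST_CHAR_NOT_W.matcher(s).matches();
    }

    static boolean notJustW(String s) {
        return NOT_JUST_W.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"W-80", "W42-80", "AW-80"}) {
            System.out.println(s + ": lastCharNotW=" + lastCharNotW(s)
                    + ", notJustW=" + notJustW(s));
        }
    }
}
```

Both reject "W-80" and accept "W42-80", but only the second accepts "AW-80", which is what the edited question asks for.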
I need to check if a String matches this specific pattern.
The pattern is:
(Numbers)(all characters allowed)(numbers)
and the numbers may contain a separator ("." or ",")!
For instance the input could be 500+400 or 400,021+213.443.
I tried Pattern.matches("[0-9],?.?+[0-9],?.?+", theequation2), but it didn't work!
I know that I have to use the method Pattern.matches(regex, String), but I have not been able to find the correct regex.
Dealing with numbers can be difficult. This approach will deal with your examples, but check carefully. I also didn't do "all characters" in the middle grouping, as "all" would include numbers, so instead I assumed that finding the next non-number would be appropriate.
This Java regex handles the requirements:
"((-?)[\\d,.]+)([^\\d-]+)((-?)[\\d,.]+)"
However, there is a potential issue in the above. Consider the following:
300 - -200. The foregoing won't match that case.
Now, based upon the examples, I think the point is that one should have a valid operator. The number of math operations is likely limited, so I would whitelist the operators in the middle. Thus, something like:
"((-?)[\\d,.]+)([\\s]*[*/+-]+[\\s]*)((-?)[\\d,.]+)"
Would, I think, be more appropriate. The [*/+-] can be expanded for the power operator ^ or whatever. Now, if one is going to start adding words (such as mod) in the equation, then the expression will need to be modified.
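Here is a rough sketch of how the whitelisted-operator pattern could be used, including pulling the operands out via the capturing groups. The class and method names are hypothetical. Keep in mind that [\d,.]+ does not validate the number format, so something like "1,,2" would also pass as an operand:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EquationCheck {
    // Group 1: first number, group 3: operator with optional whitespace,
    // group 4: second number (groups 2 and 5 hold the optional signs).
    static final Pattern EQUATION = Pattern.compile(
            "((-?)[\\d,.]+)([\\s]*[*/+-]+[\\s]*)((-?)[\\d,.]+)");

    static boolean isEquation(String s) {
        return EQUATION.matcher(s).matches();
    }

    // Returns {left, operator, right}, or null if the input does not match.
    static String[] parts(String s) {
        Matcher m = EQUATION.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(3).trim(), m.group(4)};
    }

    public static void main(String[] args) {
        for (String s : new String[] {"500+400", "400,021+213.443", "300 - -200", "500 400"}) {
            String[] p = parts(s);
            System.out.println(s + " -> " + (p == null ? "no match" : String.join(" | ", p)));
        }
    }
}
```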
In your regex you have to escape the dot \. to match it literally and escape the \+ or else it would make the ? a possessive quantifier. To match 1+ digits you have to use a quantifier [0-9]+
For your example data, you could match 1+ digits followed by an optional part that matches a dot or a comma and then 1+ digits, once at the start and once at the end. To match any single character between the two numbers, you could use a dot.
Instead of using a dot, you could also use for example a character class [-+*] to list some operators or list what you would allow to match. If this should be the only match, you could use anchors to assert the start ^ and the end $ of the string.
\d+(?:[.,]\d+)?.\d+(?:[.,]\d+)?
In Java:
String regex = "\\d+(?:[.,]\\d+)?.\\d+(?:[.,]\\d+)?";
That would match:
\d+(?:[.,]\d+)? matches 1+ digits followed by an optional part that matches . or , followed by 1+ digits
. matches any single character (use .+ to repeat it 1+ times)
\d+(?:[.,]\d+)? is the same as the first pattern
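A small sketch comparing the "any character" version with the character-class version mentioned above; the class and constant names are made up for this example. The difference shows up on input that has no operator at all:

```java
import java.util.regex.Pattern;

class DecimalPairCheck {
    // 1+ digits, optionally followed by '.' or ',' and 1+ more digits.
    static final String NUMBER = "\\d+(?:[.,]\\d+)?";

    // Any single character between the two numbers.
    static final Pattern ANY_SEPARATOR = Pattern.compile(NUMBER + "." + NUMBER);

    // Only one of the listed operators between the two numbers.
    static final Pattern OPERATOR_ONLY = Pattern.compile(NUMBER + "[-+*]" + NUMBER);

    static boolean matchesAny(String s) {
        return ANY_SEPARATOR.matcher(s).matches();
    }

    static boolean matchesOperator(String s) {
        return OPERATOR_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"500+400", "400,021+213.443", "500400"}) {
            System.out.println(s + ": any=" + matchesAny(s)
                    + ", operator=" + matchesOperator(s));
        }
    }
}
```

With the plain dot, "500400" still matches because one of the digits can play the role of the separator; the character class rules that out.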
The question
java nio negate a glob pattern
asked how to make a glob pattern matching strings that do not start with a given character, say "a". The accepted answer
"[!a]*"
does work for starting characters and also for ending characters,
"*[!a]"
However, it does not work for positions in between. For example
"*[!.]*"
does not filter out file names with a dot somewhere inside the file name. (While, of course,
"*.*"
does filter out file names without a dot.) How can I do inner character negation?
It works perfectly fine in the middle of the pattern. The thing to realize is that foo.bar DOES match *[!.]*
To show that this is a match:
Let the first star match foo.b. This is allowed since it can match any string of any length.
The next character is not a period, so [!.] matches a
Let the second star match the remainder, r
This is the complete input, and therefore foo.bar matches *[!.]*.
The pattern matches "any string that contains a character that is not a period". You instead wanted "any string that does not contain any periods anywhere".
In regex, this is the difference between ^.*[^.].*$ and ^([^.])*$.
Unfortunately, globs are not powerful enough to express what you want.
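One possible way around this, as a sketch only: FileSystem.getPathMatcher also accepts the "regex:" syntax, so the "no periods anywhere" rule can be written as a regex instead of a glob. The class and method names below are hypothetical, and the sketch assumes single file names rather than full paths (for full paths you would probably match on path.getFileName()):

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class NoDotMatcher {
    // The glob from the question: "contains at least one character that is not a period".
    static final PathMatcher GLOB =
            FileSystems.getDefault().getPathMatcher("glob:*[!.]*");

    // Regex syntax instead of glob: "contains no period at all".
    static final PathMatcher REGEX =
            FileSystems.getDefault().getPathMatcher("regex:[^.]*");

    static boolean globMatches(String fileName) {
        return GLOB.matches(Paths.get(fileName));
    }

    static boolean hasNoDot(String fileName) {
        return REGEX.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        for (String name : new String[] {"foo.bar", "foobar"}) {
            System.out.println(name + ": glob=" + globMatches(name)
                    + ", noDot=" + hasNoDot(name));
        }
    }
}
```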
What I need to do is to determine whether a word consists of letters except certain letters. For example I need to test whether a word consists of the letters from the English alphabet except letters: I, V and X.
Currently I have this long regex for the simple task above:
Pattern pattern = Pattern.compile("[ABCDEFGHJKLMNOPQRSTUWYZ]+");
Does anyone know a shorthand way of excluding certain letters from a Java regex? Thanks.
You can use the && operator to create a compound character class using subtraction:
String regex = "[A-Z&&[^IVX]]+";
You could simply specify character ranges inside your character class:
[A-HJ-UWYZ]+
Just use a negative lookahead in your pattern.
Pattern pattern = Pattern.compile("^(?:(?![IVX])[A-Z])+$");
'&&' didn't work for me, so I used: (?:(?![IVX])[A-Z])
Use [A-Z&&[^IVX]]+ to exclude certain characters from the A-Z range - see Pattern
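The three approaches suggested in these answers (intersection, explicit ranges, negative lookahead) should accept exactly the same words. A quick sketch to compare them side by side, with invented class and method names:

```java
import java.util.regex.Pattern;

class ExcludeLetters {
    // A-Z minus I, V and X, written three different ways.
    static final Pattern INTERSECTION = Pattern.compile("[A-Z&&[^IVX]]+");
    static final Pattern RANGES = Pattern.compile("[A-HJ-UWYZ]+");
    static final Pattern LOOKAHEAD = Pattern.compile("(?:(?![IVX])[A-Z])+");

    // Results for one word, in the order: intersection, ranges, lookahead.
    static boolean[] check(String word) {
        return new boolean[] {
            INTERSECTION.matcher(word).matches(),
            RANGES.matcher(word).matches(),
            LOOKAHEAD.matcher(word).matches()
        };
    }

    public static void main(String[] args) {
        for (String word : new String[] {"HELLO", "VIXEN", "ABCX"}) {
            System.out.println(word + " -> " + java.util.Arrays.toString(check(word)));
        }
    }
}
```

The && syntax is specific to Java's Pattern class, which may be why it does not work in other regex flavors or online testers.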
I have a series of strings that I am searching for a particular combination of characters in. I am looking for a digit, followed by the letter m or M, followed by a digit, then followed by the letter f or F.
An example string is "Class (4) 1m5f Good". The part I want to extract from the string is "1m5f".
Here is the code I have, that doesn't work.
Pattern distancePattern = Pattern.compile("\\^[0-9]{1}[m|M]{1}[0-9]{1}[f|F]{1}$\\");
Matcher distanceMatcher = distancePattern.matcher(raceDetails.toString());
while (distanceMatcher.find()) {
String word= distanceMatcher.group(0);
System.out.println(word);
}
Can anyone suggest what I am doing wrong?
The ^ and $ characters at the start and end of your regex are anchors - they're limiting you to strings that only consist of the pattern you're looking for. The first step is to remove those.
You can then either use word boundaries (\b) to limit the pattern you're looking for to be an entire word, like this:
Pattern distancePattern = Pattern.compile("\\b\\d[mM]\\d[fF]\\b");
...or, if you don't mind your pattern appearing in the middle of a word, e.g., "Class (4) a1m5f Good", you can drop the word boundaries:
Pattern distancePattern = Pattern.compile("\\d[mM]\\d[fF]");
Quick notes:
You don't really need the {1}s everywhere - the default assumption is that a character or character class is happening once.
You can replace the [0-9] character class with \d (it means the same thing).
Both links are to regular-expressions.info, a great resource for learning about regexes that I highly recommend you check out :)
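Putting the two patterns from this answer into the loop from the question, a minimal sketch might look like this. DistanceFinder and findAll are names made up for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DistanceFinder {
    // Digit, m/M, digit, f/F as a whole word.
    static final Pattern WHOLE_WORD = Pattern.compile("\\b\\d[mM]\\d[fF]\\b");

    // Same sequence, but allowed to appear inside a longer word.
    static final Pattern ANYWHERE = Pattern.compile("\\d[mM]\\d[fF]");

    // Collects every match of the given pattern in the text.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll(WHOLE_WORD, "Class (4) 1m5f Good"));
        System.out.println(findAll(WHOLE_WORD, "Class (4) a1m5f Good"));
        System.out.println(findAll(ANYWHERE, "Class (4) a1m5f Good"));
    }
}
```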
I'd use word boundaries \b:
\b\d[mM]\d[fF]\b
In Java, the backslashes have to be escaped:
\\b\\d[mM]\\d[fF]\\b
{1} is superfluous
[m|M] means m or | or M
For the requirement of a digit, followed by the letter m or M, followed by a digit, then followed by the letter f or F, the regex can be simplified to:
Pattern distancePattern = Pattern.compile("(?i)\\dm\\df");
Where:
(?i) - For ignore case
\\d - For digits [0-9]
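The inline (?i) flag should behave the same as passing Pattern.CASE_INSENSITIVE to Pattern.compile. Here is a short sketch comparing the two, where the class and method names are just placeholders:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseInsensitiveDistance {
    // Case-insensitivity switched on inside the pattern itself.
    static final Pattern INLINE_FLAG = Pattern.compile("(?i)\\dm\\df");

    // Case-insensitivity passed as a compile flag instead.
    static final Pattern COMPILE_FLAG =
            Pattern.compile("\\dm\\df", Pattern.CASE_INSENSITIVE);

    // Returns the first match in the text, or null if there is none.
    static String firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(INLINE_FLAG, "Class (4) 1m5f Good"));
        System.out.println(firstMatch(COMPILE_FLAG, "Class (4) 1M5F Good"));
    }
}
```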
I'm trying to compare following strings with regex:
#[xyz="1","2"'"4"] ------- valid
#[xyz] ------------- valid
#[xyz="a5","4r"'"8dsa"] -- valid
#[xyz="asd"] -- invalid
#[xyz"asd"] --- invalid
#[xyz="8s"'"4"] - invalid
The valid pattern should be:
#[xyz then = sign then some chars then , then some chars then ' then some chars and finally ]. This means that if there are characters after xyz, they must be in the format ="XXX","XXX"'"XXX".
Or only #[xyz]. No character after xyz.
I have tried the following regex, but it did not work:
String regex = "#[xyz=\"[a-zA-z][0-9]\",\"[a-zA-z][0-9]\"'\"[a-zA-z][0-9]\"]";
Here the quoted part (after xyz) is optional, the number of characters between the quotes is not fixed, and there could also be some characters before and after this pattern, like asdadad #[xyz] adadad.
You can use the regex:
#\[xyz(?:="[a-zA-Z0-9]+","[a-zA-Z0-9]+"'"[a-zA-Z0-9]+")?\]
Expressed as a Java string it'll be:
String regex = "#\\[xyz(?:=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]";
What was wrong with your regex?
[...] defines a character class. When you want to match literal [ and ] you need to escape it by preceding with a \.
[a-zA-z][0-9] matches a single letter followed by a single digit (and the range A-z also covers the characters [, \, ], ^, _ and `, so it should be A-Z). But you want one or more alphanumeric characters. So you need [a-zA-Z0-9]+
Use this:
String regex = "#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]";
When you write [a-zA-z][0-9] it expects a letter character and a digit after it. You also have to escape the first and last square brackets, because square brackets have a special meaning in regexes.
Explanation:
[a-zA-Z0-9]+ means an alphanumeric character (but not an underscore) one or more times.
(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")? means that the expression in parentheses can occur one time or not at all.
Square brackets have a special meaning in regex (as you used them yourself, they define character classes), so you need to escape them if you want to match them literally.
String regex = "#\\[xyz=\"[a-zA-z][0-9]\",\"[a-zA-z][0-9]\"'\"[a-zA-z][0-9]\"\\]";
The next problem is [a-zA-z][0-9], which defines "first a letter, second a digit". You need to join those classes and add a quantifier (and write A-Z rather than A-z):
String regex = "#\\[xyz=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\"\\]";
there could also be some characters before and after this pattern like asdadad #[xyz] adadad.
Regex should be:
String regex = "(.)*#\\[xyz(=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\](.)*";
The first and last (.)* allow any string before and after the pattern, as you mentioned in your edit. As @ademiban said, (=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")? will occur one time or not at all. The other mistakes are also very well explained in the other answers, +1 to all of them.
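To try the pattern against the examples from the question, something like the sketch below could be used. The class and method names are hypothetical. It writes the class as [a-zA-Z0-9], since the range A-z would also admit [, \, ], ^, _ and `. As an alternative to wrapping the pattern in (.)*, it uses find() for input with surrounding text:

```java
import java.util.regex.Pattern;

class XyzTagCheck {
    // "#[xyz" + optional ="..","..'".." part + "]"
    static final Pattern TAG = Pattern.compile(
            "#\\[xyz(?:=\"[a-zA-Z0-9]+\",\"[a-zA-Z0-9]+\"'\"[a-zA-Z0-9]+\")?\\]");

    // The whole input must be a tag.
    static boolean isExactTag(String s) {
        return TAG.matcher(s).matches();
    }

    // The tag may be surrounded by other text.
    static boolean containsTag(String s) {
        return TAG.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "#[xyz=\"1\",\"2\"'\"4\"]",
            "#[xyz]",
            "#[xyz=\"a5\",\"4r\"'\"8dsa\"]",
            "#[xyz=\"asd\"]",
            "#[xyz\"asd\"]",
            "#[xyz=\"8s\"'\"4\"]"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (isExactTag(s) ? "valid" : "invalid"));
        }
        System.out.println(containsTag("asdadad #[xyz] adadad"));
    }
}
```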