What's wrong with this pattern? [duplicate] - java

This question already has answers here:
What is the difference between the two regex pattern
(2 answers)
Closed 7 years ago.
Can anybody please tell me what's wrong with this regexp?
"^[a-zA-Z0-9 -\\/_&()']*$"
I expect this to accept only values like abc123/-_'s, but I'm not sure why it's even accepting ABC.
But it's not accepting double quotes in it.
Here is my code:
public static final Pattern PATTREN = Pattern.compile("^[a-zA-Z0-9 -\\/_&()']*$");
Matcher m = PATTREN.matcher("ABC\"");
return m.matches();

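I believe this is what you want:
In your pattern the unescaped hyphen in " -\\/" defines a range running from space to /, and that range includes characters such as " and !. Inside [] the hyphen therefore needs escaping (or must be placed first or last). Most other metacharacters, such as parentheses and slashes, are literal inside a class; the exceptions are the backslash, ], [ and a leading ^.
There is no need for the start and end markers, because matches() already requires the whole input to match.
You can use the Pattern.CASE_INSENSITIVE flag rather than adding extra complexity via A-Za-z.
Example: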
Pattern pattern = Pattern.compile("[a-z0-9\\-/_&()']*", Pattern.CASE_INSENSITIVE);
System.out.println(pattern.matcher("abc123/-_'s").matches());
System.out.println(pattern.matcher("ABC\"").matches());
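To see the effect of the hyphen on its own, here is a minimal sketch that compares the pattern from the question with a variant whose hyphen is escaped. The class and method names are made up for illustration; only the two patterns come from this thread (the second keeps the space that the question's class allowed).

```java
import java.util.regex.Pattern;

public class HyphenRangeDemo {
    // Pattern from the question: " -\\/" is read as a range from ' ' (U+0020) to '/' (U+002F).
    static final Pattern ORIGINAL = Pattern.compile("^[a-zA-Z0-9 -\\/_&()']*$");
    // Same class with the hyphen escaped, so it is just a literal '-'.
    static final Pattern ESCAPED = Pattern.compile("^[a-zA-Z0-9 \\-/_&()']*$");

    static boolean original(String s) {
        return ORIGINAL.matcher(s).matches();
    }

    static boolean escaped(String s) {
        return ESCAPED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(original("ABC\""));      // true: '"' (U+0022) lies inside the range
        System.out.println(escaped("ABC\""));       // false: '"' is no longer allowed
        System.out.println(escaped("abc123/-_'s")); // true
    }
}
```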

The pattern says A-Z, so why do you think it would not allow uppercase characters?
You are defining a character class using []; anything between the square brackets is treated as a "valid" character. So if you don't want uppercase letters, remove A-Z from the [].
By the way: although it does not support Java directly, you might want to play with https://regex101.com/
That is a great site to "practice" regular expressions, and it isn't too hard to "convert" regexes from other languages to Java.

Related

Find strings with regex expression [closed]

I want to search into Java packages using the following expression:
com.company.*
Test example: https://regex101.com/r/tHTQd9/2
But when I use it in Java code, it doesn't find anything. Do I need to add escape characters for the .?
The following expression would work:
\bcom\.company\.\w[\w\.]*\b
Match between word-boundaries
Use literal dot characters by escaping
1 alphanumeric (or underscore) followed by 0 or more alphanumerics or dots
Pattern regex = Pattern.compile("\\bcom\\.company\\.\\w[\\w\\.]*\\b");
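As a rough sketch of how that pattern might be used to search a larger text, the snippet below collects every match with find(). The class name, the helper method and the sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PackageFinder {
    // Pattern from the answer above, written as a Java string literal.
    static final Pattern PACKAGE = Pattern.compile("\\bcom\\.company\\.\\w[\\w\\.]*\\b");

    // Returns every substring of the text that looks like a name under com.company.
    static List<String> findPackages(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = PACKAGE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "import com.company.util.Strings; import org.other.Thing; import com.company.io.*;";
        System.out.println(findPackages(text)); // [com.company.util.Strings, com.company.io]
    }
}
```

The trailing \b makes the match stop at the last word character, which is why the wildcard import yields com.company.io without the final dot.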
If you are looking for one or more word characters in the last segment, you can try:
com\\.company\\.\\w+
Or, even more generic (one or more of any character):
com\\.company\\..+
Please remember that this is quite generic and prone to false matches.
If you provide a more detailed explanation or constraints, we can help build a better regex.
Why double backslash in Java?
We know that the backslash character is an escape character in Java String literals as well. Therefore, we need to double the backslash character when using it to precede any character (including the \ character itself).
Source
In Java, to escape a dot (.) in a regex you need to put a double backslash (\\) in front of it in the string literal. Note that \\.* on its own would mean "zero or more literal dots", so to match anything under the package the regex should look like this:
com\\.company\\..*
Why the double backslash is needed:
The dot (.) is a special symbol in regex, so you need to escape it with a backslash (\). But the backslash is also the escape character in Java string literals, so a single one is consumed by the compiler when it processes the string. To preserve it, we need to add another backslash (\).
Regex string as you write it in the source code:
com\\.company\\..*
String after Java has processed it, which is what the regex engine receives:
com\.company\..*
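The two levels of escaping can be checked directly by printing the string and trying it against a few inputs. This is only a sketch: the class name and the sample inputs are invented, and the pattern used is one that matches any name under com.company.

```java
import java.util.regex.Pattern;

public class EscapeLevels {
    // What you type in Java source: each regex backslash is doubled.
    static final String ESCAPED = "com\\.company\\..*";
    // The unescaped version, where every dot means "any character".
    static final String UNESCAPED = "com.company.*";

    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Prints the string as the regex engine receives it: com\.company\..*
        System.out.println(ESCAPED);
        System.out.println(matches(ESCAPED, "com.company.service"));   // true
        System.out.println(matches(ESCAPED, "comXcompany.service"));   // false: the dot must be literal
        System.out.println(matches(UNESCAPED, "comXcompany.service")); // true: '.' matched 'X'
    }
}
```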

Regex will quit if the first character does not match the pattern, even if there should be a match later [duplicate]

This question already has answers here:
Difference between matches() and find() in Java Regex
(5 answers)
Closed 3 years ago.
I am using regex in java, and I cannot create a regex to match what I want it to. I want to match everything in a string that begins and ends with a character.
"cats-are-cute" should match and return cats-are-cute
!!!DOG-CAT!!! should match and return DOG-CAT
I am using https://regexr.com/ to test, and it says my regex should work
I'm not even sure how I should attempt to fix this. I've found out that it will quit if the very first character does not match (i.e. it is a special character), but it will match if the entire string begins and ends with a matching character.
It will not match if a special character begins or ends the entire string
Here is my code:
Pattern pattern = Pattern.compile("([A-Za-z0-9].*[A-Za-z0-9])");
Matcher matcher = pattern.matcher(word);
if(matcher.matches())
{
System.out.println("Matches");
System.out.println(matcher.start());
System.out.println(matcher.end());
}
If I type
testing
it returns
Matches
0
7
just like it should. (Small question: why is it 7 and not 6?)
But if I enter "testing" (with the quotes), matcher.matches() is false.
I think it should output
Matches
1
7
but sadly it does not, as matcher.matches() returns false.
I think my regex is working, because quite a few sites have said that my regex will match what I want it to.
Am I missing something with Matcher matches()? Does it not do what I think it does?
I just needed to use find instead of matches, as OH GOD SPIDERS suggested in this comment:
As the documentation of Matcher.matches states, it "Attempts to match the entire region against the pattern." You need to use Matcher.find if you don't want your entire String to be matched.
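A small sketch of the difference, using the pattern from the question (the class and helper names are invented for illustration). It also covers the side question about 7 versus 6: Matcher.end() returns the offset after the last matched character, so the end index is exclusive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVsMatches {
    static final Pattern P = Pattern.compile("([A-Za-z0-9].*[A-Za-z0-9])");

    // Returns the first substring that starts and ends with an alphanumeric, or null.
    static String inner(String word) {
        Matcher m = P.matcher(word);
        return m.find() ? m.group() : null;
    }

    // Returns {start, end} of the first match, or null; end is exclusive.
    static int[] span(String word) {
        Matcher m = P.matcher(word);
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        String quoted = "\"testing\"";
        System.out.println(P.matcher(quoted).matches()); // false: the quotes are part of the input
        System.out.println(inner(quoted));               // testing
        int[] s = span(quoted);
        System.out.println(s[0] + " " + s[1]);           // 1 8
        System.out.println(inner("!!!DOG-CAT!!!"));      // DOG-CAT
    }
}
```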

Regular expression with captured groups for parsing a range of two real numbers [duplicate]

This question already has answers here:
simple java regex throwing illegalstateexception [duplicate]
(3 answers)
Closed 5 years ago.
I need a regex to properly parse ranges of two real numbers (unsigned), presented with a hyphen.
Valid inputs:
1-3
3.14-7.50
0-4.01
It's Java on Android.
My current approach:
Pattern pattern = Pattern.compile("(?<Minimum>\\d+(\\.\\d+))-(?<Maximum>\\d+(\\.\\d+))");
Matcher matcher = pattern.matcher("3.14-5.2");
String min = matcher.group("Minimum");
String max = matcher.group("Maximum");
It crashes on attempting to retrieve the minimum.
java.lang.IllegalStateException: No match found
at java.util.regex.Matcher.getMatchedGroupIndex(Matcher.java:1314)
at java.util.regex.Matcher.group(Matcher.java:572)
I can't really see what's wrong with the expression.
I would particularly appreciate an explanation on what the problem with it is. A regex allowing for optional white space around the hyphen would be extra nice, too (I'd like it to work that way but I dropped this for now as I can't get it to work at all).
You need to make the decimal part optional, otherwise inputs such as 1-3 cannot match:
Pattern pattern = Pattern.compile(
"(?<Minimum>\\d+(?:\\.\\d+)?)-(?<Maximum>\\d+(?:\\.\\d+)?)");
The ? after (?:\\.\\d+) makes that group an optional match.
It is better to use ?: to make it a non-capturing group.
Also, you need to call matcher.find() or matcher.matches() before calling any .group(...) method. Skipping that call is what raises the IllegalStateException: No match found.
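Putting those points together, here is a hedged sketch that also allows the optional whitespace around the hyphen that the question asked about. The \\s* parts, the class name and the parse helper are my own additions, not part of the answer's pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RangeParser {
    // Optional decimal part on both numbers; \\s* permits spaces around the hyphen (an assumption).
    static final Pattern RANGE = Pattern.compile(
            "(?<Minimum>\\d+(?:\\.\\d+)?)\\s*-\\s*(?<Maximum>\\d+(?:\\.\\d+)?)");

    // Returns {min, max}, or throws if the input is not a valid range.
    static double[] parse(String input) {
        Matcher m = RANGE.matcher(input.trim());
        // matches() must be called (and succeed) before group() may be used.
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a range: " + input);
        }
        return new double[] {
            Double.parseDouble(m.group("Minimum")),
            Double.parseDouble(m.group("Maximum"))
        };
    }

    public static void main(String[] args) {
        double[] r = parse("3.14 - 7.50");
        System.out.println(r[0] + " to " + r[1]); // 3.14 to 7.5
    }
}
```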

Regex to Match: !$%^&*()_+|~-=`{}[]:";'<>?,./

I know this is a duplicate of a question with an almost identical name; however, I can't get it to work in Android whatsoever!
I am trying this: Regex to Match Symbols:
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\\";'<>?,.\/]");
However, this isn't working. Does anyone know the correct method of applying this pattern?
P.S. Complete noob at Regex. :D
From here originally - Regex to Match Symbols: !$%^&*()_+|~-=`{}[]:";'<>?,./
Error Message: Syntax error on token(s), misplaced construct(s)
UPDATE: Added extra backslashes...fixed a lot of em, now gets error from ; onwards. Using Eclipse.
I think your problem is the " character:
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\[\]:";'<>?,.\/]");
                                                                  ^
It ends your string literal, so you should escape it. You also need to remove the backslash before the slash; the slash is not a special character.
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,./]");
OK, once more: if you wanted to match the backslash rather than escape the slash, then we end up here:
public Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");
Now it is the same answer as jdb's, so +1 to him for being quicker.
How about that?
Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");
In a character class, [ and ] have special meaning, so you need to escape them (the backslash itself, and a ^ in first position, are special as well). In a Java string literal every regex backslash needs an extra backslash, so you need to use \\[ and \\]. And yes, you need to escape " with a single backslash in a string literal.
Apart from that, a hyphen placed somewhere in the middle also has a special meaning, because it defines a range. If you want to match a literal hyphen, put it at the start or end of the class, or escape it.
The rest of the characters don't need to be escaped. They are just ordinary characters.
So, your pattern should be like this:
Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,./]");
And if you want to match backslash (\) also, then use this: -
Pattern bsymbols = Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");
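For completeness, a short sketch of how the final pattern could be applied with find() to test whether a string contains any of the symbols. The class name, helper method and sample strings are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class SymbolCheck {
    // Final pattern from the answers above, including the backslash.
    static final Pattern SYMBOLS =
            Pattern.compile("[-!$%^&*()_+|~=`{}\\[\\]:\";'<>?,.\\\\/]");

    // True if the string contains at least one of the listed symbols.
    static boolean containsSymbol(String s) {
        return SYMBOLS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsSymbol("hello"));       // false
        System.out.println(containsSymbol("a+b"));         // true
        System.out.println(containsSymbol("back\\slash")); // true: contains a backslash
        System.out.println(containsSymbol("say \"hi\""));  // true: contains a double quote
    }
}
```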

Replace all with a string having regex wild chars [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
java String.replaceAll without regex
I have a string and I need to replace some parts of it.
The replacement text contains regex wild chars though. Example:
String target = "Something * to do in ('AAA', 'BBB')";
String replacement = "Hello";
String originalText = "ABCDEFHGIJKLMN" + target + "ABCDEFHGIJKLMN";
System.out.println(originalText.replaceAll(target, replacement));
I get:
ABCDEFHGIJKLMNSomething * to do in ('AAA', 'BBB')ABCDEFHGIJKLMN
Why doesn't the replacement occur?
Because *, ( and ) are all meta-characters in regular expressions. Hence all of them need to be escaped. It looks like Java has a convenient method for this:
java.util.regex.Pattern.quote(target)
However, the better option might be to skip the regex-based replaceAll altogether and simply use String.replace, which treats the target as literal text. Then you do not need to escape anything.
String.replaceAll() takes a regular expression and so it's trying to expand these metacharacters.
One approach is to escape these chars (e.g. \*).
Another would be to do the replacement yourself by using String.indexOf() and finding the start of the contained string. indexOf() doesn't take a regexp but rather a normal string.
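The first two options can be sketched side by side with the strings from the question. The class and method names are invented; Matcher.quoteReplacement is included only as a precaution for replacement text that might contain $ or \, which the question's "Hello" does not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplace {
    static final String TARGET = "Something * to do in ('AAA', 'BBB')";
    static final String ORIGINAL = "ABCDEFHGIJKLMN" + TARGET + "ABCDEFHGIJKLMN";

    // Literal replacement: no regex involved.
    static String withReplace(String replacement) {
        return ORIGINAL.replace(TARGET, replacement);
    }

    // Regex replacement with the target quoted so its metacharacters are literal.
    static String withQuote(String replacement) {
        return ORIGINAL.replaceAll(Pattern.quote(TARGET), Matcher.quoteReplacement(replacement));
    }

    // The call from the question: the target is interpreted as a regex.
    static String withRawRegex(String replacement) {
        return ORIGINAL.replaceAll(TARGET, replacement);
    }

    public static void main(String[] args) {
        System.out.println(withReplace("Hello"));  // ABCDEFHGIJKLMNHelloABCDEFHGIJKLMN
        System.out.println(withQuote("Hello"));    // ABCDEFHGIJKLMNHelloABCDEFHGIJKLMN
        System.out.println(withRawRegex("Hello")); // unchanged, as in the question
    }
}
```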
