Add Dash to Java Regex - java

I am trying to modify an existing Regex expression being pulled in from a properties file from a Java program that someone else built.
The current Regex expression used to match an email address is -
RR.emailRegex=^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
That matches email addresses such as abc.xyz#example.com, but now some email addresses have dashes in them such as abc-def.xyz#example.com and those are failing the Regex pattern match.
What would my new Regex expression be to add the dash to that regular expression match or is there a better way to represent that?

Basing on the regex you are using, you can add the dash into your character class:
RR.emailRegex=^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
add
RR.emailRegex=^[a-zA-Z0-9_\\.-]+#[a-zA-Z0-9_-]+\\.[a-zA-Z0-9_-]+$
Btw, you can shorten your regex like this:
RR.emailRegex=^[\\w.-]+#[\\w-]+\\.[\\w-]+$
Anyway, I would use Apache EmailValidator instead like this:
if (EmailValidator.getInstance().isValid(email)) ....

Meaning of - inside a character class is different than used elsewhere. Inside character class - denotes range. e.g. 0-9. If you want to include -, write it in beginning or ending of character class like [-0-9] or [0-9-].
You also don't need to escape . inside character class because it is treated as . literally inside character class.
Your regex can be simplified further. \w denotes [A-Za-z0-9_]. So you can use
^[-\w.]+#[\w]+\.[\w]+$
In Java, this can be written as
^[-\\w.]+#[\\w]+\\.[\\w]+$

^[a-zA-Z0-9_\\.\\-]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
Should solve your problem. In regex you need to escape anything that has meaning in the Regex engine (eg. -, ?, *, etc.).

The correct Regex fix is below.
OLD Regex Expression
^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
NEW Regex Expression
^[a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$

Actually I read this post it covers all special cases, so the best one that's work correctly with java is
String pattern ="(?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")#(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])";

Related

I want to Capture a alphanumeric group without underscore

I want to Capture an alphanumeric group in regex such that it does not capture starting underscore. For example _reverse(abc) should return reverse(. I am using (?<name>\w+) but it return _reverse(.
You can try this,
[^a-zA-Z0-9()\\s+]
The output will be reverse(abc)
You can specify characters explicitly, e.g.:
[a-zA-Z0-9]+
From what you are showing, I assume you want to strip underscores and content behind the opening parentheses.
Basically, that should work with a regex like this:
"_([a-zA-Z0-9]+\()"
this can be used in conjunction with a Matcher to extract all capturing groups (in this case, [a-zA-Z0-9]+\() and return them.
Note that you can find almost all the help you need with Regular Expressions on utility sites like RegEx 101 and RegEx Per, the latter being a nice visualizer but only working with javaScript-like expressions.
Also, RegEx 101 contains a Regex Debugger to help avoid dangerous regular expressions

How can I make this into a Java regex?

I used regex101 to make my expression, and it looks like this using their symbols
\d+ [+-\/*] \d*
Basically I want a user to enter like 123 + 123 but the entire statement is one string with exactly one space after the first number and one space after the operator
The above expression works, but It doesn't convert the same into Java.
I thought these symbols were universal, but I guess not. Any ideas how to convert this to the proper syntax?
Regular expressions are not universal.
In general,
no two regular expression systems are the same.
Java does not have regular expressions.
Some Java classes support regular expressions.
The Pattern class defines the regular expressions that are used by some Java classes including Matcher which seems likely to be the class you are using.
As already identified in the comments,
\ is the escape-the-next-character character in Java.
If you want to represent \ in a String,
you must use \\.
For example,
\d in a regular expression must be written \\d in a Java String.
You can simply use groups () and design a RegEx as you wish. This RegEx might be one way to do so:
((\d+\s)(\+|\-)(\s\d+))
It has four groups, and you can simply call the entire input using $1:
You can also escape \ those required language-based chars.

Regular expression to return results that do not match selection

I work on a product that provides a Java API to extend it.
The API provides a function which
takes a Perl regular expression and
returns a list of matching files.
I want to filter the list to remove all files that end in .xml, .xsl and .cfg; basically the opposite of .*(\.xml|\.xsl|\.cfg).
I have been searching but I haven't been able to get anything to work yet.
I tried .*(?!\.cfg) and ^((?!cfg).)*$ and \.(?!cfg$|?!xml$|?!xsl$).
I don't know if I am on the right track or not.
Note
I know the regex systems are similar, but I can't get a Java regex working either.
You may use
^(?!.*\.(x[ms]l|cfg)$).+
See the regex demo
Details:
^ - start of a string
(?!.*\.(x[ms]l|cfg)$) - a negative lookahead that fails the match if any 0+ chars other than line break chars (.*) are followed with xml, xsl or cfg ((x[ms]l|cfg)) at the end of the string ($)
.+ - any 1 or more chars other than linebreak chars. Might be omitted if the entire string match is not required (in some tools it is required though).
You need something like this, which matches only if the end of the string isn't preceded by a dot and one of the three unwanted types
/(?<!\.(?:xml|xsl|cfg))\z/

Need a regular expression for field which should allow special characters, alphanumeric characters, and spaces

I am using the following regex:
[a-zA-Z0-9-#.()/%&\\s]{0,19}.
The requirement for the field is it should allow any thing and the field size should be 19.
Let me know if any corrections.Any help is appreciated.
You simply need to escape the special characters. Try:
[a-zA-Z0-9\-#\.\(\)\/%&\s]{0,19}
You can test your regular expressions on http://rubular.com/
Your regex is incorrect in at least one way - if you're considering a hyphen to be a "special character", then you should put it at the beginning or end of the range. So: [a-zA-Z0-9#.()/%&\s-]{0,19}.
Characters that are "special" within the context of the regex itself are often not parsed if they're inside a range. So you're fine with ., ( and ). But check your parser to make sure that it understands what \s means. It might be simpler just to put a space.
Also, if your regex parser tends to delimit the regex with slashes, then you may have to escape the slash in the middle of the range: [a-zA-Z0-9#.()\/%&\s-]{0,19}.
Just escape the dash - or put it at the begining or at the end of the character class:
[a-zA-Z0-9\\-#.()/%&\\s]{0,19}
or
[-a-zA-Z0-9#.()/%&\\s]{0,19}
or
[a-zA-Z0-9#.()/%&\\s-]{0,19}

Java string.replaceFirst problems

I'm trying to replace a part of a string. The part contains some special characters:
#L(inches)=24#
I know replaceFirst is regex driven but I can't seem to create a regular expression that matches this part in a string, any ideas?
#.*?#
This should match the entire String above.

Categories