Match "_<digit>string" with a regular expression - java

I have a list of strings like
xxx_2pathway
xxx_6pathway
xxx_pathway
So I have a string followed by an underscore and "pathway". There may be a digit between the underscore and "pathway". How can I match and replace everything except xxx with a regular expression in Java?
This does not work:
pathnameRaw = pathnameRaw.replace("_\\dpathway","");

Your regex is almost fine. Since the digit is optional, add a ? at the end of \\d.
Also the replace method does not use regex. Use replaceAll instead.
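A minimal sketch of both points, using the sample strings from the question. The class and method names (PathwayDemo, stripPathway) are made up for illustration and are not part of the original code.

```java
class PathwayDemo {
    // Removes every "_pathway" or "_<digit>pathway" occurrence.
    // replaceAll treats its first argument as a regex; replace takes it literally.
    static String stripPathway(String pathnameRaw) {
        return pathnameRaw.replaceAll("_[0-9]?pathway", "");
    }

    public static void main(String[] args) {
        String[] samples = {"xxx_2pathway", "xxx_6pathway", "xxx_pathway"};
        for (String s : samples) {
            System.out.println(s + " -> " + stripPathway(s));
        }
        // The literal replace from the question leaves the input unchanged,
        // because no input contains the text "_\dpathway" verbatim.
        System.out.println("xxx_2pathway".replace("_\\dpathway", ""));
    }
}
```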
In short, the pattern is:

"_[0-9]?pathway"


Add Dash to Java Regex

I am trying to modify an existing Regex expression being pulled in from a properties file from a Java program that someone else built.
The current Regex expression used to match an email address is -
RR.emailRegex=^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
That matches email addresses such as abc.xyz#example.com, but now some email addresses have dashes in them such as abc-def.xyz#example.com and those are failing the Regex pattern match.
What would my new Regex expression be to add the dash to that regular expression match or is there a better way to represent that?
Based on the regex you are using, you can add the dash to each character class. Change
RR.emailRegex=^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
to
RR.emailRegex=^[a-zA-Z0-9_\\.-]+#[a-zA-Z0-9_-]+\\.[a-zA-Z0-9_-]+$
Btw, you can shorten your regex like this:
RR.emailRegex=^[\\w.-]+#[\\w-]+\\.[\\w-]+$
Anyway, I would use Apache EmailValidator instead like this:
if (EmailValidator.getInstance().isValid(email)) ....
The meaning of - inside a character class is different from its meaning elsewhere. Inside a character class, - denotes a range, e.g. 0-9. If you want to include a literal -, write it at the beginning or end of the character class, like [-0-9] or [0-9-].
You also don't need to escape . inside a character class, because there it is treated as a literal dot.
Your regex can be simplified further. \w denotes [A-Za-z0-9_]. So you can use
^[-\w.]+#[\w]+\.[\w]+$
In Java, this can be written as
^[-\\w.]+#[\\w]+\\.[\\w]+$
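As a rough check of the points above, the sketch below compares the question's original pattern with the simplified one, and probes the dash and dot rules with tiny character classes. The class name DashInClassDemo and its helper are hypothetical; the sample addresses are the ones from the question.

```java
import java.util.regex.Pattern;

class DashInClassDemo {
    // Original pattern from the question: no dash allowed in the first part.
    static final Pattern OLD =
        Pattern.compile("^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$");
    // Simplified pattern from this answer: dash first, dot unescaped.
    static final Pattern DASH_FIRST =
        Pattern.compile("^[-\\w.]+#[\\w]+\\.[\\w]+$");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String plain = "abc.xyz#example.com";
        String dashed = "abc-def.xyz#example.com";
        System.out.println(matches(OLD, plain));         // true
        System.out.println(matches(OLD, dashed));        // false
        System.out.println(matches(DASH_FIRST, plain));  // true
        System.out.println(matches(DASH_FIRST, dashed)); // true

        // A dash at the end of a class is a literal, not a range.
        System.out.println(Pattern.matches("[ac-]", "-")); // true
        System.out.println(Pattern.matches("[ac-]", "b")); // false
        // An unescaped dot inside a class matches only a dot.
        System.out.println(Pattern.matches("[.]", "a"));   // false
    }
}
```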
^[a-zA-Z0-9_\\.\\-]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
Should solve your problem. In a regex you need to escape anything that has special meaning to the regex engine in that position (e.g. - inside a character class, or ? and * outside one).
The correct Regex fix is below.
OLD Regex Expression
^[a-zA-Z0-9_\\.]+#[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
NEW Regex Expression
^[a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$
Actually, I read this post, which covers all the special cases, so the best one that works correctly with Java is
String pattern ="(?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")#(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])";

Word that matches ^.*(?=.*\\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$

I am totally confused right now.
What is a word that matches: ^.*(?=.*\\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$
I tried 1Test#! at Regex 101, but that does not match.
I really appreciate your input!
Your regex seems to be written as a Java string literal (note the \\d).
That is why you have to convert it before testing it on regex101, which does not support the Java flavor (only PHP, Python and JavaScript).
Here is the converted regex:
^.*(?=.*\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$
which will match your string 1Test#!. Demo here: http://regex101.com/r/gE3iQ9
You just want something that matches that regex?
Here:
a1a!
This pattern matches
\dTest#!
If you want a pattern which matches 1Test#!, try this pattern:
^.*(?=.*\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$
Your java string ^.*(?=.*\\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$ encodes the regexp expression ^.*(?=.*\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$.
This is because \ is the escape character in Java string literals, so \\ stands for a single \.
The latter matches the string you specified.
If your original string was a regexp, rather than a java string, it would match strings such as \dTest#!
Also, you should consider removing the first .*, since doing so makes the regexp more efficient. The reason is that regexp quantifiers are greedy by default. The initial .* starts by matching the whole string, and the lookaheads then fail. The regexp backtracks, matching the first .* to all but the last character, and at most one of the lookaheads can succeed on that single character. This proceeds until it reaches a point where all the lookaheads succeed. Dropping the first .* and putting the lookaheads immediately after the start-of-string anchor avoids this problem, and in this case the set of strings matched is the same.
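The points above can be tried out with a small sketch. The class name LookaheadDemo and its constants are hypothetical; the patterns are the one from the question, the same pattern without the leading .*, and a variant whose regex really contains \\d (a literal backslash followed by d).

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // Java string literal from the question; "\\d" here is the regex \d.
    static final Pattern ORIGINAL =
        Pattern.compile("^.*(?=.*\\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$");
    // Same pattern with the leading .* dropped.
    static final Pattern ANCHORED =
        Pattern.compile("^(?=.*\\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$");
    // The regex itself contains \\d: a literal backslash followed by d.
    static final Pattern LITERAL_BACKSLASH =
        Pattern.compile("^.*(?=.*\\\\d)(?=.*[a-zA-Z])(?=.*[!##$%^&]).*$");

    static boolean ok(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1Test#!", "a1a!", "Test#!", "\\dTest#!"};
        for (String s : samples) {
            System.out.println(s + " original=" + ok(ORIGINAL, s)
                + " anchored=" + ok(ANCHORED, s)
                + " literalBackslash=" + ok(LITERAL_BACKSLASH, s));
        }
    }
}
```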

Splitting a string in java on more than one symbol

I want to split a string wherever any of the following symbols occurs: "+,-,*,/,="
I am using the split function, but it can take only one argument. Moreover, it is not working on "+".
I am using following code:-
Stringname.split("Symbol");
Thanks.
String.split takes a regular expression as argument.
This means you can combine several symbols or text patterns in a single argument in order to split your String.
See documentation here.
Here's an example in your case:
String toSplit = "a+b-c*d/e=f";
String[] splitted = toSplit.split("[-+*/=]");
for (String split: splitted) {
    System.out.println(split);
}
Output:
a
b
c
d
e
f
Notes:
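Reserved characters in a Pattern must be escaped with a backslash, which is written as \\ in a Java string literal. Edit: that is not needed here, because +, * and = are literal inside a character class.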
The [] brackets in the pattern indicate a character class.
More on Patterns here.
You can use a regular expression:
String[] tokens = input.split("[+*/=-]");
Note: - should be placed in first or last position to make sure it is not considered as a range separator.
You need a regular expression. Additionally, you need the regex OR operator:
String[] tokens = Stringname.split("\\+|\\-|\\*|\\/|\\=");
For that, you need to use an appropriate regex. Several of the symbols you listed are reserved in regex, so you'll have to escape them with \.
A very baseline expression would be \+|\-|\/|\*|\=. It is relatively easy to understand: each symbol you want is escaped with \, and the symbols are separated by the | (or) operator. If, for example, you wanted to add ^ as well, all you would need to do is append |\^ to that expression. In a Java string literal, each \ has to be written as \\.
For testing and quick expressions, I like to use www.regexpal.com
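To compare the approaches from the answers, here is a minimal sketch. The class and method names (SplitDemo and its helpers) are hypothetical; the input string is the one used in the example above.

```java
import java.util.Arrays;
import java.util.regex.PatternSyntaxException;

class SplitDemo {
    // Character class: the dash goes first so it is not read as a range.
    static String[] splitWithClass(String s) {
        return s.split("[-+*/=]");
    }

    // Alternation: + and * are metacharacters outside a class and need \\.
    static String[] splitWithAlternation(String s) {
        return s.split("\\+|-|\\*|/|=");
    }

    // Splitting on a bare "+" is rejected as an invalid regex.
    static boolean plusAloneFails() {
        try {
            "a+b".split("+");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String input = "a+b-c*d/e=f";
        System.out.println(Arrays.toString(splitWithClass(input)));
        System.out.println(Arrays.toString(splitWithAlternation(input)));
        System.out.println(plusAloneFails());
    }
}
```

The bare "+" case is probably what the question ran into: the argument is compiled as a regex, and a quantifier with nothing before it is a syntax error.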

How to replace brackets in strings

I have a list of strings that contains tokens.
Token is:
{ARG:token_name}.
I also have hash map of tokens, where key is the token and value is the value I want to substitute the token with.
When I use "replaceAll" method I get error:
java.util.regex.PatternSyntaxException: Illegal repetition
My code is something like this:
myStr.replaceAll(valueFromHashMap , "X");
and valueFromHashMap contains { and }.
I get this hashmap as a parameter.
String.replaceAll() works on regexps. {n,m} is usually repetition in regexps.
Try to use \\{ and \\} if you want to match literal brackets.
So replacing all opening brackets by X works that way:
myString.replaceAll("\\{", "X");
See here to read about regular expressions (regexps) and why { and } are special characters that have to be escaped when using regexps.
As others already said, { is a special character used in the pattern (} too).
You have to escape it to avoid any confusion.
Escaping those manually can be dangerous (you might omit one and make your pattern go completely wrong) and tedious (if you have a lot of special characters).
The best way to deal with this is to use Pattern.quote()
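A minimal sketch of that approach, assuming the map keys are complete tokens such as {ARG:token_name}. The class name TokenDemo, the method substitute and the sample map entries are made up for illustration. Matcher.quoteReplacement is used as well, so that a $ or \ in a replacement value is not treated as a group reference.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenDemo {
    // Replaces every token (map key) in text with its mapped value.
    static String substitute(String text, Map<String, String> tokens) {
        String result = text;
        for (Map.Entry<String, String> e : tokens.entrySet()) {
            // Pattern.quote makes { } and other metacharacters literal.
            String literalToken = Pattern.quote(e.getKey());
            // quoteReplacement keeps $ and \ in the value literal.
            String literalValue = Matcher.quoteReplacement(e.getValue());
            result = result.replaceAll(literalToken, literalValue);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> tokens = new LinkedHashMap<>();
        tokens.put("{ARG:token_name}", "X");
        tokens.put("{ARG:price}", "$5");
        System.out.println(substitute("Hello {ARG:token_name}, cost {ARG:price}", tokens));
    }
}
```

If no regex features are needed at all, String.replace(CharSequence, CharSequence) also replaces every occurrence and treats both arguments literally.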
Related issues:
How to escape a square bracket for Pattern compilation
How to escape text for regular expression in Java
Resources:
Oracle.com - JavaSE tutorial - Regular Expressions
replaceAll() takes a regular expression as a parameter, and { is a special character in regular expressions. In order for the regex to treat it as a regular character, it must be escaped by a \, which must be escaped again by another \ in order for Java to accept it. So you must use \\{.
You can remove the curly brackets in a single .replaceAll() call by putting them in a character class (square brackets):
String newString = originalString.replaceAll("[{}]", "");
e.g. newString = "ARG:token_name"
If you want to further separate newString into key and value, you can use .split():
String[] arrayString = newString.split(":");
You can then use arrayString[0] and arrayString[1] with your HashMap's .put().

Java regular expression: how to include '-'

I am using this pattern and matching a string.
String s = "//name:value /name:value";
if (s.matches("(//?\\s*\\w+:\\w+\\s*)+")) {
// it fits
}
This works properly.
But if I want to have a string like "/name-or-address:value/name-or-address:value" which has this '-' in second part, it doesn't work.
I am using \w to match A-Za-z0-9_, but how can I include - in that?
Use [\w-] to combine both \w and -.
Note that - should always be at the beginning or end of a character class; otherwise it will be interpreted as defining a range of characters (for instance, [a-z] is the range of characters from a to z, whereas [az-] is the three characters a, z, and -).
I don't know if it answers your question, but why not replace \w+ with (\w|-)+ or [\w-]+?
[-\w] (Or in a string, [-\\w].)
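A minimal sketch of how this could be applied to the pattern from the question, assuming the dash only needs to be allowed in the name part before the colon. The class name NameValueDemo and its methods are hypothetical.

```java
class NameValueDemo {
    // Pattern from the question: names may only contain \w characters.
    static boolean fitsOriginal(String s) {
        return s.matches("(//?\\s*\\w+:\\w+\\s*)+");
    }

    // Same pattern with [\\w-] for the name, so dashes are accepted there.
    static boolean fitsWithDash(String s) {
        return s.matches("(//?\\s*[\\w-]+:\\w+\\s*)+");
    }

    public static void main(String[] args) {
        String plain = "//name:value /name:value";
        String dashed = "/name-or-address:value/name-or-address:value";
        System.out.println(fitsOriginal(plain));  // true
        System.out.println(fitsOriginal(dashed)); // false
        System.out.println(fitsWithDash(plain));  // true
        System.out.println(fitsWithDash(dashed)); // true
    }
}
```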
How about
if (s.matches("/(/|\\w|-|:\\w)+")) {
