Convert a Java regex to a PHP regex - java

I have a regular expression in Java: [^a-zA-Z0-9.-_]
How do I convert this regular expression from Java to PHP?

In PHP (PCRE) this regular expression looks like
[^a-zA-Z0-9.-_]
Yep, it's exactly the same

It is the same for this particular regex.
But you can make it shorter with:
[^\w.-]
And don't forget that a literal - should be placed first or last in a character class (or escaped); anywhere else it defines a range.

It's exactly the same, but you might need to put delimiters around it, e.g. parentheses:
([^a-zA-Z0-9._-])
Note the small difference: the minus has moved to the end. That is because [.-_] is a range that matches ./0...9:;<=>?@A...Z[\]^_. I guess you're not looking for the negation of this, as you already had 0-9 and A-Z covered.
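The effect of the accidental range is easy to check from Java, since java.util.regex parses this character class the same way. A minimal sketch; the class and method names are made up for the illustration:

```java
import java.util.regex.Pattern;

class HyphenRangeDemo {
    // ".-_" is read as a range from '.' (0x2E) to '_' (0x5F).
    static final Pattern RANGE = Pattern.compile("[^a-zA-Z0-9.-_]");
    // Hyphen moved to the end, so it is a literal '-'.
    static final Pattern LITERAL = Pattern.compile("[^a-zA-Z0-9._-]");

    // True when the input contains at least one character the class rejects.
    static boolean hasDisallowed(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasDisallowed(RANGE, "a@b"));   // false: '@' lies inside the range
        System.out.println(hasDisallowed(LITERAL, "a@b")); // true
        System.out.println(hasDisallowed(RANGE, "a-b"));   // true: '-' (0x2D) lies outside the range
        System.out.println(hasDisallowed(LITERAL, "a-b")); // false
    }
}
```

So the original class silently lets characters such as @ through and rejects the hyphen it was probably meant to allow.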

Related

How can I make this into a Java regex?

I used regex101 to make my expression, and it looks like this using their symbols
\d+ [+-\/*] \d*
Basically I want a user to enter something like 123 + 123, where the entire statement is one string with exactly one space after the first number and one space after the operator.
The above expression works, but it doesn't work the same way when I copy it into Java.
I thought these symbols were universal, but I guess not. Any ideas how to convert this to the proper syntax?
Regular expressions are not universal. In general, no two regular expression systems are the same.
The Java language itself does not have regular expressions; some Java classes support them. The Pattern class defines the regular expressions that are used by some Java classes, including Matcher, which seems likely to be the class you are using.
As already identified in the comments, \ is the escape-the-next-character character in Java string literals. If you want to represent \ in a String, you must write \\. For example, \d in a regular expression must be written \\d in a Java String.
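As a rough illustration of the doubling rule, here is a sketch that compiles such a pattern from a Java string literal. The character class is a tidied-up variant of the one in the question (hyphen escaped, and \d+ rather than \d* for the second number); both changes are my assumptions about the intent, not something the question states:

```java
import java.util.regex.Pattern;

class BackslashEscapeDemo {
    // Regex as typed into a tester:   \d+ [+\-*/] \d+
    // The same regex as a Java string literal: every backslash is doubled.
    static final Pattern EXPRESSION = Pattern.compile("\\d+ [+\\-*/] \\d+");

    static boolean isExpression(String input) {
        // matches() requires the whole input to fit the pattern.
        return EXPRESSION.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isExpression("123 + 123")); // true
        System.out.println(isExpression("123 +123"));  // false: no space after the operator
    }
}
```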
You can simply use groups () and design a regex as you wish. This regex might be one way to do so:
((\d+\s)(\+|\-)(\s\d+))
It has four groups, and group 1 ($1) captures the entire input. Note that it only accepts + and - as operators.
You can also use \ to escape any characters that the language requires to be escaped.
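Here is a small sketch of how those four groups could be read back in Java with Matcher.group. The helper name is hypothetical; the regex is the one above with its backslashes doubled for the string literal:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // ((\d+\s)(\+|\-)(\s\d+)) written as a Java string literal.
    static final Pattern SUM = Pattern.compile("((\\d+\\s)(\\+|\\-)(\\s\\d+))");

    // Returns groups 1 to 4 in order, or null when the input does not match.
    static String[] groups(String input) {
        Matcher m = SUM.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] g = groups("123 + 456");
        // Brackets make the captured spaces visible.
        for (String part : g) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Note that the spaces end up inside groups 2 and 4, so the operands need a trim() before they can be parsed as numbers.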

Match Lua multiline strings and comments with Regex

I have a Lua editor in which I implemented syntax highlighting. I use regexes to match expressions like strings, comments, tokens, numbers, etc of Lua. The whole thing is made in Java and uses Java regexes. I had trouble with two things:
Multiline strings - Lua multiline strings start with double square brackets [[ and end with ]]. Everything between is the string; there can even be nested multiline strings. You can see what I made here; the regex is \[\[((?>[^\[\[\]\]]|(?R))*\]\]) and it works. It's similar to what you can see on this page under the match balanced constructs section. It finds expressions with equal numbers of [[ and ]]. The thing is, recursion is not supported by the Java regex engine. How can I replace it with something supported?
Multiline comments - Lua multiline comments start with --[====[ and end with ]====]. A comment ends only when the closing bracket has as many equals signs as the opening bracket. There can be any number of equals signs, from zero up. I made this regex --\[\[((.|\n)*?)\]\] but it only works for the --[[ comment ]] pattern and does not support --[==[ comment ]==]. Maybe I could count the equals signs matched at the opening and then match the same number at the closing tag. Is this possible in Java regex? How?
Try this
--\[(=*)\[(.|\n)*?\]\1\]
Multiline string literals are absolutely the same but without leading --:
\[((=*)\[(.|\n)*?)\]\2\]
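A sketch of the comment pattern in Java, relying on the backreference \1 to require the same run of equals signs at both ends. One deliberate change from the regex above: it uses .*? with Pattern.DOTALL instead of (.|\n)*?, which matches the same text but avoids the deep recursion the alternation can trigger on long inputs. The class and method names are invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LuaLongCommentDemo {
    // "--[" then zero or more '=' (group 1), "[", a lazy body (group 2),
    // then "]" followed by the same '=' run (\1) and a final "]".
    static final Pattern LONG_COMMENT =
            Pattern.compile("--\\[(=*)\\[(.*?)\\]\\1\\]", Pattern.DOTALL);

    // Returns the body of the first long comment, or null if there is none.
    static String commentBody(String source) {
        Matcher m = LONG_COMMENT.matcher(source);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // The inner "]]" does not close the comment because the level differs.
        System.out.println(commentBody("x = 1 --[==[ a ]] b ]==] y")); // prints " a ]] b "
    }
}
```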

Add Dash to Java Regex

I am trying to modify an existing regex that is pulled in from a properties file by a Java program that someone else built.
The current regex used to match an email address is -
RR.emailRegex=^[a-zA-Z0-9_\\.]+@[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
That matches email addresses such as abc.xyz@example.com, but now some email addresses have dashes in them, such as abc-def.xyz@example.com, and those are failing the pattern match.
What would my new regex be to add the dash to the match, or is there a better way to represent that?
Based on the regex you are using, you can add the dash to each character class. Change
RR.emailRegex=^[a-zA-Z0-9_\\.]+@[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
to
RR.emailRegex=^[a-zA-Z0-9_\\.-]+@[a-zA-Z0-9_-]+\\.[a-zA-Z0-9_-]+$
Btw, you can shorten your regex like this:
RR.emailRegex=^[\\w.-]+@[\\w-]+\\.[\\w-]+$
Anyway, I would use Apache EmailValidator instead, like this:
if (EmailValidator.getInstance().isValid(email)) ....
The meaning of - inside a character class is different from its meaning elsewhere. Inside a character class, - denotes a range, e.g. 0-9. If you want to include a literal -, write it at the beginning or end of the character class, like [-0-9] or [0-9-].
You also don't need to escape . inside a character class, because it is treated as a literal . there.
Your regex can be simplified further. \w denotes [A-Za-z0-9_]. So you can use
^[-\w.]+@[\w]+\.[\w]+$
In Java, this can be written as
^[-\\w.]+@[\\w]+\\.[\\w]+$
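To see the difference the leading hyphen makes, here is a minimal comparison of the original pattern and the simplified one, using @ as the separator between local part and domain. The class and helper names are placeholders:

```java
import java.util.regex.Pattern;

class EmailDashDemo {
    // Original pattern: no hyphen allowed anywhere.
    static final Pattern OLD =
            Pattern.compile("^[a-zA-Z0-9_\\.]+@[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$");
    // Simplified pattern: a leading '-' in the first class is a literal hyphen.
    static final Pattern NEW = Pattern.compile("^[-\\w.]+@[\\w]+\\.[\\w]+$");

    static boolean accepts(Pattern p, String email) {
        return p.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts(OLD, "abc-def.xyz@example.com")); // false
        System.out.println(accepts(NEW, "abc-def.xyz@example.com")); // true
        System.out.println(accepts(NEW, "abc@my-host.com"));         // false: domain class has no '-'
    }
}
```

The last line shows that this simplified version only allows the dash before the @; the domain classes would need a hyphen as well if such addresses should pass.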
^[a-zA-Z0-9_\\.\\-]+@[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
This should solve your problem. In a regex you need to escape anything that has a special meaning to the regex engine (e.g. -, ?, *, etc.).
The correct regex fix is below.
OLD regex
^[a-zA-Z0-9_\\.]+@[a-zA-Z0-9_]+\\.[a-zA-Z0-9_]+$
NEW regex
^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+$
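Since the pattern lives in a properties file, the number of backslashes matters: java.util.Properties treats \ as its own escape character, so \\ in the file becomes a single \ in the loaded value, while a single \ before an ordinary character is silently dropped. A small sketch with a shortened, made-up pattern (a\.b) under the RR.emailRegex key:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesBackslashDemo {
    // Parses properties-file text and returns the value stored under key.
    static String valueOf(String fileText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // File line:  RR.emailRegex=a\\.b   (two backslashes in the file)
        System.out.println(valueOf("RR.emailRegex=a\\\\.b", "RR.emailRegex")); // a\.b
        // File line:  RR.emailRegex=a\.b    (one backslash in the file)
        System.out.println(valueOf("RR.emailRegex=a\\.b", "RR.emailRegex"));   // a.b
    }
}
```

In the second case the regex engine receives an unescaped dot, which matches any character rather than a literal dot.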
Actually, I read this post; it covers all special cases, so the best one that works correctly with Java is
String pattern ="(?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])";

Regular expression for format XXXXXXXX_YZZZZ

I am trying to write a regular expression in Java which will validate the following format -
XXXXXXXX_YZZZZ
where
X – alphanumeric characters(8 characters)
Y - alpha character
Z - numeric characters
What I have tried for the first part is ^[a-zA-Z0-9 ]*$ but I don't see how to handle the second part.
Can anyone tell me the correct regex for the required format?
You forgot to specify the counts and the underscore, I assume...
/^[a-z0-9]{8}_[a-z][0-9]{4}$/i
Look at JavaDoc, then you can translate your requirements to:
"^\\p{Alnum}{8}_\\p{Alpha}\\p{Digit}{4}$"
It uses predefined character classes, like you listed in your question.
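A quick way to compare the candidates is to run them side by side. The sketch below checks the POSIX-class version against an explicit-range version and against a \w shorthand version (which, as another answer on this page points out, also accepts the underscore). Method names are invented for the example:

```java
class FormatCheckDemo {
    // POSIX character classes as listed in the Pattern JavaDoc.
    static boolean posix(String s) {
        return s.matches("^\\p{Alnum}{8}_\\p{Alpha}\\p{Digit}{4}$");
    }

    // Explicit ranges, equivalent to the POSIX version for ASCII input.
    static boolean explicit(String s) {
        return s.matches("^[a-zA-Z0-9]{8}_[a-zA-Z][0-9]{4}$");
    }

    // \w also matches '_', so it is looser than "alphanumeric".
    static boolean shorthand(String s) {
        return s.matches("^\\w{8}_[a-zA-Z]\\d{4}$");
    }

    public static void main(String[] args) {
        String good = "8abba778_a2012";
        String underscoreInX = "8abba_78_a2012";
        System.out.println(posix(good) + " " + explicit(good) + " " + shorthand(good));
        // prints: true true true
        System.out.println(posix(underscoreInX) + " " + explicit(underscoreInX) + " " + shorthand(underscoreInX));
        // prints: false false true
    }
}
```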
How about this?
^[a-zA-Z0-9]{8}_[a-zA-Z][0-9]{4}$
You can also group the results:
^([a-zA-Z0-9]{8})_([a-zA-Z])([0-9]{4})$
so that you can address the X, Y and Z parts individually from the results.
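In Java the parts can then be pulled out of the Matcher. As a variation, this sketch names the groups x, y and z (named groups are available since Java 7) instead of relying on their positions; the names and the helper method are my own choice for the illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedPartsDemo {
    static final Pattern CODE = Pattern.compile(
            "^(?<x>[a-zA-Z0-9]{8})_(?<y>[a-zA-Z])(?<z>[0-9]{4})$");

    // Returns {X, Y, Z}, or null when the input does not have the expected format.
    static String[] parts(String input) {
        Matcher m = CODE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("x"), m.group("y"), m.group("z") };
    }

    public static void main(String[] args) {
        String[] p = parts("8abba778_a2012");
        System.out.println(p[0] + " | " + p[1] + " | " + p[2]); // 8abba778 | a | 2012
    }
}
```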
Try this regex:
^[A-Za-z\d]{8}_[A-Za-z]\d{4}$
Your regex matches zero or more alphanumeric characters and/or spaces.
This is a good place to learn regex : http://www.regular-expressions.info
Try this regular expression
^[a-zA-Z0-9]{8}[_][a-zA-Z][0-9]{4}$
Try:
^[a-zA-Z0-9]{8}_[a-zA-Z][0-9]{4}$
Regexper is your friend here.
^[a-zA-Z0-9]{8}_[a-zA-Z][0-9]{4}$
In Java, you can use metacharacters to express regular expressions:
"8abba778_a2012".matches("^\\w{8}_[a-z]\\d{4}$");
[EDIT]: Following @Jon Dvorak's comment, I am correcting my answer. In fact, \w is too generous, as it also matches the underscore character _. The correct answer:
"8abba778_a2012".matches("^[a-zA-Z0-9]{8}_[a-z]\\d{4}$");

Simple regex required

I've never used regexes in my life and by jove it looks like a deep pool to dive into. Anyway,
I need a regex for this pattern (AN is alphanumeric (a-z or 0-9), N is numeric (0-9) and A is alphabetic (a-z)):
AN,AN,AN,AN,AN,N,N,N,N,N,N,AN,AN,AN,A,A
That's five AN's, followed by six N's, followed by three AN's, followed finally by two A's.
If it makes a difference, the language I'm using is Java.
[a-z0-9]{5}[0-9]{6}[a-z0-9]{3}[a-z]{2}
should work in most RE dialects for the tasks as you specified it -- most of them will also support abbreviations such as \d (digit) in lieu of [0-9] (but if alphabetics need to be lowercase, as you appear to be requesting, you'll probably need to spell out the a-z parts).
Replace each AN by [a-z0-9], each N by [0-9], and each A by [a-z].
30 seconds in Expresso:
[a-zA-Z0-9]{5}[0-9]{6}[a-zA-Z0-9]{3}[a-zA-Z]{2}
This version is case insensitive, but you can probably set that with a flag in Java instead of in the regex.
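For what it's worth, the flag in question is Pattern.CASE_INSENSITIVE. A minimal sketch that compiles the lowercase-only pattern with and without it (the class and constant names are placeholders):

```java
import java.util.regex.Pattern;

class CaseFlagDemo {
    static final String REGEX = "[a-z0-9]{5}[0-9]{6}[a-z0-9]{3}[a-z]{2}";
    // Lowercase letters only.
    static final Pattern STRICT = Pattern.compile(REGEX);
    // Same regex, but letters match regardless of case.
    static final Pattern RELAXED = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(STRICT, "AB12C123456X9ZQQ"));  // false
        System.out.println(matches(RELAXED, "AB12C123456X9ZQQ")); // true
    }
}
```

The inline form (?i) at the start of the regex has the same effect if the pattern has to stay a plain string.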
If the commas in the example you posted are literal, the following should work.
([A-Za-z\d],){5}(\d,){6}([A-Za-z\d],){3}[A-Za-z],[A-Za-z]
In Java you should be able to use it like this:
boolean foundMatch = subjectString.matches("([A-Za-z\\d],){5}(\\d,){6}([A-Za-z\\d],){3}[A-Za-z],[A-Za-z]");
I used this tool to help in learning regex; it also makes this really easy.
http://www.regexbuddy.com/
Try looking at some simple java regex tutorials such as this
They'll tell you how to form regular expressions and also how to use them in Java.
This should match the pattern you request.
[a-z0-9]{5}[0-9]{6}[a-z0-9]{3}[a-z]{2}
In addition, you could add beginning-of-string and end-of-string anchors, if the match should fail when the string contains any other characters:
^[a-z0-9]{5}[0-9]{6}[a-z0-9]{3}[a-z]{2}$
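Whether the anchors are needed in Java depends on which method is called: Matcher.find() looks for the pattern anywhere in the input, while matches() (and String.matches) always requires the whole input to match. A short sketch with made-up names:

```java
import java.util.regex.Pattern;

class AnchorDemo {
    static final Pattern UNANCHORED =
            Pattern.compile("[a-z0-9]{5}[0-9]{6}[a-z0-9]{3}[a-z]{2}");
    static final Pattern ANCHORED =
            Pattern.compile("^[a-z0-9]{5}[0-9]{6}[a-z0-9]{3}[a-z]{2}$");

    // find() succeeds if the pattern occurs anywhere in the input.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        String padded = "--ab12c123456x9zqq--";
        System.out.println(found(UNANCHORED, padded));             // true: extra characters are ignored
        System.out.println(found(ANCHORED, padded));               // false
        System.out.println(UNANCHORED.matcher(padded).matches());  // false: matches() is implicitly anchored
    }
}
```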
