I want to know whether there is a way to check that a given string contains only a combination of letters and numbers and nothing else.
For just alphanumeric I can use http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringUtils.html
For just numbers I can use Integer.parseInt or a regular expression.
But is there a library that does the combined check? I googled but didn't come across anything. Everywhere it's done separately.
Alphanumeric means "Only letters and/or digits"
StringUtils.isAlphanumeric(String str) does what you want
someString.matches("[A-Za-z0-9]+")
[a-zA-Z0-9]+
This matches a run of ASCII letters and digits.
You can use a regular expression to match your String:
someString.matches("[a-zA-Z0-9]+");
This matches at least one character (+; if the empty string should be valid too, use * instead), each of which is either a digit from 0 to 9 or an uppercase or lowercase ASCII letter (no Unicode letters, just A-Z and a-z).
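To make the difference between the ASCII-only regex and a Unicode-aware check concrete, here is a minimal sketch. The class and method names (AlnumCheck, isAsciiAlphanumeric, isUnicodeAlphanumeric) are made up for illustration, and the second method is only one possible way to accept non-ASCII letters using the JDK's Character.isLetterOrDigit.

```java
final class AlnumCheck {
    // ASCII-only check: one or more of A-Z, a-z, 0-9 (hypothetical helper).
    static boolean isAsciiAlphanumeric(String s) {
        return s != null && s.matches("[A-Za-z0-9]+");
    }

    // Unicode-aware variant: every code point must be a letter or a digit.
    // Rejecting the empty string here is an assumption, chosen to mirror '+'.
    static boolean isUnicodeAlphanumeric(String s) {
        return s != null && !s.isEmpty()
                && s.codePoints().allMatch(Character::isLetterOrDigit);
    }

    public static void main(String[] args) {
        System.out.println(isAsciiAlphanumeric("abc123"));  // true
        System.out.println(isAsciiAlphanumeric("abc 123")); // false, contains a space
        System.out.println(isAsciiAlphanumeric(""));        // false, '+' needs one character
    }
}
```

If the empty string should count as valid, swap + for * in the regex and drop the isEmpty() check.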
Problem Description
I have the string "Վիկտոր1 Ափոյան2" and, using a regular expression, I want to get the first letter of each word, so the result would be "ՎԱ". As the string is Unicode, I'm using the following regex:
"(\\p{L})\\p{L}*\\s(\\p{L})\\p{L}*"
This works fine if the string does not contain the numbers "1" and "2". To handle them I also tried the following regex:
"(\\p{L}\\p{N})\\p{L}\\p{N}*\\s(\\p{L}\\p{N})\\p{L}\\p{N}*"
but this does not work correctly.
Is there something like "\\p{LN}" that checks for Unicode letters and numbers at the same time, or does anyone know how I can solve this issue?
Is there something like "\p{LN}" that checks for Unicode letters and numbers at the same time
Use a character class [\p{L}\p{N}], which matches either a Unicode letter or a Unicode number.
Alternatively use \p{Alnum} with a Pattern.UNICODE_CHARACTER_CLASS flag (or prepend the pattern with (?U)).
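As a rough sketch of how the character class could be applied to the two-word case from the question: the class name Initials and its method are hypothetical, and the input is written with Unicode escapes (the first two characters of each Armenian word plus a digit) so the source stays ASCII.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Initials {
    // A "word" is a run of Unicode letters/numbers. Groups 1 and 2 capture
    // the first character of each of two whitespace-separated words.
    private static final Pattern TWO_WORDS = Pattern.compile(
            "([\\p{L}\\p{N}])[\\p{L}\\p{N}]*\\s+([\\p{L}\\p{N}])[\\p{L}\\p{N}]*");

    // Returns the two initials, or null when the input is not exactly two words.
    static String initials(String s) {
        Matcher m = TWO_WORDS.matcher(s);
        return m.matches() ? m.group(1) + m.group(2) : null;
    }

    public static void main(String[] args) {
        // Two Armenian letters followed by a digit, twice, separated by a space.
        System.out.println(initials("\u054E\u056B1 \u0531\u05632"));
        System.out.println(initials("Ab1 Cd2")); // AC
    }
}
```

The original attempt failed because (\p{L}\p{N}) means "a letter followed by a number", whereas the bracketed class means "a letter or a number".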
I want to write a regular expression in Java that meets the following criteria:
Length: 10 characters exactly. Not more, not less.
Can accept any character between A-Z (only uppercase letters) and between digits 0-9.
Can accept only one dash character '-' in any position. It cannot accept any other characters, strictly only one dash.
EXAMPLES:
ABCD-12345
F-01234GHK
09-PL89GG5
LJ8U9N3-Y2
PLN86D4V-1
I have been making tries with regex of my own invention, some regular expressions that are close to the result I want, but with no success.
Do I have to combine two regular expressions?
Please, help me to get rid of this issue.... and thanks in advance!!!
I think you need lookahead (which is a way of combining two regular expressions, sort of).
^(?!.*-.*-)[A-Z0-9-]{10}$
The second part will match 10 characters that are A-Z, 0-9, or dash; the first part is negative lookahead that will reject a pattern that has two dashes in it.
You can use this:
^(?![^-]*+-[^-]*+-)[A-Z0-9-]{10}$
Note: If you use the matches method you can remove anchors.
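One thing worth checking with either lookahead pattern is the zero-dash case: a negative lookahead that rejects two dashes still lets a string with no dash through. If "only one dash" is meant as "exactly one", a positive lookahead could be used instead. The sketch below compares both readings; the class name DashCode and the stricter pattern are my own illustration, not something from the answers.

```java
import java.util.regex.Pattern;

final class DashCode {
    // Pattern from the answers (anchors dropped because matches() is used):
    // ten characters of A-Z, 0-9 or '-', rejected if two dashes appear.
    static final Pattern AT_MOST_ONE_DASH =
            Pattern.compile("(?!.*-.*-)[A-Z0-9-]{10}");

    // Assumed stricter reading: the lookahead demands exactly one dash.
    static final Pattern EXACTLY_ONE_DASH =
            Pattern.compile("(?=[^-]*-[^-]*$)[A-Z0-9-]{10}");

    static boolean atMostOneDash(String s) {
        return AT_MOST_ONE_DASH.matcher(s).matches();
    }

    static boolean exactlyOneDash(String s) {
        return EXACTLY_ONE_DASH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(atMostOneDash("ABCD-12345"));  // true
        System.out.println(atMostOneDash("ABCDE12345"));  // true, no dash is accepted
        System.out.println(exactlyOneDash("ABCDE12345")); // false
        System.out.println(exactlyOneDash("AB-D-12345")); // false, two dashes
    }
}
```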
I'm trying to check whether an input contains only capital letters, numbers and periods using a regex. What would the regex pattern be for this in Java?
Are there any guides on how I can build this regex, or even some online tools?
Also, is it possible to check that the length of the string is no more than 50 using a regex?
This is the Unicode answer:
^[\p{Lu}\p{Nd}.]{0,50}$
From regular-expressions.info
\p{Lu} or \p{Uppercase_Letter}: an uppercase letter that has a lowercase variant.
\p{Nd} or \p{Decimal_Digit_Number}: a digit zero through nine in any script except ideographic scripts.
^ and $ are the start and the end of the string
Regex pattern:
Pattern.compile("^[A-Z\\d.]*$")
To check the length of a string:
Pattern.compile("^.{0,50}$")
Both combined:
Pattern.compile("^[A-Z\\d.]{0,50}$")
Although I wouldn't use regular expressions to check for length if I were you, just call .length() on the string.
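A small sketch of that split, with the regex handling the allowed characters and .length() handling the limit. The names UpperDigitDot, MAX_LENGTH and isValid are placeholders, and accepting the empty string is an assumption carried over from the {0,50} patterns above.

```java
import java.util.regex.Pattern;

final class UpperDigitDot {
    private static final int MAX_LENGTH = 50;

    // Only capital ASCII letters, digits and periods; any length.
    private static final Pattern ALLOWED = Pattern.compile("[A-Z\\d.]*");

    // Length is checked with length(), the character set with the regex.
    static boolean isValid(String s) {
        return s != null
                && s.length() <= MAX_LENGTH
                && ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("ABC.123")); // true
        System.out.println(isValid("abc"));     // false, lowercase letters
        System.out.println(isValid("A B"));     // false, contains a space
    }
}
```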
This website is really handy for building and testing regular expressions
Regular expressions in Java have a lot in common with other languages when it comes to the simple syntax, with some predefined character classes that add more than you'd find in Perl for example. The Java API docs on Pattern show the various patterns that are supported. A friendlier introduction to regexes in Java is http://www.regular-expressions.info/java.html.
Some very quick Googling shows there are many tools online for testing Java regular expressions against input strings. Here is one.
To check for the type of input you are interested in, the following regex should work:
^[A-Z0-9.]{0,50}$
Broken down, this is saying:
^: start matching from the start of the input; do not allow the first character(s) to be skipped
[]: match one of the characters in this class
A-Z: within a character class, - means to accept all values between the first and last character inclusive, so in this case all characters from A to Z.
0-9: add all digits to the class
.: periods are special in regexes, but inside a character class ([]) a period is just a literal character
{0,50}: match the character class just defined between 0 and 50 times (Java requires the lower bound to be written explicitly, so {,50} is rejected as a syntax error)
$: the match must reach the end of the input; do not allow the last character(s) to be skipped
This returns true for strings of at most 50 characters, each of which is a digit, a capital letter or a dot.
string.matches("[0-9A-Z\\.]{0,50}")
In response to what tools you can use, I prefer Regex Coach
I tried using this pattern
^[A-z]*[A-z,-, ]*[A-z]*
To match against a string that starts with multiple alpha characters (a-z) followed by multiple hyphens or spaces and ends with alpha characters, eg:
Azasdas- - sa-as
But it does not work.
Try ^[A-Za-z][A-Za-z -]*[A-Za-z]$
^ anchors the match at the start of the string, which must begin with a letter (A-Z or a-z), followed by any number of letters, spaces or hyphens. $ anchors the match at the end of the string, which must end with a letter.
Also, you should not be using A-z because this will include unintended characters from ASCII range 91 to 96. See this table
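A quick way to see the A-z problem for yourself is the following sketch; the class and method names are invented for the demonstration.

```java
final class RangePitfall {
    // [A-z] spans ASCII 65 to 122, which also covers [ \ ] ^ _ and `.
    static boolean looseLetter(String s) {
        return s.matches("[A-z]");
    }

    // [A-Za-z] lists the two letter ranges separately.
    static boolean strictLetter(String s) {
        return s.matches("[A-Za-z]");
    }

    public static void main(String[] args) {
        System.out.println(looseLetter("_"));  // true, although '_' is not a letter
        System.out.println(strictLetter("_")); // false
        System.out.println(strictLetter("q")); // true
    }
}
```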
Don't use ',' (comma)
^[A-z]*[A-z- ]*[A-z]*
You don't want the commas. In a character class you also need to write [A-Za-z\- ] rather than A-z, because A-Z and a-z aren't contiguous in ASCII. You're missing some allowable spaces, and your last expression needs to account for the hyphen.
You need something closer to this:
^([A-Za-z]*)-\s*([A-Za-z][A-Za-z -]*)([A-Za-z-]*)$
Depending on how you actually want to break things up. Without knowing the context behind the "chunks", it may or may not just be easier to split it apart on hyphens.
Edit
Actually, it's more like:
^([A-Za-z]*)([- ]*)([A-Za-z-]*)$
This is a word, followed by arbitrary spaces and hyphens, followed by a word that may contain a hyphen.
The currently accepted answer (^[A-Za-z][A-Za-z-]*[A-Za-z]$) will only match strings that are at least two characters long--for example, it will match the string "AB", but not just "A" or "B". Compare that to this regex:
^[A-Za-z]+([ -]+[A-Za-z]+)*$
By grouping the [ -]+ and the second [A-Za-z]+ together I'm saying, if there are any spaces and/or hyphens, they must be followed by more letters. The * quantifier on the group makes it optional, so "A" will match, while still meeting the requirement that the string start and end with a letter.
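Here is a short sketch that runs this grouped pattern against the string from the question and a few edge cases. The class name WordRuns is hypothetical, and the anchors are left out only because String.matches already requires the whole string to match.

```java
final class WordRuns {
    // Letters, then zero or more groups of (separators followed by letters),
    // so the string always starts and ends with a letter.
    static boolean isValid(String s) {
        return s.matches("[A-Za-z]+([ -]+[A-Za-z]+)*");
    }

    public static void main(String[] args) {
        System.out.println(isValid("Azasdas- - sa-as")); // true
        System.out.println(isValid("A"));                // true, single letter
        System.out.println(isValid("A-"));               // false, ends with a hyphen
        System.out.println(isValid("-A"));               // false, starts with a hyphen
    }
}
```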
An example of how the Strings may look:
TADE000177
TADE007,daFG
TADE0277 DFDFG
It's a little unclear what you want.
If you mean four capital letters from A to Z, followed by at least one digit in 0-9 you could try this:
"^[A-Z]{4}[0-9]+"
If instead of capital letters you want to allow any character except newline, change [A-Z] to . (a single dot).
If you want to also allow zero digits change the + to a *.
Exactly four upper case letters followed by 1 or more digits: [A-Z]{4}\d+
Remember to escape the backslash if you put it in a string literal.
Breakdown:
[A-Z]…: An upper case letter, equivalent to \p{Upper}
To also include lower case letters, you could instead use [A-Za-z] or \p{Alpha}
…{4}… exactly 4 times
…\d… a digit
…+ 1 or more times
To allow 0 digits, you could change to *.
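Since the example strings have extra text after the digits, it may matter whether the whole string or only its beginning has to fit the pattern. The sketch below shows both, using Matcher.matches() and Matcher.lookingAt(); the class name PrefixCode and its methods are made up, and which behaviour is wanted is an assumption because the question doesn't say.

```java
import java.util.regex.Pattern;

final class PrefixCode {
    // Four capital letters followed by one or more digits.
    private static final Pattern CODE = Pattern.compile("[A-Z]{4}\\d+");

    // The whole string must have that shape.
    static boolean isCode(String s) {
        return CODE.matcher(s).matches();
    }

    // Only the beginning of the string must have that shape.
    static boolean startsWithCode(String s) {
        return CODE.matcher(s).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(isCode("TADE000177"));           // true
        System.out.println(isCode("TADE007,daFG"));         // false, trailing text
        System.out.println(startsWithCode("TADE007,daFG")); // true
    }
}
```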
If I understood correctly what you're asking for, you can try: .{4}\d*
^\w{4}.*$
Matches a string starting with 4 word characters (letters, digits or underscore) followed by any number of any other characters.
Your examples include spaces and punctuation. If you know exactly which characters are allowed, you might want to use this pattern instead:
^\w{4}[A-Za-z\d<other known characters go here>]*$
Remember to remove the < and > too :)