How to match the first character in a String with a regexp? - java

I need a regular expression to evaluate if the first character of a word is a lowercase letter or not.
I have this Java code: Character.toString(character).matches("[a-z?]")
For example if I have those words the result would be:
a13 => true
B54 => false
&32 => false
I want to match only one letter and I don't know if I need to use "?", "." or "{1}" after or inside "[a-z]"

There is a built-in way to do this without regexes.
Character.isLowerCase(string.charAt(0))
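As a minimal sketch, the two approaches can be put side by side on the example words from the question. The class and method names (FirstCharCheck, startsWithLowercaseRegex, startsWithLowercase) are made up for illustration. Note that the two are not strictly equivalent: Character.isLowerCase is Unicode-aware, while [a-z] only covers ASCII letters.

```java
import java.util.regex.Pattern;

public class FirstCharCheck {
    // Compiled once; [a-z] matches exactly one ASCII lowercase letter.
    private static final Pattern STARTS_LOWER = Pattern.compile("^[a-z]");

    // Regex version: find() searches the input, and ^ pins the match to the start.
    public static boolean startsWithLowercaseRegex(String s) {
        return STARTS_LOWER.matcher(s).find();
    }

    // Non-regex version; the isEmpty() guard avoids an exception on empty input.
    public static boolean startsWithLowercase(String s) {
        return !s.isEmpty() && Character.isLowerCase(s.charAt(0));
    }

    public static void main(String[] args) {
        for (String word : new String[] {"a13", "B54", "&32"}) {
            System.out.println(word + " => " + startsWithLowercaseRegex(word)
                    + " / " + startsWithLowercase(word));
        }
    }
}
```

Both methods return true for a13 and false for B54 and &32.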

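Use this pattern: ^[a-z] (the /.../ delimiters are regex-literal syntax from other languages; in Java, pass "^[a-z]" to Pattern.compile and call find(), because String.matches() has to match the whole string).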

You want to check that the string starts with exactly one lowercase letter. As #Luiggi Medonza stated, you do not really need regular expressions for this, but if you want to use them, you most likely want this pattern:
[a-z]{1}
The ? quantifier makes the preceding element optional, and inside a character class, as in [a-z?], it is just a literal question mark. You want a strict match of length 1, which is what {1} expresses.
#Ted Hopp mentioned that you don't need the {1}, since a character class already matches exactly one character. Because matches() has to match the entire string, the full call should look like this:
entire_string.matches("^[a-z].*$")
Again, the built-in character methods are faster and easier to read.

I had a similar requirement: the first character of the string should be a letter (a-z or A-Z); after that, the user can type anything from a limited set, such as letters, digits and a few symbols.
Solution
public static boolean designationValidate(String n) {
    int l = n.length();
    if (l >= 4) {
        Pattern pattern = Pattern.compile("^[a-zA-Z][a-zA-Z0-9-() ]*$");
        Matcher matcher = pattern.matcher(n);
        return (matcher.find() && matcher.group().equals(n));
    } else
        return false;
}
In the example above I validate that the input is at least 4 characters long and starts with a letter. If you want to allow other symbols, add them to the second character class.
The method returns true if the expression matches; otherwise it returns false.
I hope this is helpful.

Related

RegExp pattern for a String which contain 0 and 4-9(4 ,5,6,7,8,9)

I am dealing with a string. The use case is that I don't want a string which contains 0 or any digit from 4 to 9.
Example:
ABC0123 -> Not Valid
XYZ002456789 -> Not Valid
ABC123 -> Valid
ABC1 -> Valid
I have tried the pattern below, without success.
String pattern = "^[0,4-9]+$";
if(str.matches(pattern)){
//do something.
}
First, remove the comma from the character class. You're not looking for commas.
Since you're disallowing, don't anchor the expression, allow the match anywhere in the string. In fact, matches anchors the expression for you, so we have to intentionally allow characters before and after the disallowed character class:
String pattern = ".*[04-9].*";
if(str.matches(pattern)){
// disallow
}
Alternately, you can avoid having those .* in there by using Pattern.compile and then using the resulting Pattern instead of matches, since it won't automatically anchor the pattern like matches does.
It is much easier to match strings that contain 0 or 4-9 than to match those that don't. So you should just write a regex like this:
[4-90]
And call find, then invert the result:
if (!Pattern.compile("[4-90]").matcher(someString).find()) {
// ...
}
Another option could be to use a negated character class and add what you don't want to match. In this case you could add 0 and a range from 4-9 and if you don't want to match a carriage return or a newline you could add those as well.
^[^04-9\\r\\n]+$
Note that a comma inside the character class would match a literal comma.
String pattern = "^[^04-9\\r\\n]+$";
if(str.matches(pattern)){
//do something.
}
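A small sketch comparing the "find and invert" approach with the negated character class on the strings from the question. The class and method names (DigitFilter, isValidByFind, isValidByNegatedClass) are hypothetical. One difference worth knowing: the negated class with + rejects the empty string, while the inverted find() accepts it.

```java
import java.util.regex.Pattern;

public class DigitFilter {
    // Matches any single forbidden digit: 0 or 4 through 9.
    private static final Pattern FORBIDDEN = Pattern.compile("[04-9]");

    // Valid when no forbidden digit occurs anywhere in the string.
    public static boolean isValidByFind(String s) {
        return !FORBIDDEN.matcher(s).find();
    }

    // Valid when every character is outside the forbidden set (and not a line break).
    // matches() already anchors at both ends, so ^ and $ are not needed here.
    public static boolean isValidByNegatedClass(String s) {
        return s.matches("[^04-9\\r\\n]+");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ABC0123", "XYZ002456789", "ABC123", "ABC1"}) {
            System.out.println(s + " -> " + (isValidByFind(s) ? "Valid" : "Not Valid"));
        }
    }
}
```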

Regex for finding between 1 and 3 character in a string

I am trying to write a regex which should return true if a character from [A-Za-z] occurs between 1 and 3 times, but I am not able to get it working.
public static void main(String[] args) {
    String regex = "(?:([A-Za-z]*){3}).*";
    String regex1 = "(?=((([A-Za-z]){1}){1,3})).*";
    Pattern pattern = Pattern.compile(regex);
    System.out.println(pattern.matcher("AD1CDD").find());
}
Note: I am able to write it for 3 consecutive characters, but what I want is for the number of occurrences to be between 1 and 3 across the entire string. If there are 4 such characters, it should return false. I have used a look-ahead to try to achieve this.
If I understand your question correctly, you want to check if
1 to 3 characters of the range [a-zA-Z] are in the string
Any other character can occur arbitrarily often?
First of all, just counting the characters and not using a regular expression is more efficient, as this is not a regular language problem, but a trivial counting problem. There is nothing wrong with using a for loop for this problem (except that interpreters such as Python and R can be fairly slow).
Nevertheless, you can (ab-) use extended regular expressions:
^([^A-Za-z]*[A-Za-z]){1,3}[^A-Za-z]*$
This is fairly straightforward, once you also model the "other" characters. And that is what you should do to define a pattern: model all accepted strings (i.e. the entire "language"), not only those characters you want to find.
Alternatively, you can "findAll" matches of ([A-Za-z]), and look at the length of the result. This may be more convenient if you also need the actual characters.
The for loop would look something like this:
public static boolean containsOneToThreeAlphabetic(String str) {
    int matched = 0;
    // length() is a method on String, not a field
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) matched++;
    }
    return matched >= 1 && matched <= 3;
}
This is straightforward, readable, extensible, and efficient (in compiled languages). You can also add an if (matched>=4) return false; (or a break) to stop early.
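The "findAll" alternative mentioned above could look roughly like this in Java, where repeated calls to Matcher.find() play the role of findAll. This is only a sketch; the class and method names (LetterCount, countLetters, hasOneToThreeLetters) are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LetterCount {
    // One match per ASCII letter.
    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");

    // Counts matches by calling find() until it reports no further match.
    public static int countLetters(String s) {
        Matcher m = LETTER.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static boolean hasOneToThreeLetters(String s) {
        int n = countLetters(s);
        return n >= 1 && n <= 3;
    }

    public static void main(String[] args) {
        System.out.println(hasOneToThreeLetters("AD1CDD")); // five letters
        System.out.println(hasOneToThreeLetters("A1B2"));   // two letters
    }
}
```

For the sample input AD1CDD from the question this counts five letters, so the check returns false.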
Please, stop playing with regex, you'll complicate not only your own life, but the life of the people, who have to handle your code in the future. Choose a simpler approach, find all [A-Za-z]+ strings, put them into the list, then check every string, if the length is within 1 and 3 or beyond that.
Regex
/([A-Za-z])(?=(?:.*\1){3})/s
Looking for a char and for 3 repetitions of it. So if it matches there are 4 or more equal chars present.

Java Regex ? (expr){num} confusion?

I'm trying to identify strings which contain exactly one integer.
That is exactly one string of contiguous digits e.g. "1234" (no dots, no commas).
So I thought this should do it: (This is with the Java String Escapes included):
(\\d+){1,}
So the "\d+" correctly matches a string of contiguous digits. (right?)
I included this expression as a sub-expression within "(" and ")" and then I'm trying to say "only one of these sub-expressions".
Here's the result of ( matcher.find() ) of checking various strings:
(note the regex from now on is 'raw' here - NOT Java String Escaped).
Pattern: (\d+){1,}
Input String               Result
1                          true
XX-1234                    true
do-not-match-no-integers   false
do-not-match-1234-567      true
do-not-match-123-456       true
It seems the '1' in the pattern is applying to the "\d+" string, rather than the number of those contiguous strings.
Because if I change the number from 1 to 4, I can see the result change to the following:
Pattern: (\d+){4,}
Input String               Result
1                          false
XX-1234                    true
do-not-match-no-integers   false
do-not-match-1234-567      true
do-not-match-123-456       false
What am I missing here ?
Out of interest - if I take off the "(" and ")" altogether - I'm getting a different result again
Pattern: \d+{4,}
Input String               Result
1                          true
XX-1234                    true
do-not-match-no-integers   false
do-not-match-1234-567      true
do-not-match-123-456       true
Matcher.find() looks for a match anywhere inside the String. Try Matcher.matches() instead to check whether the pattern matches the whole string.
In that case, the pattern you need is \d+
EDIT:
Seems that I misunderstood the question. One way to find if the String has only one integer, using the same pattern is:
int matchCounter = 0;
// matcher is the Matcher instance for \d+; stop once a second run of digits is found
while (matchCounter < 2 && matcher.find()) {
    matchCounter++;
}
return matchCounter == 1;
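For reference, a self-contained sketch of this counting idea, run against the inputs from the question. The class and method names (SingleInteger, hasExactlyOneInteger) are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SingleInteger {
    // Each match is one maximal run of contiguous digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True when the string contains exactly one run of digits.
    public static boolean hasExactlyOneInteger(String s) {
        Matcher m = DIGITS.matcher(s);
        int runs = 0;
        // Stop early: a second run already decides the answer.
        while (runs < 2 && m.find()) {
            runs++;
        }
        return runs == 1;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "1", "XX-1234", "do-not-match-no-integers",
            "do-not-match-1234-567", "do-not-match-123-456"
        };
        for (String s : inputs) {
            System.out.println(s + " " + hasExactlyOneInteger(s));
        }
    }
}
```

Only the first two inputs contain exactly one run of digits, so only they return true.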
This is the regex:
^[^\d]*\d+[^\d]*$
That's zero or more non-digits, followed by a substring of digits and then zero or more non-digits again until the end of the string. Here is the Java code (with escaped backslashes):
class MainClass {
    public static void main(String[] args) {
        String regex = "^[^\\d]*\\d+[^\\d]*$";
        System.out.println("1".matches(regex)); // true
        System.out.println("XX-1234".matches(regex)); // true
        System.out.println("XX-1234-YY".matches(regex)); // true
        System.out.println("do-not-match-no-integers".matches(regex)); // false
        System.out.println("do-not-match-1234-567".matches(regex)); // false
        System.out.println("do-not-match-123-456".matches(regex)); // false
    }
}
You can use the RegEx ^\D*?(\d+)\D*?$
^\D*? makes sure there are no digits between the start of your line and your first group
(\d+) matches your digits
\D*?$ makes sure there are no digits between your first group and the end of your line
So, for your Java String, it would be : ^\\D*?(\\d+)\\D*?$
I think you will have to make sure your regex considers the entire string, using ^ and $.
To do that, you could match zero or more non-digits, followed by 1 or more digits, and then zero or more non-digits.
The following should do the trick:
^[^\d]*(\d+)[^\d]*$
Here it is on regex101.com: https://regex101.com/r/CG0RiL/2
Edit: As pointed out by Veselin Davidov my regex isn't correct.
If I understand you right, you want it to return true only when the entire String matches the pattern, yes?
Then you have to call matcher.matches().
Also, I think your pattern must be just \d+.
If you have problems with regex, I can recommend https://regex101.com/; it explains why something matches and gives you a quick preview.
I use it every time I have to write a regex.

Negative Lookaround Regex - Only one occurrence - Java

I am trying to find out whether a string contains only one occurrence of a word,
e.g.
String : `jjdhfoobarfoo` , Regex : `foo` --> false
String : `wewwfobarfoo` , Regex : `foo` --> true
String : `jjfffoobarfo` , Regex : `foo` --> true
Multiple foos may occur anywhere in the string, so they can be non-consecutive.
I tested the following regex in Java with the string foobarfoo, but it doesn't work: it returns true.
static boolean testRegEx(String str){
    return str.matches(".*(foo)(?!.*foo).*");
}
I know this topic may seem like a duplicate, but I am surprised because when I use this regex: (foo)(?!.*foo).* it works!
Any idea why this happens?
Use two anchored look-aheads:
static boolean testRegEx(String str){
    return str.matches("^(?=.*foo)(?!.*foo.*foo.*$).*");
}
The key points are that the negative look-ahead, which checks for two foos, is anchored to the start of the input and, importantly, contains an end-of-input anchor.
If you want to check if a string contains another string exactly once, here are two possible solutions, (one with regex, one without)
static boolean containsRegexOnlyOnce(String string, String regex) {
    Matcher matcher = Pattern.compile(regex).matcher(string);
    return matcher.find() && !matcher.find();
}
static boolean containsOnlyOnce(String string, String substring) {
    int index = string.indexOf(substring);
    if (index != -1) {
        return string.indexOf(substring, index + substring.length()) == -1;
    }
    return false;
}
All of them work fine. Here's a demo of your examples:
String str1 = "jjdhfoobarfoo";
String str2 = "wewwfobarfoo";
String str3 = "jjfffoobarfo";
String foo = "foo";
System.out.println(containsOnlyOnce(str1, foo)); // false
System.out.println(containsOnlyOnce(str2, foo)); // true
System.out.println(containsOnlyOnce(str3, foo)); // true
System.out.println(containsRegexOnlyOnce(str1, foo)); // false
System.out.println(containsRegexOnlyOnce(str2, foo)); // true
System.out.println(containsRegexOnlyOnce(str3, foo)); // true
You can use this pattern:
^(?>[^f]++|f(?!oo))*foo(?>[^f]++|f(?!oo))*$
It's a bit long but performant.
The same with the classical example of the ashdflasd string:
^(?>[^a]++|a(?!shdflasd))*ashdflasd(?>[^a]++|a(?!shdflasd))*$
details:
(?>            # open an atomic group
    [^f]++     # all characters but f, one or more times (possessive)
  |            # OR
    f(?!oo)    # f not followed by oo
)*             # close the group, zero or more times
The possessive quantifier ++ is like the greedy quantifier + but doesn't allow backtracking.
The atomic group (?>..) is like a non-capturing group (?:..) but doesn't allow backtracking either.
These features are used here for performance (memory and speed), but the subpattern can be replaced by:
(?:[^f]+|f(?!oo))*
The problem with your regex is that the first .* initially consumes the whole string, then backs off until it finds a spot where the rest of the regex can match. That means, if there's more than one foo in the string, your regex will always match the last one. And from that position, the lookahead will always succeed as well.
Regexes that you use for validating have to be more precise than the ones you use for matching. Your regex is failing because the .* can match the sentinel string, 'foo'. You need to actively prevent matches of foo before and after the one you're trying to match. Casimir's answer shows one way to do that; here's another:
"^(?>(?!foo).)*+foo(?>(?!foo).)*+$"
It's not quite as efficient, but I think it's a lot easier to read. In fact, you could probably use this regex:
"^(?!.*foo.*foo).+$"
It's a great deal more inefficient, but a complete regex n00b would probably figure out what it does.
Finally, notice that none of these regexes--mine or Casimir's--uses lookbehinds. I know it seems like the perfect tool for the job, but no. In fact, lookbehind should never be the first tool you reach for. And not just in Java. Whatever regex flavor you use, it's almost always easier to match the whole string in the normal way than it is to use lookbehinds. And usually much more efficient, too.
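The point about the greedy leading .* can be checked with a short sketch: with the original regex, group 1 ends up on the last foo, so the lookahead never sees another one. The class and method names (GreedyDemo, capturedFooIndex, onlyOneFoo) are made up for illustration; the second method uses the tempered pattern from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // The regex from the question: the leading .* is free to swallow earlier foos.
    private static final Pattern ORIGINAL = Pattern.compile(".*(foo)(?!.*foo).*");

    // The stricter pattern from this answer: no foo may appear before or after the match.
    private static final Pattern STRICT =
            Pattern.compile("^(?>(?!foo).)*+foo(?>(?!foo).)*+$");

    // Index where group 1 matched, or -1 if the whole string did not match.
    public static int capturedFooIndex(String s) {
        Matcher m = ORIGINAL.matcher(s);
        return m.matches() ? m.start(1) : -1;
    }

    public static boolean onlyOneFoo(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Group 1 lands on the second foo (index 6), not the first (index 0).
        System.out.println(capturedFooIndex("foobarfoo"));
        System.out.println(onlyOneFoo("jjdhfoobarfoo")); // two foos
        System.out.println(onlyOneFoo("wewwfobarfoo"));  // one foo
    }
}
```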
Someone answered the question but then deleted the answer.
The following short code works correctly:
static boolean testRegEx(String str){
    return !str.matches("(.*?foo.*){0}|(.*?foo.*){2,}");
}
Any idea how to invert the result inside the regex itself?

Need regex to match the given string

I need a regex to match a particular substring, say 1.4.5, in the string below. My string will look like this:
absdfsdfsdfc1.4.5kdecsdfsdff
I have a regex which is giving [c1.4.5k] as an output. But I want to match only 1.4.5. I have tried this pattern:
[^\\W](\\d\\.\\d\\.\\d)[^\\d]
But no luck. I am using Java.
Please let me know the pattern.
If I read your expression [^\\W](\\d\\.\\d\\.\\d)[^\\d] correctly, you want a word character before the match and no digit after it. Is that correct?
For that you can use lookbehind and lookahead assertions. These assertions only check their condition without consuming any characters, so what they test is not included in the result.
(?<=\\w)(\\d\\.\\d\\.\\d)(?!\\d)
Because of that, you can remove the capturing group. You are also repeating yourself in the pattern, you can simplify that, too:
(?<=\\w)\\d(?:\\.\\d){2}(?!\\d)
Would be my pattern for that. (The ?: is a non capturing group)
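A minimal sketch of how this pattern could be applied to the sample string from the question. The class and method names (VersionExtract, firstMatch) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionExtract {
    // Lookbehind and lookahead only test the neighbours; they are not part of the match.
    private static final Pattern VERSION =
            Pattern.compile("(?<=\\w)\\d(?:\\.\\d){2}(?!\\d)");

    // Returns the first d.d.d occurrence, or null when there is none.
    public static String firstMatch(String s) {
        Matcher m = VERSION.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("absdfsdfsdfc1.4.5kdecsdfsdff")); // 1.4.5
    }
}
```

Because the surrounding c and k are only asserted, group() returns just 1.4.5 and no capturing group is needed.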
Your requirements are vague. Do you need to match a series of exactly 3 numbers with exactly two dots?
[0-9]+\.[0-9]+\.[0-9]+
Which could be written as
([0-9]+\.){2}[0-9]+
Do you need to match x numbers, separated by x-1 dots in between?
([0-9]+\.)+[0-9]+
Use look ahead and look behind.
(?<=c)[\d\.]+(?=k)
Where c is the character that would be immediately before the 1.4.5 and k is the character immediately after 1.4.5. You can replace c and k with any regular expression that would suit your purposes
I think this one should do it : ([0-9]+\\.?)+
Regular Expression
((?<!\d)\d(?:\.\d(?!\d))+)
As a Java string:
"((?<!\\d)\\d(?:\\.\\d(?!\\d))+)"
String str = "absdfsdfsdfc**1.4.5**kdec456456.567sdfsdff22.33.55ffkidhfuh122.33.44";
String regex = "[0-9]{1}\\.[0-9]{1}\\.[0-9]{1}";
Matcher matcher = Pattern.compile(regex).matcher(str);
if (matcher.find())
{
    String year = matcher.group(0);
    System.out.println(year);
}
else
{
    System.out.println("no match found");
}
