Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I need a Regex expression for the following:
9845d530c7594ab45e8b905bbff
It should always start with 984, followed by a UUID, and the maximum length should be 27. Here the UUID is 5d530c7594ab45e8b905bbff. I know the pattern for the UUID, but I am not sure how to combine everything into one expression.
For starting with 984 it should be
^984
And for a specific length it should be
\d{27}
But I am not sure about the UUID (which here should be case insensitive).
^984[0-9a-fA-F]{24}$
A quick explanation:
^984 must begin with "984"
[...]{24}$ 24 characters that match the given character set, then end
[0-9a-fA-F] a character set that includes any number 0-9, character a-f or character A-F
You can also use the character class \d for the numeric portion (it matches a single numeric digit), but I like to be explicit, because otherwise my brain hurts. Character classes are useful if you're running against an unknown character set that might have multiple representations for a number. For instance, depending on the regex engine and its flags, \d might match the Arabic-Indic digits (٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩) or the Devanagari digits (० १ २ ३ ४ ५ ६ ७ ८ ९).
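To make the point about \d concrete for Java specifically: as far as the java.util.regex documentation goes, \d is equivalent to [0-9] by default and only matches digits from other scripts when the Pattern.UNICODE_CHARACTER_CLASS flag is set. The sketch below illustrates that; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and its flag come from the JDK.
class DigitClassDemo {
    // Default mode: \d behaves like [0-9].
    static final Pattern ASCII_DIGITS = Pattern.compile("\\d+");
    // With UNICODE_CHARACTER_CLASS, \d matches decimal digits from any script.
    static final Pattern UNICODE_DIGITS =
            Pattern.compile("\\d+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean asciiOnly(String s) {
        return ASCII_DIGITS.matcher(s).matches();
    }

    static boolean anyScript(String s) {
        return UNICODE_DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Arabic-Indic digits one, two, three, written as Unicode escapes.
        String arabicIndic = "\u0661\u0662\u0663";
        System.out.println(asciiOnly("123"));        // true
        System.out.println(asciiOnly(arabicIndic));  // false
        System.out.println(anyScript(arabicIndic));  // true
    }
}
```

So in Java, being explicit with [0-9] and using plain \d give the same result unless that flag is switched on.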
There are also regex "modifiers" that allow case insensitivity without having to specify the upper and lower case in the character set. It does depend on which regex implementation you use.
Java's built-in regular expressions library has a modifier flag:
Pattern.compile("^984[0-9a-f]{24}$", Pattern.CASE_INSENSITIVE);
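As a rough, self-contained sketch of how this could be used: the pattern keeps the {24} quantifier from the explanation above and relies on the case-insensitivity flag instead of listing A-F. The class name PrefixedIdValidator and the method isValid are invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are not from the question.
class PrefixedIdValidator {
    // "984" followed by exactly 24 hex characters, 27 characters in total.
    static final Pattern ID =
            Pattern.compile("^984[0-9a-f]{24}$", Pattern.CASE_INSENSITIVE);

    static boolean isValid(String candidate) {
        return ID.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("9845d530c7594ab45e8b905bbff")); // true (sample from the question)
        System.out.println(isValid("9845D530C7594AB45E8B905BBFF")); // true (upper case is accepted)
        System.out.println(isValid("9855d530c7594ab45e8b905bbff")); // false (wrong prefix)
        System.out.println(isValid("9845d530c7594ab45e8b905bb"));   // false (too short)
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call.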
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I'm working on a regex for getting a specific number pattern from the URL string.
Requirements: The desired string should start with - or /, followed by a sequence of digits, and end with a / or with nothing.
I tried [-\/](\d+)(\/|$), but in www.abc.com/pages/Toms-1777/14623420046, for example, I want /14623420046 (i.e. the second digit sequence), whereas my regex returns -1777/. I tried a negative lookbehind but could not make any progress. I'm new to all this. Please guide.
Test cases: (with matched pattern)
www.abc.com/pages/Essen-Massage-Therapy-LLC/130561253629638
www.abc.com/biz/finn-mccools-santa-monica-2
www.abc.com/summerset.gardens.7
www.abc.com/pages/Toms-1777/14623420046
www.abc.com/pages/The-Clean-Masters/1403753595526512
www.abc.com/24hfsheepsheadbay
www.abc.com/sample2NVCoolSpace
www.abc.com/pages/Jet-Set-3920/542495615847409
www.abc.com/temp.buildings.77
www.abc.com/2423423453534temp/2312312312312312312
www.abc.com/Ptemp-Gtemp-Dtemp-189398324428792/temp
You want that $ in either case. Instead of 'slash OR end', it's more 'an optional slash and then a very much non-optional end'. So: /?$. You normally don't need to escape slashes, especially in Java regexes:
Pattern.compile("[-/](\\d+)/?$")
This reads: From minus or slash, a bunch of digits, then 0 or 1 slashes, then the end. Note, use find() and not matches() - matches only works if the entire string matches, which it won't, as the - or / occurs halfway through.
EDIT: Was missing a backslash in the java string.
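A minimal sketch of the find() approach, run against a few of the URLs from the question. The class name TrailingNumberFinder and the method trailingNumber are invented here, and returning null for "no match" is just one possible convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper built around the pattern from the answer above.
class TrailingNumberFinder {
    static final Pattern TRAILING_NUMBER = Pattern.compile("[-/](\\d+)/?$");

    // Returns the digits captured by group 1, or null when nothing matches.
    static String trailingNumber(String url) {
        Matcher m = TRAILING_NUMBER.matcher(url);
        // find() searches for the pattern anywhere in the input;
        // matches() would require the whole URL to match.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(trailingNumber("www.abc.com/pages/Toms-1777/14623420046"));     // 14623420046
        System.out.println(trailingNumber("www.abc.com/biz/finn-mccools-santa-monica-2")); // 2
        System.out.println(trailingNumber("www.abc.com/summerset.gardens.7"));             // null
        System.out.println(trailingNumber("www.abc.com/24hfsheepsheadbay"));               // null
    }
}
```

The -1777/ candidate is rejected because the $ anchor cannot match in the middle of the string, so the search moves on to the final digit sequence.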
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
In a Java app, I use this regex: (\w+)_\d to match patterns of this form:
apples_1
oranges_2
and then I use the first capturing group value (apples, oranges).
However, I now have a new request to also match these strings:
applesdrp_1
orangesdrp_2
where 'drp' is a fixed 3 character string, and the same values as before need to be captured: apples, oranges
So for example, if I use this regex: (\w+)(?:drp)?_\d
it will do the work on apples_1, but not for applesdrp_1.
Is there a way to do that with a regex?
You can use a non-greedy quantifier:
(\w+?)(?:drp)?_\d
This way, \w+? takes as few characters as possible, stopping as soon as it finds "drp_N" or "_N" (where N is a digit).
If you use a greedy quantifier, \w+ takes all possible characters (including the underscore and the digit, since they are included in \w) and then gives back characters one by one until (?:drp)?_\d succeeds. But since (?:drp)? is optional, the regex engine stops backtracking as soon as it finds _N, leaving "drp" inside the capturing group.
Yes, you can. One way is to use a negative lookbehind to make sure that drp, if present, is forced outside the group:
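The difference between the two quantifiers can be checked with a small sketch like the following. The class LazyPrefixDemo and its capture helper are hypothetical names, and matches() is used on the assumption that each input is a single token such as applesdrp_1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo comparing the lazy and the greedy variants.
class LazyPrefixDemo {
    static final Pattern LAZY = Pattern.compile("(\\w+?)(?:drp)?_\\d");
    static final Pattern GREEDY = Pattern.compile("(\\w+)(?:drp)?_\\d");

    // Returns group 1 if the whole input matches, otherwise null.
    static String capture(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(capture(LAZY, "apples_1"));       // apples
        System.out.println(capture(LAZY, "applesdrp_1"));    // apples
        System.out.println(capture(GREEDY, "applesdrp_1"));  // applesdrp
    }
}
```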
(\w+)(?<!drp)(?:drp)?_\d+
See https://regex101.com/r/jJ1rM4/3 for a demo
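For completeness, here is a short Java sketch of the lookbehind variant. The class name LookbehindPrefixDemo and the method baseName are assumptions made for the example; the pattern itself is the one quoted above, with the backslashes doubled for a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the negative-lookbehind pattern.
class LookbehindPrefixDemo {
    // (?<!drp) rejects any split point where group 1 would end in "drp",
    // so the greedy \w+ has to backtrack until "drp" falls outside the group.
    static final Pattern NAME =
            Pattern.compile("(\\w+)(?<!drp)(?:drp)?_\\d+");

    // Returns group 1 if the whole input matches, otherwise null.
    static String baseName(String input) {
        Matcher m = NAME.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(baseName("apples_1"));      // apples
        System.out.println(baseName("applesdrp_1"));   // apples
        System.out.println(baseName("orangesdrp_2"));  // oranges
    }
}
```

Unlike the original \d, this pattern uses \d+, so it also accepts multi-digit suffixes such as apples_12.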
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
My objective is to separate the numbers from the string, but the first position of my array contains an empty string. I need help to keep that from happening.
String str1 = "Y=9x1+29x2";
String[] split2 = str1.split("[^-?.?0-9]+");
The empty string at the start is caused by the non-digit characters at the start of your input.
You can remove all non-digits at the start before splitting:
String linha = "Y=9x1+29x2";
String[] split = linha.replaceFirst("^[^-.\\d]+", "").split("[^-.\\d]+");
for (String tok : split)
    System.out.println(tok);
Output:
9
1
29
2
I think your question is rather vague, but after looking at it, I'm guessing that you want to extract the numbers out of the string, where a "number" has this format: an optional minus sign, followed by an optional decimal point, followed by one or more digits. I suspect you also want to include numbers that have digits followed by a decimal point followed by more digits.
I'm guessing this is what you want, because of the ? you put in your regex. The problem is that inside square brackets, ? doesn't mean "optional", and it doesn't mean "zero or one of something". It means a question mark. The regex [^-?.?0-9] means "match one character that is not a digit, a period, a hyphen, or a question mark". A pattern in square brackets always matches one character, and you tell it what characters are OK (or, if you begin with ^, what characters are not OK). This kind of "character set" pattern never matches a sequence of characters. It just looks at one character at a time. If you put + after the pattern, it still looks at one character at a time; it just does so repeatedly.
I think what you're trying to do is to take a pattern that represents a number, and then say "look for something that doesn't look like that pattern", and you tried to do it by using [^...]. That simply will not work.
In fact, split() is the wrong tool for this job. The purpose of split is to break up a string whose delimiters match a given pattern. Using it when the strings you want to keep in the array match a given pattern doesn't work very well, unless the pattern is extremely simple. I recommend that you create a Matcher and use the find() method in a loop. find() is set up so that it can find all matching substrings of a string if you call it repeatedly. This is what you want to accomplish, so it's the right tool.
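A minimal sketch of the Matcher/find() loop described above. The number format (optional minus sign, then either digits with an optional fractional part or a leading decimal point followed by digits) is the guess stated earlier in this answer, not something confirmed by the question, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that collects every number-like substring.
class NumberExtractor {
    // Optional minus, then "digits[.digits]" or ".digits".
    static final Pattern NUMBER =
            Pattern.compile("-?(?:\\d+(?:\\.\\d+)?|\\.\\d+)");

    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        // Each call to find() resumes where the previous match ended.
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractNumbers("Y=9x1+29x2"));   // [9, 1, 29, 2]
        System.out.println(extractNumbers("a-3.5b.25c-7")); // [-3.5, .25, -7]
    }
}
```

Because the pattern describes what to keep rather than what to throw away, there is no empty first element to deal with.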
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Regular expression in java:
'String'.replaceAll("([aeioucgjkqsxyzbfpvwdtmn1234567890])\\1+", "$1")
Can someone explain what the different characters do?
Explanation:
[aeioucgjkqsxyzbfpvwdtmn1234567890] Matches a single character in the list.
([aeioucgjkqsxyzbfpvwdtmn1234567890]) Capturing group around the char class would capture that single character.
\1+ \1 is a backreference to whatever group 1 captured. In our case the group captures a single character, so \1 matches that same character again. \1+ means one or more further occurrences of that character.
For Example:
aaaa
The above regex captures the first character and checks whether the following one or more characters are the same as the captured one. If so, the whole run of duplicates is replaced by the single character held in group 1; that is, aaaa is replaced by a single a.
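The behaviour described above can be tried out with a short sketch like this one. The class name RepeatCollapser and the method collapse are made up; the regex and the replacement string are the ones from the question.

```java
// Hypothetical wrapper around the replaceAll call from the question.
class RepeatCollapser {
    static final String REPEATED =
            "([aeioucgjkqsxyzbfpvwdtmn1234567890])\\1+";

    // Replaces each run of a repeated listed character with one copy ($1).
    static String collapse(String input) {
        return input.replaceAll(REPEATED, "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("aaaa"));       // a
        System.out.println(collapse("bookkeeper")); // bokeper
        System.out.println(collapse("1000"));       // 10
        // 'l' is not in the character class, so "ll" is left alone.
        System.out.println(collapse("hello"));      // hello
    }
}
```

Note that the character class is lower case only and omits h, l and r, so runs of those letters (or of upper-case letters) are not collapsed.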
Every character listed between the brackets is captured as group 1. In the Java string literal, \\1 becomes the regex \1, which is a backreference to whatever group 1 captured (not a literal backslash followed by a one). The plus sign (+) means one or more repetitions.
So any run of two or more identical characters from the brackets [...] is replaced with $1, i.e. a single copy of that character.
For instance, replacing with an empty string instead removes each such run entirely:
System.out.println(Str.replaceAll("([aeioucgjkqsxyzbfpvwdtmn1234567890])\\1+", ""));
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I need to match any letter with a regex (like the ^$ special character in MS Office Word's Find).
I've tried [a-zA-Z], but it doesn't match Unicode letters such as accented letters or ä, ö, ü, ß.
I've also tried [a-zA-ZäöüßÄÖÜ], but there are too many letters to list.
Is there a regex that matches all of these letters?
This \\p{L} regex would match any kind of letter from any language.
To match any unicode letter in Java use:
\\p{L}
You can use \\p{L} to match any letter, Unicode included.
For fine-tuned matching, you can consult the documentation on filefront, and combine it with the Unicode features documented in Java Pattern here.
Quick example
String input = "ZäöüßÄÖÜß您好";
System.out.println(input.matches(String.format("\\p{L}{%d}", input.length())));
Output
true
It seems you want to match not just any letter (which would include, e.g., Arabic characters), but Latin characters only:
\p{IsLatin}+
Using your chars:
System.out.println("ZäöüßÄÖÜ".matches("\\p{IsLatin}+")); // true
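To contrast the two suggestions in this thread, the sketch below runs \p{L} and \p{IsLatin} over the same inputs. The class and method names are hypothetical, and the non-ASCII test strings are written as Unicode escapes only so that the file compiles regardless of source encoding.

```java
// Hypothetical demo contrasting "any letter" with "Latin script only".
class LetterClassDemo {
    // \p{L}: any Unicode letter, regardless of script.
    static boolean allLetters(String s) {
        return s.matches("\\p{L}+");
    }

    // \p{IsLatin}: characters belonging to the Latin script.
    static boolean allLatin(String s) {
        return s.matches("\\p{IsLatin}+");
    }

    public static void main(String[] args) {
        // "Z" followed by a-umlaut, o-umlaut, u-umlaut and sharp s.
        String german = "Z\u00e4\u00f6\u00fc\u00df";
        // The two Chinese characters used in the earlier example.
        String chinese = "\u60a8\u597d";

        System.out.println(allLetters(german));  // true
        System.out.println(allLatin(german));    // true
        System.out.println(allLetters(chinese)); // true
        System.out.println(allLatin(chinese));   // false
        System.out.println(allLetters("abc1"));  // false
    }
}
```

Which of the two fits depends on whether letters from non-Latin scripts should be accepted.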