This question already has answers here:
Java- Extract part of a string between two special characters
(7 answers)
Closed 6 days ago.
Let's say I have a string like this:
#hello there¥
#im bojack¥
#from bojack horseman¥
%some middle text%
& Hello I'm todd ₹
# meow meow ¥
# im pink cat ¥
% some 2nd middle text %
& Some other text ₹
# Bow Bow im dog ¥
I basically want to match everything from the last ¥ to the following &, including those two characters, to get something like
¥ %some middle text% & with regex in Java.
By ignoring repetitions I mean ignoring the earlier duplicate ¥ characters in the text above and matching only from the last ¥, because every pattern I've tried matches from the first ¥ to &.
You can use a negative lookbehind:
(?<!#)(\¥.*&)
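This will match ¥ followed by any characters up to &, as long as the ¥ is not preceded by a #. Note that .* is greedy, so it runs to the last & it can reach, and that . does not match line breaks unless the DOTALL flag is set.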
Would you please try:
¥[^¥&]+&
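As a rough sketch of how the pattern above behaves on the sample text, the following collects every match. The class and method names are made up for illustration, and the yen and rupee signs are written as the escapes \u00A5 and \u20B9 only to keep the source ASCII-safe. Because the negated class excludes ¥, a match can only start at the last ¥ before each &.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastYenToAmpersand {
    // U+00A5 is the yen sign. A negated character class also matches
    // line breaks, so no DOTALL flag is needed here.
    static final Pattern SPAN = Pattern.compile("\u00A5[^\u00A5&]+&");

    // Returns every span that starts at a yen sign and ends at the next '&'
    // with no other yen sign in between.
    static List<String> findSpans(String text) {
        List<String> spans = new ArrayList<>();
        Matcher m = SPAN.matcher(text);
        while (m.find()) {
            spans.add(m.group());
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "#hello there\u00A5",
                "#im bojack\u00A5",
                "#from bojack horseman\u00A5",
                "%some middle text%",
                "& Hello I'm todd \u20B9",
                "# meow meow \u00A5",
                "# im pink cat \u00A5",
                "% some 2nd middle text %",
                "& Some other text \u20B9",
                "# Bow Bow im dog \u00A5");
        // Two spans are found, one per middle-text block.
        for (String span : findSpans(text)) {
            System.out.println("[" + span + "]");
        }
    }
}
```

The matched spans include the line breaks between ¥ and &, so strip or replace them if a single-line result is wanted.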
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
I recently started learning regex and I am not quite sure how the following regex works:
str.replaceAll("(\\w)(\\w*)", "$2$1ay");
This allows us to do the following:
input string: "Hello World !"
return string: "elloHay orldWay !"
From what I know: \w is supposed to match all word characters, including 0-9 and underscore, and $ matches at the end of the string.
In the replaceAll method, the first parameter is treated as a regex. Every match of that regex in the string is replaced with the second parameter.
In simple cases replaceAll works like this:
String str = "I,am,a,person";
str.replaceAll(",", " "); // returns "I am a person"
It matched all the commas and replaced them with a space.
In your case, each match is one word character (\w: a letter, digit, or underscore) followed by a run of zero or more word characters (\w*).
The () around each part groups it. So you have two groups: the first character and the remaining part. If you use regex101 or a similar website you can see a visualization of this.
Your replacement is $2 (the second group, the remaining part), followed by $1 (the first group, the first character), followed by ay.
Hope this clears it up for you.
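If you would rather inspect the groups from Java than from a website, a small sketch like the following prints what each group captured for every match. The class and method names are invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Lists the text captured by group 1 and group 2 for each match of (\w)(\w*).
    static String describeGroups(String input) {
        Matcher m = Pattern.compile("(\\w)(\\w*)").matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group())
              .append(": $1=").append(m.group(1))
              .append(", $2=").append(m.group(2))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints:
        // Hello: $1=H, $2=ello
        // World: $1=W, $2=orld
        System.out.print(describeGroups("Hello World !"));
    }
}
```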
Enclosing part of a regex in parentheses () makes it a capturing group.
Here you have 2 capturing groups: (\w) captures a single word character, and (\w*) captures zero or more.
$1 and $2 refer to the captured groups, first and second respectively.
Also, replaceAll handles each match (here, each word) individually.
So in this example, for 'Hello', 'H' is the first captured group and 'ello' is the second. The match is replaced by a reordered version, $2$1, which swaps the captured groups.
So '$2$1ay' gives 'elloHay'.
The same for the next word also.
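A minimal runnable version of the replacement from the question, wrapped in a helper whose name is made up here, could look like this:

```java
class PigLatin {
    // Moves the first word character of each word to the end and appends "ay".
    static String toPigLatin(String s) {
        return s.replaceAll("(\\w)(\\w*)", "$2$1ay");
    }

    public static void main(String[] args) {
        System.out.println(toPigLatin("Hello World !")); // elloHay orldWay !
    }
}
```

Because \w* may match nothing, a one-letter word such as "a" still matches, with an empty second group, and becomes "aay".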
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
Hi there, I'm new to Java and was going through some information on regex, and I couldn't comprehend the following expression:
"^[a-zA-Z\-]+$"
Could someone be kind enough to explain each and every character in this expression?
Thank you.
^           $  # Check if the entire string matches,
 [        ]+   # with one or more of the following characters:
  a-z          # Any lowercase (ASCII) letter
     A-Z       # Any uppercase (ASCII) letter
        \-     # Or an "-" (the `\` is used to escape it)
Or in short: this regex checks if a given string consists solely of (ASCII) letters and/or -, and is non-empty.
[a-zA-Z] means any character from a through z or A through Z, inclusive.
The "\" inside the square brackets is used as an escape character, so the - after it is a literal hyphen rather than a range operator.
The "+" after the character class means the class must match one or more times.
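To try the expression from Java, note that the backslash has to be doubled inside a string literal: "\-" is not a valid Java escape, so the pattern is written "^[a-zA-Z\\-]+$". The sketch below uses a made-up class and method name.

```java
import java.util.regex.Pattern;

class LettersAndHyphens {
    // In Java source the backslash itself must be escaped, so \- becomes "\\-".
    static final Pattern ONLY_LETTERS_OR_HYPHEN = Pattern.compile("^[a-zA-Z\\-]+$");

    // True when s is non-empty and consists only of ASCII letters and hyphens.
    static boolean isValid(String s) {
        return ONLY_LETTERS_OR_HYPHEN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Mary-Jane")); // true
        System.out.println(isValid("abc1"));      // false (digit)
        System.out.println(isValid(""));          // false (needs at least one character)
    }
}
```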
This question already has answers here:
Is it possible to build a Pattern based on two sub patterns in Java
(2 answers)
Closed 3 years ago.
I am trying to make it so the user can enter 07XXXXXXXXX or 07XXX XXXXXX. Both should work, but my regex does not seem to be working correctly.
I have tried adding \\d, \\s, \\w and none of that has worked
My code is:
//The space should be between {3} and [0-9]
while (!Scanner.hasNext("07[0-9]{9} | 07[0-9]{3} [0-9]{6}")) {
    System.out.println("\nThe mobile number you have entered is not valid.\nIt should start with 07 and should contain 9 additional digits.");
    Scanner.nextLine();
} //end while
The program should accept both 07XXXXXXXXX and 07XXX XXXXXX, but when I try to include the space, it doesn't accept either option.
"07[0-9]{9}" works for 07XXXXXXXXX on its own, but when I add "| 07[0-9]{3} [0-9]{6}", neither option works.
If you follow the space with a ? (question mark), it means "zero or one of" the space:
07[0-9]{3} ?[0-9]{6}
You can also use a pair of numbers in braces to give a range:
07[0-9]{3} {0,1}[0-9]{6}
If you plan to allow other characters like a hyphen, you can make a character class for those. This allows a space, hyphen or dot:
07[0-9]{3}[-. ]{0,1}[0-9]{6}
07[0-9]{3}[-. ]?[0-9]{6}
I recommend you put the hyphen first in a character class, as elsewhere it can be taken to mean "range", like you have with 0-9. Putting it first means it doesn't define a range. Also, a period normally means "any character", but inside a character class it loses this special meaning and is literally just a period.
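Two further things are worth knowing about the code in the question. The spaces around | in the original pattern are literal, so each alternative demanded an extra space. Also, Scanner.hasNext(String) only tests the next whitespace-delimited token, so a number containing a space can never pass as a single token. The sketch below therefore validates a whole line (for example one obtained from nextLine()) with the last pattern above. The class and method names are assumptions for illustration, and it uses fixed sample strings instead of reading input.

```java
import java.util.regex.Pattern;

class MobileNumber {
    // 07, three digits, an optional hyphen/dot/space, then six digits.
    static final Pattern MOBILE = Pattern.compile("07[0-9]{3}[-. ]?[0-9]{6}");

    // True when the whole line is a mobile number in one of the accepted layouts.
    static boolean isValid(String line) {
        return MOBILE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"07123456789", "07123 456789", "07123-456789", "0712345678"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Only the last sample is rejected, because it is one digit short.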
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
Is there a regex pattern syntax that supports (part1|part2|...) and [part], where:
(part1|part2) will match either part 1 or part 2, e.g. leav(e|ing) matches leave and leaving
[part] is an optional part, e.g. cat[s] will match cat and cats
I also want literal words that must appear in every match, e.g. give cat[s] will match give cat and give cats
\bcats?\b will match both cat and cats, but will not match the cat in cater
\bleav(?:e|ing)\b will match both leave and leaving
\bpart(?:1|2|3)?\b will match part1, part2, part3 or part, but not the part in apart or partner
Explanation
\b // Forces a word boundary so that it does not match in the middle of a word, like part in apart
(?: // Non-capturing group, so that we do not get extra groups in the matches; using this is a matter of choice
| // OR
? // Makes the previous element optional: the character s in cats?, or the whole group in (?:1|2|3)?
Note that you need to escape the \ in \b while initializing the regex string.
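As a small sketch of that escaping in Java, the patterns above can be tried as follows. The helper name `contains` is invented for this example, and `\\bgive cats?\\b` is added here to illustrate the "give cat[s]" case from the question.

```java
import java.util.regex.Pattern;

class WordPatterns {
    // True when the regex matches somewhere inside the text.
    static boolean contains(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // Each \b is written as \\b inside a Java string literal.
        System.out.println(contains("\\bcats?\\b", "two cats"));              // true
        System.out.println(contains("\\bcats?\\b", "the caterer"));           // false
        System.out.println(contains("\\bleav(?:e|ing)\\b", "leaving now"));   // true
        System.out.println(contains("\\bpart(?:1|2|3)?\\b", "my partner"));   // false
        System.out.println(contains("\\bgive cats?\\b", "give cat food"));    // true
    }
}
```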
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
(?i)\\b([a-z]+)\\b(?:\\s+\\1\\b)+
I understand what each symbol means, but I can't figure out what they mean when combined. The confusing part is (?:\s+\1\b)+. What does it mean? Can you explain it to me? Thanks for your time!
The individual parts of (?:\s+\1\b)+ have the following meaning:
(?:...) - a non-capturing group. It contains:
\s+ - a non-empty sequence of whitespace characters.
\1 - a backreference to capturing group #1, ([a-z]+). It must match the same characters that group has just captured (the repeated word), ignoring case because of (?i).
\b - a word boundary, in this case the transition from a word character to whitespace or to the end of the input.
The + after the whole group means it must match one or more times, consuming as many consecutive repetitions of the word as possible.
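One common use of such a pattern, shown here only as a sketch with made-up names, is collapsing consecutive repeated words into one by replacing each match with group 1:

```java
class RepeatedWords {
    // A word, followed by one or more whitespace-separated repetitions of itself.
    static final String REPEATED = "(?i)\\b([a-z]+)\\b(?:\\s+\\1\\b)+";

    // Replaces every run of a repeated word with its first occurrence.
    static String collapse(String text) {
        return text.replaceAll(REPEATED, "$1");
    }

    public static void main(String[] args) {
        // Prints: This is a test sentence
        System.out.println(collapse("This is is a test test test sentence"));
    }
}
```

The "is" inside "This" is not treated as a repetition, because the leading \b requires the captured word to start at a word boundary.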