This question already has answers here:
Java RegEx meta character (.) and ordinary dot?
(9 answers)
Closed 7 years ago.
What is the regular expression for . and .. ?
if (key.matches(".")) {
    // do something
}
The matches method accepts a String, which it treats as a regular expression. Now I need to remove all the dots inside my map.
. matches any character so needs escaping i.e. \., or \\. within a Java string (because \ itself has special meaning within Java strings.)
You can then use \.\. or \.{2} to match exactly 2 dots.
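To make the escaping concrete, here is a minimal sketch. The class and method names (DotMatchDemo, isSingleDot, isDoubleDot) are made up for illustration; only String.matches and Pattern.quote come from the JDK.

```java
import java.util.regex.Pattern;

class DotMatchDemo {
    // True only when the whole key is exactly one literal dot.
    static boolean isSingleDot(String key) {
        return key.matches("\\.");
    }

    // True only when the whole key is exactly two literal dots.
    static boolean isDoubleDot(String key) {
        return key.matches("\\.{2}");
    }

    public static void main(String[] args) {
        // An unescaped dot matches any single character, so this is true.
        System.out.println("a".matches("."));
        System.out.println(isSingleDot("a"));   // false
        System.out.println(isSingleDot("."));   // true
        System.out.println(isDoubleDot(".."));  // true
        // Pattern.quote builds a literal pattern without manual escaping.
        System.out.println("..".matches(Pattern.quote("..")));  // true
    }
}
```

Keep in mind that matches() tests the whole string, so it answers "is the key exactly a dot", not "does the key contain a dot".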
...
[.]{1} or [.]{2}?
[+*?.] Most special characters have no meaning inside the square brackets. This expression matches any of +, *, ? or the dot.
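A short sketch of the character-class form, assuming you only want to see how the class behaves. CharClassDemo and stripSpecials are hypothetical names, not part of any library.

```java
class CharClassDemo {
    // Removes every +, *, ? and literal dot; none of them needs escaping inside [].
    static String stripSpecials(String s) {
        return s.replaceAll("[+*?.]", "");
    }

    public static void main(String[] args) {
        System.out.println(".".matches("[.]"));       // true
        System.out.println("x".matches("[.]"));       // false
        System.out.println("..".matches("[.]{2}"));   // true
        System.out.println(stripSpecials("a+b*c?d.e")); // abcde
    }
}
```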
Use String.replace() if you just want to remove the dots from a string. An alternative is to use Pattern and Matcher with a StringBuilder; this gives you more flexibility, as you can find the groups between dots. If you use the latter, I would recommend matching runs of dots with "\\.+" so that empty entries are ignored.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Counts the non-overlapping matches of regex in str.
public static int count(String str, String regex) {
    int i = 0;
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(str);
    while (m.find()) {
        i++;
    }
    return i;
}

public static void main(String[] args) {
    String str = "-.-..-...-.-.--..-k....k...k..k.k-.-";
    // removes every single dot
    System.out.println(str.replaceAll("\\.", ""));
    // removes each pair of consecutive dots ("..")
    System.out.println(str.replaceAll("\\.{2}", ""));
    // removes each run of one or more dots, treating a whole run as a single match
    System.out.println(str.replaceAll("\\.+", ""));
    // the match counts make the difference more obvious
    System.out.println(count(str, "\\."));
    System.out.println(count(str, "\\.{2}"));
    System.out.println(count(str, "\\.+"));
}
The output will be:
--------kkkkk--
-.--.-.-.---kk.kk.k-.-
--------kkkkk--
21
7
11
You should use contains, not matches:
if(nom.contains("."))
System.out.println("OK");
else
System.out.println("Bad");
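Since the question mentions removing dots from a map, here is one way that could look without any regex. This is only a sketch: the question does not show the map, so the Map<String, String> type, the idea that the dots are in the keys, and the names MapKeyDots and withoutDots are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapKeyDots {
    // Returns a copy of the map whose keys have every literal dot removed.
    // If two keys become equal after cleaning, the later entry overwrites the earlier one.
    static Map<String, String> withoutDots(Map<String, String> source) {
        Map<String, String> cleaned = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            String key = e.getKey();
            if (key.contains(".")) {          // plain substring test, no regex
                key = key.replace(".", "");   // String.replace treats "." literally
            }
            cleaned.put(key, e.getValue());
        }
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("a.b", "1");
        m.put("c", "2");
        System.out.println(withoutDots(m)); // {ab=1, c=2}
    }
}
```

Both contains and replace(CharSequence, CharSequence) work on literal text, so nothing needs escaping here.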
This question already has answers here:
Regex: match only outside parenthesis (so that the text isn't split within parenthesis)?
(2 answers)
Closed 3 years ago.
I (Regex noob) am trying to perform replace operation on a string containing some pattern. For example
AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA
In the above I am trying to replace all As with Is but ignore the As inside curly braces.
For this what I could do is to split the entire string on the pattern and perform replace then concatenate the strings.
I was wondering if there is a shorter way in regex so that I could perform something like
String str = "AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA";
str = str.replaceButIgnorePattern("A", "I","\\{(.*?)\\}");
System.out.print(str); //III-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-III
The pattern to ignore
can contain any characters
can be at the start, in the middle, or at the end of the string
Considering there are no nested braces, a solution is to match a substring inside the closest { and } and match and capture the pattern to replace, and then check if the Group 1 is not null and then act accordingly.
In Java 9+, you may use
String text = "AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA";
Pattern r = Pattern.compile("\\{[^{}]*}|(A)");
Matcher m = r.matcher(text);
String result = m.replaceAll(x -> x.group(1) != null ? "I" : x.group() );
System.out.println( result );
Here, \{[^{}]*} matches {, any 0+ chars other than { and }, and then }, or (|) captures A into Group 1.
Equivalent code for older Java versions:
String text = "AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA";
Pattern r = Pattern.compile("\\{[^{}]*}|(A)");
Matcher m = r.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
if (m.group(1) == null) {
m.appendReplacement(sb, m.group(0));
} else {
m.appendReplacement(sb, "I");
}
}
m.appendTail(sb);
System.out.println(sb);
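One thing to watch in the loop above: appendReplacement treats $ and \ in the replacement string specially, so putting m.group(0) back as-is can go wrong when the text inside the braces contains those characters. A hedged variant that wraps the re-inserted text in Matcher.quoteReplacement is sketched below; the class and method names (BraceAwareReplace, replaceOutsideBraces) and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BraceAwareReplace {
    private static final Pattern P = Pattern.compile("\\{[^{}]*}|(A)");

    // Replaces every A outside {...} with I and copies {...} blocks through unchanged.
    static String replaceOutsideBraces(String text) {
        Matcher m = P.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            if (m.group(1) == null) {
                // A braced block matched: re-insert it literally, even if it contains $ or \.
                m.appendReplacement(sb, Matcher.quoteReplacement(m.group(0)));
            } else {
                m.appendReplacement(sb, "I");
            }
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceOutsideBraces("AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA"));
        // III-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-III
        System.out.println(replaceOutsideBraces("A-{$1-A}-A"));
        // I-{$1-A}-I
    }
}
```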
You may also use a common workaround for any Java version:
str = str.replaceAll("A(?![^{}]*})", "I");
where (?![^{}]*}) makes sure the current position is not followed by zero or more characters other than { and } and then a }. In other words, an A is skipped when the next brace to its right is a closing one. NOTE: this approach assumes that the braces in the string are balanced.
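A small sketch of the lookahead workaround, including what happens when the braces are not balanced. LookaheadReplace and replaceOutsideBraces are hypothetical names, and the second input is an invented example.

```java
class LookaheadReplace {
    // Replaces A with I unless the next brace to the right is a closing one.
    static String replaceOutsideBraces(String s) {
        return s.replaceAll("A(?![^{}]*})", "I");
    }

    public static void main(String[] args) {
        System.out.println(replaceOutsideBraces("AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA"));
        // III-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-III

        // Unbalanced input: the stray } makes the first A look like it is inside braces.
        System.out.println(replaceOutsideBraces("A}-A"));
        // A}-I
    }
}
```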
How can I get the strings inside brackets? See the code below.
String str = "C1<C2, C3<T1>>.C4<T2>.C5";
I need to get C1<C2, C3<T1>>, C4<T2>, and C5.
Here is what I tried:
Pattern pat = Pattern.compile("(\\w+(<[^>]+>)?)(.\\w+(<[^>]+>)?)*");
Matcher mat = pat.matcher(str);
but the result was
C1<C2, C3<T1>
There are 2 problems that I see with your code:
It seems like you are only printing the first match instead of looping through the results. Use while(mat.find()) to iterate through the list of matches.
Simplify your pattern to \\w+(<[^>]+>+)? to get C1<C2, C3<T1>>, C4<T2>, and C5.
RegEx pattern explained:
\w+ = 1 or more alphanumeric or underscore characters
()? = 0 or 1 of what is in the parentheses
< = match the < character
[^>]+ = 1 or more characters other than >
>+ = 1 or more > characters (An alternative would be >{1,2} if you want to enforce only either one or two > characters.)
Your resulting code should look like the following:
public static void main(String[] args)
{
String str = "C1<C2, C3<T1>>.C4<T2>.C5";
Pattern pat = Pattern.compile("\\w+(<[^>]+>+)?");
Matcher mat = pat.matcher(str);
while(mat.find()) {
System.out.println(mat.group());
}
}
If you just want a list of the parts, though, a much simpler way is to use split() instead of Pattern and Matcher. You can split the string on a literal dot (split() takes a regex, so the dot must be escaped), save the pieces in an array, and then iterate through the array as needed.
That would be accomplished with the following:
String[] parts = str.split("\\.");
Just split on dots:
String[] parts = str.split("\\.");
This does what you want using the sample input in the question.
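Splitting on dots works for the sample because no dot appears inside the angle brackets. If that could happen (say, a qualified name used as a type argument), neither split nor the simple pattern would cope, and a small hand-written scanner that tracks bracket depth may be easier. This is a different technique from the regex answers above, and GenericPathSplitter and splitTopLevel are names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class GenericPathSplitter {
    // Splits on dots that are not nested inside <...>, at any nesting depth.
    static List<String> splitTopLevel(String s) {
        List<String> parts = new ArrayList<>();
        int depth = 0;  // number of currently open '<'
        int start = 0;  // start index of the current part
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '<') {
                depth++;
            } else if (c == '>') {
                depth--;
            } else if (c == '.' && depth == 0) {
                parts.add(s.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(s.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("C1<C2, C3<T1>>.C4<T2>.C5"));
        // [C1<C2, C3<T1>>, C4<T2>, C5]
        System.out.println(splitTopLevel("A<b.C>.D"));
        // [A<b.C>, D]
    }
}
```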
This question already has answers here:
Regex to replace repeated characters
(2 answers)
Closed 6 years ago.
I am trying to replace all the repeated characters from a String in Java, and let only one.
For example:
aaaaa ---> a
For that, I have tried using the replaceAll method:
"aaaaa".replaceAll("a*","a") //returns "aa"
I have developed a recursive method, which is probably not very efficient:
public String recursiveReplaceAll(String original,String regex, String replacement) {
if (original.equals(original.replaceAll(regex, replacement))) return original;
return recursiveReplaceAll(original.replaceAll(regex, replacement),regex,replacement);
}
This method works, I was just wondering if there was anything using RegEx for example, which does the work with better performance.
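Your replaceAll approach was nearly right - it's just that * also matches zero occurrences, so the empty string at the end of the input matches as well and gets replaced with a second a. You want + to mean "one or more".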
"aaaaa".replaceAll("a+","a") // Returns "a"
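If you want to see where the second a comes from, you can list the spans that each pattern matches. This is just a diagnostic sketch; StarVsPlus and spans are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StarVsPlus {
    // Returns each match as a half-open index range "[start,end)".
    static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add("[" + m.start() + "," + m.end() + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // a* matches the whole run and then an empty string at the end of the input.
        System.out.println(spans("a*", "aaaaa")); // [[0,5), [5,5)]
        // a+ needs at least one character, so there is only one match.
        System.out.println(spans("a+", "aaaaa")); // [[0,5)]
    }
}
```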
You can do it without recursion. The regular expression "(.)\\1+" matches any character followed by one or more repetitions of itself, and replaceAll replaces each such run with the captured character ($1). Thus, this collapses every run of repeated characters into one.
public static void main(String[] args) {
String str = "aaaabbbaaa";
String result = str.replaceAll("(.)\\1+", "$1");
System.out.println(result); // prints "aba".
}
With this, it works for all characters.
This question already has an answer here:
Split regex to extract Strings of contiguous characters
(1 answer)
Closed 7 years ago.
I'm new to using regular expressions, but I think that in an instance like this using them would be the quickest and most elegant way. I have a binary string, and I need to split it into groups that only contain consecutive zeros or ones, for example:
110001
would be split into
11
000
1
I just can't figure it out, this is my current code, thanks:
class Solution {
public static void main(String args[]) {
String binary = Integer.toBinaryString(67);
String[] exploded = binary.split("0+| 1+");
for(String string : exploded) {
System.out.println(string);
}
}
}
Try
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Solution {
    public static void main(String[] args) {
        String binary = Integer.toBinaryString(67);
        System.out.println(binary);
        // each match is one run of consecutive zeros or ones
        Pattern pattern = Pattern.compile("0+|1+");
        Matcher matcher = pattern.matcher(binary);
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}
Rather than split, you can match with this regex and use captured group #1 for your matches:
(([01])\2*)
If you want to use split, use (?<=0)(?=1)|(?<=1)(?=0).
Not sure what you want to do with all zeros or all ones, though.
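Both suggestions can be tried side by side. In this sketch the class and method names (BinaryRuns, byMatching, bySplitting) are invented; the two regexes are the ones given above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BinaryRuns {
    // Collects group 1 of (([01])\2*): a digit followed by repeats of the same digit.
    static List<String> byMatching(String binary) {
        List<String> runs = new ArrayList<>();
        Matcher m = Pattern.compile("(([01])\\2*)").matcher(binary);
        while (m.find()) {
            runs.add(m.group(1));
        }
        return runs;
    }

    // Splits at the zero-width positions between a 0 and a 1, or a 1 and a 0.
    static List<String> bySplitting(String binary) {
        return Arrays.asList(binary.split("(?<=0)(?=1)|(?<=1)(?=0)"));
    }

    public static void main(String[] args) {
        System.out.println(byMatching("110001"));  // [11, 000, 1]
        System.out.println(bySplitting("110001")); // [11, 000, 1]
        System.out.println(bySplitting("000"));    // [000]
    }
}
```

For a string of all zeros or all ones there is no split position, so split simply returns the whole string as a single element.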
The method split requires a pattern to describe the separator. To achieve your goal you have to describe a location (you don’t want to consume characters at the split position) between the groups:
public static void main(String args[]) {
String binary = Integer.toBinaryString(67);
String[] exploded = binary.split("(?<=([01]))(?!\\1)");
for(String string : exploded) {
System.out.println(string);
}
}
(?<=([01])) describes via “look-behind” that before the splitting position, there must be either 1 or 0, and captures the character in a group. (?!\\1) specifies via “negative look-ahead” that the character after the split position must be different than the character found before the position.
Which is exactly what is needed to split into groups of the same character. You could replace [01] with . here to make it a general solution for splitting into groups having the same character, regardless of which one.
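The generalized version with . in place of [01] might look like the sketch below. RunSplitter and splitRuns are made-up names, and the input is an arbitrary example. Note that . does not match line terminators by default, so this is only meant for single-line input.

```java
import java.util.Arrays;
import java.util.List;

class RunSplitter {
    // Splits wherever the character before the position differs from the one after it.
    static List<String> splitRuns(String s) {
        return Arrays.asList(s.split("(?<=(.))(?!\\1)"));
    }

    public static void main(String[] args) {
        System.out.println(splitRuns("aaabccdd")); // [aaa, b, cc, dd]
        System.out.println(splitRuns("1000011"));  // [1, 0000, 11]
    }
}
```

The pattern also matches at the very end of the input, but split drops trailing empty strings, so no empty element shows up in the result.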
The reason it's not working is because of the nature of the split method. The found pattern will not be included in the array. You would need to use a regex search instead.
This question already has answers here:
What do 'lazy' and 'greedy' mean in the context of regular expressions?
(13 answers)
Closed 8 years ago.
I am experiencing some problems with Java regular expressions.
I have a program that reads through an HTML file and replaces any string inside the #VR# characters, i.e. #VR#Test1 2 3 4#VR#
However my issue is that, if the line contains more than two strings surrounded by #VR#, it does not match them. It would match the leftmost #VR# with the rightmost #VR# in the sentence and thus take whatever is in between.
For example:
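<a href="#VR#URL-GOES-HERE#VR#" target="_blank" style="color:#f4f3f1; text-decoration:none;" title="ContactUs">#VR#Google#VR#</a>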
My code would match
URL-GOES-HERE#VR#" target="_blank" style="color:#f4f3f1; text-decoration:none;" title="ContactUs">#VR#Google
Here is my Java code. Would appreciate if you could help me to solve this:
Pattern p = Pattern.compile("#VR#.*#VR#");
Matcher m;
Scanner scanner = new Scanner(htmlContent);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
m = p.matcher(line);
StringBuffer sb = new StringBuffer();
while (m.find()) {
String match_found = m.group().replaceAll("#VR#", "");
System.out.println("group: " + match_found);
}
}
I tried replacing m.group() with m.group(0) and m.group(1) but nothing. Also m.groupCount() always returns zero, even if there are two matches as in my example above.
Thanks, your help will be very much appreciated.
Your problem is that .* is "greedy"; it will try to match as long a substring as possible while still letting the overall expression match. So, for example, in #VR# 1 #VR# 2 #VR# 3 #VR#, it will match 1 #VR# 2 #VR# 3.
The simplest fix is to make it "non-greedy" (matching as little as possible while still letting the expression match), by changing the * to *?:
Pattern p = Pattern.compile("#VR#.*?#VR#");
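To compare the two quantifiers directly, here is a small sketch that lists what each pattern matches on the example string from above. GreedyVsLazy and allMatches are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Returns every match of regex in input, in order.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "#VR# 1 #VR# 2 #VR# 3 #VR#";
        // Greedy: one match spanning from the first #VR# to the last.
        System.out.println(allMatches("#VR#.*#VR#", line));
        // [#VR# 1 #VR# 2 #VR# 3 #VR#]

        // Lazy: each match stops at the nearest closing #VR#.
        System.out.println(allMatches("#VR#.*?#VR#", line));
        // [#VR# 1 #VR#, #VR# 3 #VR#]
    }
}
```

Note that the lazy version pairs the markers up in order, so the " 2 " between the second and third #VR# is not reported as a match.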
Also m.groupCount() always returns zero, even if there are two matches as in my example above.
That's because m.groupCount() returns the number of capture groups (parenthesized subexpressions, whose corresponding matched substrings are retrieved using m.group(1), m.group(2), and so on) in the underlying pattern. In your case, your pattern has no capture groups, so m.groupCount() returns 0.
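If the goal is to get the text between the markers, adding one capture group avoids the extra replaceAll call on m.group(). A sketch under that assumption follows; CaptureGroupDemo, groupCount, extract and the sample line are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {
    // Number of capture groups in a pattern; it does not depend on the input.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Returns the text between each pair of #VR# markers via capture group 1.
    static List<String> extract(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("#VR#(.*?)#VR#").matcher(line);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groupCount("#VR#.*?#VR#"));   // 0
        System.out.println(groupCount("#VR#(.*?)#VR#")); // 1
        System.out.println(extract("<a href=\"#VR#URL#VR#\">#VR#Google#VR#</a>"));
        // [URL, Google]
    }
}
```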
You can try the regular expression:
#VR#(((?!#VR#).)+)#VR#
Demo:
private static final Pattern REGEX_PATTERN =
Pattern.compile("#VR#(((?!#VR#).)+)#VR#");
public static void main(String[] args) {
String input = "#VR#Google#VR# ";
System.out.println(
REGEX_PATTERN.matcher(input).replaceAll("$1")
); // prints "Google "
}