This question already has answers here:
Regex: match only outside parenthesis (so that the text isn't split within parenthesis)?
(2 answers)
Closed 3 years ago.
I (Regex noob) am trying to perform replace operation on a string containing some pattern. For example
AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA
In the above I am trying to replace all As with Is but ignore the As inside curly braces.
For this what I could do is to split the entire string on the pattern and perform replace then concatenate the strings.
I was wondering if there is a shorter way in regex so that I could perform something like
String str = "AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA";
str = str.replaceButIgnorePattern("A", "I","\\{(.*?)\\}");
System.out.print(str); //III-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-III
And the pattern can be like
contains any character
can be at starting, in between or at the end of the string
Considering there are no nested braces, a solution is to match a substring inside the closest { and } and match and capture the pattern to replace, and then check if the Group 1 is not null and then act accordingly.
In Java 9+, you may use
String text = "AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA";
Pattern r = Pattern.compile("\\{[^{}]*}|(A)");
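Matcher m = r.matcher(text);
// quoteReplacement keeps any $ or \ in the skipped text from being read as a group reference
String result = m.replaceAll(x -> x.group(1) != null ? "I" : Matcher.quoteReplacement(x.group()) );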
System.out.println( result );
See the online demo.
Here, \{[^{}]*} matches {, any 0+ chars other than { and }, and then }, or (|) captures A into Group 1.
Equivalent code for older Java versions:
String text = "AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA";
Pattern r = Pattern.compile("\\{[^{}]*}|(A)");
Matcher m = r.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
if (m.group(1) == null) {
m.appendReplacement(sb, Matcher.quoteReplacement(m.group(0)));
} else {
m.appendReplacement(sb, "I");
}
}
m.appendTail(sb);
System.out.println(sb);
See the online Java demo.
You may also use a common workaround for any Java version:
str = str.replaceAll("A(?![^{}]*})", "I");
where (?![^{}]*}) makes sure that the current location is not followed by 0+ chars other than { and } and then a }. NOTE this approach implies that the string contains a balanced amount of open/close braces.
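To make the balanced-braces caveat concrete, here is a minimal sketch of the lookahead workaround. The class and method names are made up for illustration; the regex is the one from the answer. The second call shows what happens with a stray closing brace.

```java
public class LookaheadBraceDemo {
    // Replaces every "A" unless it is followed by a "}" with no brace in between.
    static String replaceOutsideBraces(String s) {
        return s.replaceAll("A(?![^{}]*})", "I");
    }

    public static void main(String[] args) {
        // Balanced braces: only the As outside {...} are replaced.
        System.out.println(replaceOutsideBraces("AAA-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-AAA"));
        // prints III-BBB-CCC-{AAA-BBB-AAA-BBB}-CCC-BBB-III

        // A stray "}" makes the text before it look as if it were inside braces,
        // so those As are left alone even though no "{" was ever opened.
        System.out.println(replaceOutsideBraces("AAA}-AAA"));
        // prints AAA}-III
    }
}
```

If the input cannot be trusted to have balanced braces, the alternation-based versions above are the safer choice because they only skip a complete {...} pair.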
This question already has answers here:
Regex: match everything but a specific pattern
(6 answers)
Closed 3 years ago.
I am exploring java regex groups and I am trying to replace a string with some characters.
I have a string str = "abXYabcXYZ"; and I am trying to replace all characters except for the pattern group abc in string.
I tried to use str.replaceAll("(^abc)",""), but it did not work. I understand that (abc) will match a group.
You might find it easier to find the parts you want to keep and just build a new string. This approach has flaws when matches can overlap, but it will likely be good enough for your use case. However, if your pattern really is as simple as "abc" then you may want to instead consider just counting the total number of matches.
String str = "abXYabcXYZ";
Pattern patternToKeep = Pattern.compile("abc");
Matcher matcher = patternToKeep.matcher(str);
StringBuilder sb = new StringBuilder();
// find() has to succeed before a match can be read; group() is the whole match
while (matcher.find()) {
sb.append(matcher.group());
}
System.out.println(sb.toString());
It is easier to keep the matching parts of the pattern and concatenate them. In the following example the matcher iterates over str with find(), locating the next occurrence of the pattern on each call. In the loop your "abc" match will always be available as group(0).
String str = "abXYabcXYZabcxss";
Pattern pattern = Pattern.compile("abc");
StringBuilder sb = new StringBuilder();
Matcher matcher = pattern.matcher(str);
while(matcher.find()){
sb.append(matcher.group(0));
}
System.out.println(sb.toString());
For only replacing, the nearest you can get would be:
((?!abc).)*
But with the problem that only the a's of abc would not be replaced.
Regex101 example
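For comparison, a replace-only variant that does keep the whole abc is to match either the pattern to keep (captured) or any single character, and replace with the capture. This is a different technique from the tempered pattern above, shown here only as a sketch; the class and method names are hypothetical. In Java a group that did not take part in the match is substituted as an empty string, which is what drops the other characters.

```java
public class KeepOnlyPattern {
    // "(abc)" is tried first; any other single character falls through to "."
    // and is replaced by an empty $1. Note that "." skips line terminators
    // unless the pattern is prefixed with (?s).
    static String keepOnly(String s) {
        return s.replaceAll("(abc)|.", "$1");
    }

    // The tempered pattern from the answer above, for comparison.
    static String temperedReplace(String s) {
        return s.replaceAll("((?!abc).)*", "");
    }

    public static void main(String[] args) {
        System.out.println(keepOnly("abXYabcXYZ"));        // prints abc
        System.out.println(keepOnly("abXYabcXYZabcxss"));  // prints abcabc
        System.out.println(temperedReplace("abXYabcXYZ")); // prints a
    }
}
```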
This question already has answers here:
Java / Replace all Quotation mark
(2 answers)
Closed 3 years ago.
I have a messed up String Like:
String text= "'xhxyxhzx'xcxz" ";
and I want to replaceAll() the other strings with empty except the one starting with '
Something like this:
String cleartext = "";
if (text.contains("'"))
cleartext = text.replaceAll("[text.startingWith("'a-z" + "'0-9")]", "");
out.println(cleartext);
So the output is 'h' 'e' 'll' 'o'
Note: I just found it kinda possible to make it with the replace method but if there are other ways that this can be achieved I don't mind. MASSIVE Thank you!
According to me we can do one thing. I hope you don't mind a no code answer.
Split the string on the character ' and place the pieces into an array of strings. For example the String "h'e'll'o'." becomes h , e , ll , o , .
Disregard the pieces outside the quotes. Counting from 1, the pieces at even positions are the ones inside the ' characters (in a 0-based Java array these are the odd indexes). In the example above these are "e , o".
Output those pieces of the string array, or create a new array from them as selected in step 2.
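The steps above can be sketched as follows. This is only an illustration under the assumption that every opening ' has a matching closing one; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedParts {
    // Splits on ' and keeps every second piece, i.e. the text between quotes.
    static String quotedParts(String input) {
        String[] parts = input.split("'");
        List<String> kept = new ArrayList<>();
        // With 0-based array indexes the quoted pieces sit at 1, 3, 5, ...
        for (int i = 1; i < parts.length; i += 2) {
            kept.add("'" + parts[i] + "'");
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(quotedParts("h'e'll'o'.")); // prints 'e' 'o'
    }
}
```

An unclosed quote would make the trailing text look quoted, so input with unbalanced quotes needs an extra check that is not shown here.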
I think you're looking for something like this.
Pattern pattern = Pattern.compile("'[a-z0-9]+'");
private String function(final String input) {
final Matcher matcher = pattern.matcher(input);
final StringBuilder sb = new StringBuilder();
while (matcher.find()) {
if (sb.length() > 0) {
sb.append(" ");
}
sb.append(matcher.group());
}
return sb.toString();
}
However, I'm not really sure about the rules you want to apply in order to get the expected result. Ex: "'''" => "'" ?
I'm trying to replace a url string to lowercase but wanted to keep the certain pattern string as it is.
eg: for input like:
http://BLABLABLA?qUERY=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}
The expected output would be lowercased url but the multiple macros are original:
http://blablabla?query=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}
I was trying to capture the strings using regex but didn't figure out a proper way to do the replacement. Also it seemed using replaceAll() doesn't do the job. Any hint please?
It looks like you want to change any uppercase character which is not inside ${...} to its lowercase form.
With construct
Matcher matcher = ...
StringBuffer buffer = new StringBuffer();
while (matcher.find()){
String matchedPart = ...
...
matcher.appendReplacement(buffer, replacement);
}
matcher.appendTail(buffer);
String result = buffer.toString();
or since Java 9 we can use Matcher#replaceAll(Function<MatchResult,String> replacer) and rewrite it like
String replaced = matcher.replaceAll(m -> {
String matchedPart = m.group();
...
return replacement;
});
you can dynamically build replacement based on matchedPart.
So you can let your regex first try to match ${...} and, where that alternative fails because the cursor is not positioned before a ${...}, let it match [A-Z]. While iterating over matches you can decide based on the match (for example its length, or whether it starts with $) whether to use its lowercase form or its original form as the replacement.
BTW the regex engine lets us use $x (where x is a group number) or ${name} (where name is a named group) in the replacement so that we can reuse those parts of the match. So if we want to place ${..} as a literal in the replacement we need to escape the $ as \$. To avoid doing that manually we can use Matcher.quoteReplacement.
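A small sketch of why the quoting matters. The class name, helper name, and sample strings are made up for the illustration; Matcher.quoteReplacement and String.replaceAll are the standard library calls.

```java
import java.util.regex.Matcher;

public class QuoteReplacementDemo {
    // Inserts the given text literally in place of every "X".
    static String insertLiteral(String input, String literal) {
        return input.replaceAll("X", Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        // Quoted: the $ is treated as an ordinary character.
        System.out.println(insertLiteral("a=X", "${MACRO}")); // prints a=${MACRO}

        // Unquoted: ${MACRO} is read as a reference to a named group that
        // the pattern does not define, so the call is rejected.
        try {
            "a=X".replaceAll("X", "${MACRO}");
        } catch (IllegalArgumentException e) {
            System.out.println("unquoted replacement rejected: " + e.getMessage());
        }
    }
}
```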
Demo:
String yourUrlString = "http://BLABLABLA?qUERY=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}";
Pattern p = Pattern.compile("\\$\\{[^}]+\\}|[A-Z]");
Matcher m = p.matcher(yourUrlString);
StringBuffer sb = new StringBuffer();
while(m.find()){
String match = m.group();
if (match.length() == 1){
m.appendReplacement(sb, match.toLowerCase());
} else {
m.appendReplacement(sb, Matcher.quoteReplacement(match));
}
}
m.appendTail(sb);
String replaced = sb.toString();
System.out.println(replaced);
or in Java 9
String replaced = Pattern.compile("\\$\\{[^}]+\\}|[A-Z]")
.matcher(yourUrlString)
.replaceAll(m -> {
String match = m.group();
if (match.length() == 1)
return match.toLowerCase();
else
return Matcher.quoteReplacement(match);
});
System.out.println(replaced);
Output: http://blablabla?query=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}
This regex will match all the characters before the first &macro, and put everything between http:// and the first &macro in its own group so you can modify it.
http://(.*?)&macro
Tested here
UPDATE: If you don't want to use groups, this regex will match only the characters between http:// and the first &macro
(?<=http://)(.*?)(?=&macro)
Tested here
I have a string like this:
something:POST:/some/path
Now I want to take the POST alone from the string. I did this by using this regex
:([a-zA-Z]+):
But this gives me a value along with colons. ie I get this:
:POST:
but I need this
POST
My code to match the same and replace it is as follows:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
System.out.println(matcher.group());
ss = ss.replaceFirst(":([a-zA-Z]+):", "*");
}
System.out.println(ss);
EDIT:
I've decided to use the lookahead/lookbehind regex since I did not want to use replace with colons such as :*:. This is my final solution.
String s = "something:POST:/some/path/";
String regex = "(?<=:)[a-zA-Z]+(?=:)";
Matcher matcher = Pattern.compile(regex).matcher(s);
if (matcher.find()) {
s = s.replaceFirst(matcher.group(), "*");
System.out.println("replaced: " + s);
}
else {
System.out.println("not replaced: " + s);
}
There are two approaches:
Keep your Java code, and use lookahead/lookbehind (?<=:)[a-zA-Z]+(?=:), or
Change your Java code to replace the result with ":*:"
Note: You may want to define a String constant for your regex, since you use it in different calls.
As pointed out, the regex captured group can be used for the replacement.
The following code did it:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
ss = ss.replaceFirst(matcher.group(1), "*");
}
System.out.println(ss);
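One caveat with passing matcher.group(1) to replaceFirst: the captured text is used as a new regex and its first occurrence anywhere in the string is replaced, which is not necessarily the one between the colons. A hedged alternative is to replace by position using the group's start and end offsets. The class and method names below are hypothetical, and the second input string is invented to show the difference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceGroupSpan {
    private static final Pattern METHOD = Pattern.compile(":([a-zA-Z]+):");

    // Replaces only the characters captured by group 1 of the first match.
    static String maskFirstMethod(String s) {
        Matcher m = METHOD.matcher(s);
        if (!m.find()) {
            return s; // nothing between colons, leave the input untouched
        }
        return new StringBuilder(s).replace(m.start(1), m.end(1), "*").toString();
    }

    public static void main(String[] args) {
        System.out.println(maskFirstMethod("something:POST:/some/path/"));
        // prints something:*:/some/path/

        String tricky = "POSTAL:POST:/some/path/";
        System.out.println(tricky.replaceFirst("POST", "*")); // prints *AL:POST:/some/path/
        System.out.println(maskFirstMethod(tricky));          // prints POSTAL:*:/some/path/
    }
}
```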
UPDATE
Looking at your update, you just need replaceFirst only:
String result = s.replaceFirst(":[a-zA-Z]+:", ":*:");
See the Java demo
When you use (?<=:)[a-zA-Z]+(?=:), the regex engine checks each location inside the string for a : before it, and once found, tries to match 1+ ASCII letters and then asserts that there is a : after them. With :[A-Za-z]+:, the checking only starts after the regex engine has found a : character. Then, after matching :POST:, the replacement pattern replaces the whole match. It is totally OK to hardcode colons in the replacement pattern since they are hardcoded in the regex pattern.
Original answer
You just need to access Group 1:
if (matcher.find()) {
System.out.println(matcher.group(1));
}
See Java demo
Your :([a-zA-Z]+): regex contains a capturing group (see (....) subpattern). These groups are numbered automatically: the first one has an index of 1, the second has the index of 2, etc.
To replace it, use Matcher#appendReplacement():
String s = "something:POST:/some/path/";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile(":([a-zA-Z]+):").matcher(s);
while (m.find()) {
m.appendReplacement(result, ":*:");
}
m.appendTail(result);
System.out.println(result.toString());
See another demo
This is your solution:
regex = (:)([a-zA-Z]+)(:)
And code is:
String ss = "something:POST:/some/path/";
ss = ss.replaceFirst("(:)([a-zA-Z]+)(:)", "$1*$3");
ss now contains:
something:*:/some/path/
Which I believe is what you are looking for...
This question already has answers here:
Java RegEx meta character (.) and ordinary dot?
(9 answers)
Closed 7 years ago.
What is the regular expression for . and .. ?
if(key.matches(".")) {
do something
}
The matches method accepts a String that is treated as a regular expression. Now I need to remove all DOT's inside my MAP.
. matches any character so needs escaping i.e. \., or \\. within a Java string (because \ itself has special meaning within Java strings.)
You can then use \.\. or \.{2} to match exactly 2 dots.
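A quick sketch of the difference, using String.matches, which requires the whole string to match. The class and helper names are chosen only for this example.

```java
public class DotMatching {
    // True only for the one-character string ".".
    static boolean isSingleDot(String s) {
        return s.matches("\\.");
    }

    // True only for the two-character string "..".
    static boolean isDoubleDot(String s) {
        return s.matches("\\.{2}");
    }

    public static void main(String[] args) {
        System.out.println("a".matches("."));   // true: unescaped dot matches any character
        System.out.println(isSingleDot("a"));   // false
        System.out.println(isSingleDot("."));   // true
        System.out.println(isDoubleDot(".."));  // true
        System.out.println(isDoubleDot("."));   // false
    }
}
```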
...
[.]{1}
or
[.]{2}
?
[+*?.] Most special characters have no meaning inside the square brackets. This expression matches any of +, *, ? or the dot.
Use String.replace() if you just want to remove the dots from the string. An alternative would be to use Pattern and Matcher with a StringBuilder, which gives you more flexibility as you can find the groups that are between dots. If using the latter, I would recommend that you skip empty entries by matching runs of dots with "\\.+".
public static int count(String str, String regex) {
int i = 0;
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
while (m.find()) {
m.group();
i++;
}
return i;
}
public static void main(String[] args) {
int i = 0, j = 0, k = 0;
String str = "-.-..-...-.-.--..-k....k...k..k.k-.-";
// this will just remove dots
System.out.println(str.replaceAll("\\.", ""));
// this will just remove sequences of ".." dots
System.out.println(str.replaceAll("\\.{2}", ""));
// this will just remove sequences of dots, and gets
// multiple of dots as 1
System.out.println(str.replaceAll("\\.+", ""));
/* for this to be more obvious, consider following */
System.out.println(count(str, "\\."));
System.out.println(count(str, "\\.{2}"));
System.out.println(count(str, "\\.+"));
}
The output will be:
--------kkkkk--
-.--.-.-.---kk.kk.k-.-
--------kkkkk--
21
7
11
You should use contains not matches
if(nom.contains("."))
System.out.println("OK");
else
System.out.println("Bad");