I have an issue in Java when trying to remove the characters from the end of a string. This has now become a generic pattern match issue that I cannot resolve.
PROBLEM = remove all pluses, minuses and spaces (not bothered about whitespace) from the end of a string.
Pattern myRegex;
Matcher myMatch;
String myPattern = "";
String myString = "";
String myResult = "";
myString="surname; forename--+ + --++ ";
myPattern="^(.*)[-+ ]*$";
//expected result = "surname; forename"
myRegex = Pattern.compile(myPattern);
myMatch = myRegex.matcher(myString);
if (myMatch.find( )) {
myResult = myMatch.group(1);
} else {
myResult = myString;
}
The only way I can get this to work is by reversing the string and reversing the pattern match, then I reverse the result to get the right answer!
In the following pattern:
^(.*)[-+ ]*$
... the .* is a greedy match. This means that it will match as many characters as possible while still allowing the entire pattern to match.
You need to change it to non-greedy by adding ?.
^(.*?)[-+ ]*$
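To see the difference, here is a minimal sketch that runs both patterns against the string from the question. The class and method names (TrailingTrimDemo, firstGroup) are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingTrimDemo {
    // Returns group 1 of the first match, or the input unchanged if nothing matches
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : input;
    }

    public static void main(String[] args) {
        String s = "surname; forename--+ + --++ ";
        // Greedy: (.*) swallows the trailing characters, so [-+ ]* matches nothing
        System.out.println("[" + firstGroup("^(.*)[-+ ]*$", s) + "]");
        // Reluctant: (.*?) stops as soon as [-+ ]*$ can match the rest of the string
        System.out.println("[" + firstGroup("^(.*?)[-+ ]*$", s) + "]");
    }
}
```

If the capture group is not needed, s.replaceAll("[-+ ]+$", "") should give the same result in a single call.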
Having following string:
String value = "/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=32ee/group_key=222/end_date=2020-04-20/run_key_default=32sas1/somethingElse=else"
I need to replace the values of run_key and run_key_default with %. For the string above, the output should be:
"/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=%/group_key=222/end_date=2020-04-20/run_key_default=%/somethingElse=else"
I would like to avoid mistakenly modifying other values, so in my opinion the best solution for it is combining replaceAll method with regex
String output = value.replaceAll("\run_key=[*]\", "%").replaceAll("\run_key_default=[*]\", "%")
I'm not sure how I should construct the regex for it.
Feel free to post a better solution than the one I provided, if you know one.
You may use this regex for search:
(/run_key(?:_default)?=)[^/]*
and for replacement use:
"$1%"
Java Code:
String output = value.replaceAll("(/run_key(?:_default)?=)[^/]*", "$1%");
RegEx Details:
(: Start capture group #1
/run_key: Match literal text /run_key
(?:_default)?: Match _default optionally
=: Match a literal =
): End capture group #1
[^/]*: Match 0 or more characters that are not /
"$1%" is replacement that puts our 1st capture group back followed by a literal %
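As a quick check of the breakdown above, the following sketch wraps the same regex in a helper and also feeds it a similarly named key. The class name, the method name and the run_key_other key are hypothetical, chosen only to show that keys other than run_key and run_key_default are left alone.

```java
class RunKeyMaskDemo {
    // Keeps "/run_key=" or "/run_key_default=" (group 1) and replaces the value with %
    static String maskRunKeys(String path) {
        return path.replaceAll("(/run_key(?:_default)?=)[^/]*", "$1%");
    }

    public static void main(String[] args) {
        // run_key_other is a made-up key: the regex requires "=" right after
        // "/run_key" or "/run_key_default", so it is not touched
        System.out.println(maskRunKeys("/group_key=222/run_key_other=5/run_key=7"));
    }
}
```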
public static void main(String[] args) {
final String regex = "(run_key_default|run_key)=\\w*"; //regex
final String string = "/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=32ee/group_key=222/end_date=2020-04-20/run_key_default=32sas1/somethingElse=else";
final String subst = "$1=%"; //group1 as it is while remaining part with %
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
}
output
Substitution result:
/cds/horse/schema1.0.0/day=12321/provider=samsung/run_key=%/group_key=222/end_date=2020-04-20/run_key_default=%/somethingElse=else
I'm trying to convert a URL string to lowercase while keeping substrings that match a certain pattern as they are.
For example, for input like:
http://BLABLABLA?qUERY=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}
The expected output would be the lowercased URL with the macros left in their original form:
http://blablabla?query=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}
I was trying to capture the strings using regex but couldn't figure out a proper way to do the replacement. Also, it seemed that using replaceAll() doesn't do the job. Any hints, please?
It looks like you want to change any uppercase character which is not inside ${...} to its lowercase form.
With a construct like
Matcher matcher = ...
StringBuffer buffer = new StringBuffer();
while (matcher.find()){
String matchedPart = ...
...
matcher.appendReplacement(buffer, replacement);
}
matcher.appendTail(buffer);
String result = buffer.toString();
or, since Java 9, we can use Matcher#replaceAll(Function<MatchResult,String> replacer) and rewrite it like
String replaced = matcher.replaceAll(m -> {
String matchedPart = m.group();
...
return replacement;
});
you can dynamically build the replacement based on matchedPart.
So you can let your regex first try to match ${...} and, when that fails (because the regex cursor is not positioned right before a ${...}), let it match [A-Z]. While iterating over the matches you can decide, based on the match (for example its length, or whether it starts with $), whether to use its lowercase form or its original form as the replacement.
BTW, the regex engine lets us put $x (where x is a group number) or ${name} (where name is a named group) in the replacement, so we can reuse those parts of the match. But if we want ${...} to appear literally in the replacement, we need to escape the $ as \$. To avoid doing that manually we can use Matcher.quoteReplacement.
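To make the escaping point concrete, here is a small sketch of what happens with and without Matcher.quoteReplacement. The class name, the helper replaceLiterally and the sample strings are assumptions made for illustration.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replaces every match of regex with the given text taken literally
    static String replaceLiterally(String input, String regex, String literal) {
        return input.replaceAll(regex, Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        // Quoted: "${MACRO}" is inserted as plain text
        System.out.println(replaceLiterally("a=X", "X", "${MACRO}"));
        try {
            // Unquoted: "${MACRO}" is read as a reference to a named group
            // called MACRO, which the pattern "X" does not define
            "a=X".replaceAll("X", "${MACRO}");
        } catch (IllegalArgumentException e) {
            System.out.println("unquoted replacement rejected: " + e.getMessage());
        }
    }
}
```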
Demo:
String yourUrlString = "http://BLABLABLA?qUERY=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}";
Pattern p = Pattern.compile("\\$\\{[^}]+\\}|[A-Z]");
Matcher m = p.matcher(yourUrlString);
StringBuffer sb = new StringBuffer();
while(m.find()){
String match = m.group();
if (match.length() == 1){
m.appendReplacement(sb, match.toLowerCase());
} else {
m.appendReplacement(sb, Matcher.quoteReplacement(match));
}
}
m.appendTail(sb);
String replaced = sb.toString();
System.out.println(replaced);
or in Java 9
String replaced = Pattern.compile("\\$\\{[^}]+\\}|[A-Z]")
.matcher(yourUrlString)
.replaceAll(m -> {
String match = m.group();
if (match.length() == 1)
return match.toLowerCase();
else
return Matcher.quoteReplacement(match);
});
System.out.println(replaced);
Output: http://blablabla?query=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}
This regex will match all the characters before the first &macro, and put everything between http:// and the first &macro in its own group so you can modify it.
http://(.*?)&macro
UPDATE: If you don't want to use groups, this regex will match only the characters between http:// and the first &macro
(?<=http://)(.*?)(?=&macro)
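One way the lookaround version could be used to lowercase just that part is sketched below. It assumes the first macro parameter starts with the literal text &macro, as in the question's example. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlPrefixLowercaseDemo {
    // Matches only the text between "http://" and the first "&macro"
    private static final Pattern PREFIX =
            Pattern.compile("(?<=http://)(.*?)(?=&macro)");

    static String lowercasePrefix(String url) {
        Matcher m = PREFIX.matcher(url);
        if (!m.find()) {
            return url; // no "&macro" after "http://": leave the URL unchanged
        }
        // Rebuild the URL: text before the match, lowercased match, text after it
        return url.substring(0, m.start())
                + m.group().toLowerCase(Locale.ROOT)
                + url.substring(m.end());
    }

    public static void main(String[] args) {
        System.out.println(lowercasePrefix(
                "http://BLABLABLA?qUERY=sth&macro1=${MACRO_STR1}&macro2=${macro_str2}"));
    }
}
```

Note that this only lowercases the text before the first &macro. Anything after it, including non-macro text, is left untouched.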
I have a string like this:
something:POST:/some/path
Now I want to take the POST alone from the string. I did this by using this regex
:([a-zA-Z]+):
But this gives me a value along with colons. ie I get this:
:POST:
but I need this
POST
My code to match the same and replace it is as follows:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
System.out.println(matcher.group());
ss = ss.replaceFirst(":([a-zA-Z]+):", "*");
}
System.out.println(ss);
EDIT:
I've decided to use the lookahead/lookbehind regex since I did not want to use replace with colons such as :*:. This is my final solution.
String s = "something:POST:/some/path/";
String regex = "(?<=:)[a-zA-Z]+(?=:)";
Matcher matcher = Pattern.compile(regex).matcher(s);
if (matcher.find()) {
s = s.replaceFirst(matcher.group(), "*");
System.out.println("replaced: " + s);
}
else {
System.out.println("not replaced: " + s);
}
There are two approaches:
Keep your Java code, and use lookahead/lookbehind (?<=:)[a-zA-Z]+(?=:), or
Change your Java code to replace the result with ":*:"
Note: You may want to define a String constant for your regex, since you use it in different calls.
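Both approaches, with the regexes pulled out into constants as the note suggests, might look like the sketch below. The class, constant and method names are made up for illustration.

```java
class MethodMaskDemo {
    // Approach 1: lookbehind/lookahead, so the colons are not part of the match
    private static final String LOOKAROUND = "(?<=:)[a-zA-Z]+(?=:)";
    // Approach 2: match the colons and put them back in the replacement
    private static final String WITH_COLONS = ":[a-zA-Z]+:";

    static String maskWithLookaround(String s) {
        return s.replaceFirst(LOOKAROUND, "*");
    }

    static String maskWithColons(String s) {
        return s.replaceFirst(WITH_COLONS, ":*:");
    }

    public static void main(String[] args) {
        String s = "something:POST:/some/path/";
        System.out.println(maskWithLookaround(s));
        System.out.println(maskWithColons(s));
    }
}
```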
As pointed out, the regex capture group can be used for the replacement.
The following code did it:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
ss = ss.replaceFirst(matcher.group(1), "*");
}
System.out.println(ss);
UPDATE
Looking at your update, you just need replaceFirst:
String result = s.replaceFirst(":[a-zA-Z]+:", ":*:");
When you use (?<=:)[a-zA-Z]+(?=:), the regex engine checks each location inside the string for a : before it, and once found, tries to match 1+ ASCII letters and then asserts that there is a : after them. With :[A-Za-z]+:, the checking only starts after the regex engine has found a : character. Then, after matching :POST:, the replacement pattern replaces the whole match. It is totally OK to hardcode colons in the replacement pattern since they are hardcoded in the regex pattern.
Original answer
You just need to access Group 1:
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Your :([a-zA-Z]+): regex contains a capturing group (see (....) subpattern). These groups are numbered automatically: the first one has an index of 1, the second has the index of 2, etc.
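The numbering can be seen in a short sketch that returns both the whole match (group 0) and the first capturing group. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Returns { whole match, group 1 } for the first match, or null if none
    static String[] wholeAndGroupOne(String input) {
        Matcher m = Pattern.compile(":([a-zA-Z]+):").matcher(input);
        if (!m.find()) {
            return null;
        }
        // group() is the same as group(0): the entire matched text, colons included
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String[] parts = wholeAndGroupOne("something:POST:/some/path/");
        System.out.println("group(0) = " + parts[0]);
        System.out.println("group(1) = " + parts[1]);
    }
}
```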
To replace it, use Matcher#appendReplacement():
String s = "something:POST:/some/path/";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile(":([a-zA-Z]+):").matcher(s);
while (m.find()) {
m.appendReplacement(result, ":*:");
}
m.appendTail(result);
System.out.println(result.toString());
This is your solution:
regex = (:)([a-zA-Z]+)(:)
And code is:
String ss = "something:POST:/some/path/";
ss = ss.replaceFirst("(:)([a-zA-Z]+)(:)", "$1*$3");
ss now contains:
something:*:/some/path/
Which I believe is what you are looking for...
I want to replace a substring that matches a pattern, only if it does not match a different pattern. For example, in the code shown below, I want to replace all '%s' but leave ':%s' untouched.
String template1 = "Hello:%s";
String template2 = "Hello%s";
String regex = "[%s&&^[:%s]]";
String str = template1.replaceAll(regex, "");
System.out.println(str);
str = template2.replaceAll(regex, "");
System.out.println(str);
The output should be:
Hello:%s
Hello
I am missing something in my regex. Any clues? Thanks!
Use a negative lookbehind to achieve your goal:
String regex = "(?<!:)%s";
It matches %s only if there is not a : right before it.
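Applied to the two templates from the question, a minimal sketch could look like this. The class and method names are invented for the example.

```java
class PlaceholderStripDemo {
    // Removes every %s that is not immediately preceded by a colon
    static String stripPlaceholders(String template) {
        return template.replaceAll("(?<!:)%s", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPlaceholders("Hello:%s")); // the :%s is kept
        System.out.println(stripPlaceholders("Hello%s"));  // the %s is removed
    }
}
```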
I have a sentence: "we:PR show:V".
I want to match only those characters after ":" and before "\\s" using regex pattern matcher.
I used following pattern:
Pattern pattern=Pattern.compile("^(?!.*[\\w\\d\\:]).*$");
But it did not work.
What is the best pattern to get the output?
For a situation such as this, if you are using Java, it may be easier to work with substrings:
// requires java.util.ArrayList and java.util.List
String input = "we:PR show:V";
List<String> results = new ArrayList<String>();
int colonLocation = input.indexOf(':');
while (colonLocation != -1) {
    // the token runs from the colon to the next space, or to the end of the string
    int spaceLocation = input.indexOf(' ', colonLocation);
    if (spaceLocation == -1) {
        spaceLocation = input.length();
    }
    results.add(input.substring(colonLocation + 1, spaceLocation));
    // look for the next colon after the token just extracted
    colonLocation = input.indexOf(':', spaceLocation);
}
return results;
This avoids the regex engine altogether and may be faster than matching with a regex.
The following regex assumes that any non-whitespace characters following a colon (in turn preceded by non-colon characters) are a valid match:
[^:]+:(\S+)(?:\s+|$)
Use like:
String input = "we:PR show:V";
Pattern pattern = Pattern.compile("[^:]+:(\\S+)(?:\\s+|$)");
Matcher matcher = pattern.matcher(input);
int start = 0;
while (matcher.find(start)) {
String match = matcher.group(1); // = "PR" then "V"
// Do stuff with match
start = matcher.end( );
}
The pattern matches, in order:
At least one character that isn't a colon.
A colon.
At least one non-whitespace character (our match).
At least one whitespace character, or the end of input.
The loop continues as long as the regex matches an item in the string, beginning at the index start, which is always adjusted to point to after the end of the current match.
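As an alternative sketch, the start index can be dropped: Matcher.find() without an argument resumes after the previous match on its own. The version below uses the same pattern and collects the matches into a list. The class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExtractorDemo {
    private static final Pattern TAG =
            Pattern.compile("[^:]+:(\\S+)(?:\\s+|$)");

    // Collects the text after each colon, up to the next whitespace or end of input
    static List<String> tags(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {        // each call continues where the last match ended
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tags("we:PR show:V"));
    }
}
```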