${} - Regex Expression - java

I really tried to learn regular expressions properly, but I still struggle whenever I need to build one. It's painful, and I have lost several hours on this one.
So I need the community's help. I have an XML string, and I want to build a regex pattern that identifies any occurrence of:
${Variable1}
${VARIABLE_TEST}
and so on: anything that starts with ${ and ends with }.
Could anyone help me?

Try with following regex:
\${([^}]+)}
Explanation:
\${ - starts with ${ (we have to escape special character $)
([^}]+) - match one or more characters that are not } (captured as group 1)
} - ending character
Regex with { and } escaped (needed in Java, where an unescaped { is rejected as an illegal repetition):
\$\{([^}]+)\}
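As a minimal sketch of how the escaped pattern above could be used from Java, the following collects every placeholder name into a list. The class name PlaceholderFinder, the method findNames, and the sample XML string are made up for illustration; only the regex comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class PlaceholderFinder {
    // Both braces are escaped; group 1 captures the text between them.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Returns the names of all ${...} placeholders found in the input.
    static List<String> findNames(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(input);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed sample input; the empty ${} is skipped because + needs at least one character.
        String xml = "<a href=\"${Variable1}\">${VARIABLE_TEST}</a> ${}";
        System.out.println(findNames(xml)); // prints [Variable1, VARIABLE_TEST]
    }
}
```

If empty placeholders such as ${} should be reported too, replacing + with * would be one way to do it.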

try this
Matcher m = Pattern.compile("\\$\\{(.*?)}").matcher(s);
while(m.find()) {
System.out.println(m.group(1));
}

In JavaScript
const regex = /\${([^}]*)}/g;
const str = " Hello ${Variable1}, Well How are you ${VARIABLE TEST} It empty ${} It's numeric ${123\$\$\$}, numeric plus $pecial char ${#123123} ${\$%!##}";
console.log(str.match(regex));
In Java
final String regex = "\\$\\{([^\\}]*)\\}";
Matcher m = Pattern.compile(regex).matcher(text);
while(m.find()) {
System.out.println(m.group(1));
}

Related

how to exclude "<" in regex match

I have a String which looks like "<name><address> and <Phone_1>". I need to get a result like
1) <name>
2) <address>
3) <Phone_1>
I have tried using regex "<(.*)>" but it returns just one result.
The regex you want is
<([^<>]+?)><([^<>]+?)> and <([^<>]+?)>
Which will then spit out the stuff you want in the 3 capture groups. The full code would then look something like this:
Matcher m = Pattern.compile("<([^<>]+?)><([^<>]+?)> and <([^<>]+?)>").matcher(string);
if (m.find()) {
String name = m.group(1);
String address = m.group(2);
String phone = m.group(3);
}
The pattern .* in a regex is greedy. It will match as many characters as possible between the first < it finds and the last possible > it can find. In the case of your string it finds the first <, then looks for as much text as possible until a >, which it will find at the very end of the string.
You want a non-greedy or "lazy" pattern, which will match as few characters as possible. Simply <(.+?)>. The question mark is the syntax for non-greedy. See also this question.
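To make the difference concrete, here is a small sketch that runs both the greedy and the lazy pattern against the string from the question. The class GreedyVsLazy and the helper allMatches are illustrative names, not part of the original answers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing greedy and lazy quantifiers.
class GreedyVsLazy {
    // Collects every full match of the regex in the input.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "<name><address> and <Phone_1>";
        // Greedy: .* runs to the last > in the string, so there is a single match.
        System.out.println(allMatches("<(.*)>", s));  // prints [<name><address> and <Phone_1>]
        // Lazy: .+? stops at the first > it can, giving one match per tag.
        System.out.println(allMatches("<(.+?)>", s)); // prints [<name>, <address>, <Phone_1>]
    }
}
```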
This will work if you have dynamic number of groups.
Pattern p = Pattern.compile("(<\\w+>)");
Matcher m = p.matcher("<name><address> and <Phone_1>");
while (m.find()) {
System.out.println(m.group());
}

find substring using match regex

Using regex how to find a substring in other string. Here are two strings:
String a= "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease .";
String b = "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeightAverage> ?weight . ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease";
I want to match only
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>
Since this is not quite HTML, an XML/HTML parser will not help here, so you can try a regex. It seems that you want to find text of the form
?drug <someData> ?disease
To describe such text with a regex you need to escape ? (it is a regex special character, the optional quantifier meaning zero or one occurrence), so you need to place \ before it (which in a Java String literal is written as "\\").
Also, the part <someData> can be written as <[^>]+>, which means:
<,
one or more non > after it,
and finally >
So regex to match ?drug <someData> ?disease can be written as
"\\?drug <[^>]+> \\?disease"
But since we are interested only in the part <[^>]+> representing <someData>, we need the regex to capture the content it finds. If we surround part of a regex with parentheses, the string matched by that part is placed in what is called a group, so we can retrieve it from that group afterwards. The final regex can look like
"\\?drug (<[^>]+>) \\?disease"
         ^^^^^^^^^---first group,
and can be used like
String a = "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease .";
String b = "?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeightAverage> ?weight . ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget> ?disease";
Pattern p = Pattern.compile("\\?drug (<[^>]+>) \\?disease");
Matcher m = p.matcher(a);
while (m.find()) {
System.out.println(m.group(1));
}
System.out.println("-----------");
m = p.matcher(b);
while (m.find()) {
System.out.println(m.group(1));
}
which will produce as output
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>
-----------
<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>
There's no need to use a regex here, just do this :
String substr = "<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/possibleDiseaseTarget>";
System.out.println(b.contains(substr)); // prints true
System.out.println(a.contains(substr)); // prints true

Remove occurrences of a given character sequence at the beginning of a string using Java Regex

I have a string that begins with one or more occurrences of the sequence "Re:". This "Re:" can be of any combinations, for ex. Re<any number of spaces>:, re:, re<any number of spaces>:, RE:, RE<any number of spaces>:, etc.
Sample sequence of string : Re: Re : Re : re : RE: This is a Re: sample string.
I want to define a java regular expression that will identify and strip off all occurrences of Re:, but only the ones at the beginning of the string and not the ones occurring within the string.
So the output should look like This is a Re: sample string.
Here is what I have tried:
String REGEX = "^(Re*\\p{Z}*:?|re*\\p{Z}*:?|\\p{Z}Re*\\p{Z}*:?)";
String INPUT = title;
String REPLACE = "";
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(INPUT);
while(m.find()){
m.appendReplacement(sb,REPLACE);
}
m.appendTail(sb);
I am using \p{Z} to match whitespace (I found this somewhere on this forum; I was under the impression that Java regex does not recognize \s).
The problem I am facing with this code is that the search stops at the first match and exits the while loop.
Try something like this replace statement:
yourString = yourString.replaceAll("(?i)^(\\s*re\\s*:\\s*)+", "");
Explanation of the regex:
(?i) make it case insensitive
^ anchor to start of string
( start a group (this is the "re:")
\\s* any amount of optional whitespace
re "re"
\\s* optional whitespace
: ":"
\\s* optional whitespace
) end the group (the "re:" string)
+ one or more times
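The breakdown above can be tried out with a short sketch like the one below, which applies the same replaceAll call to the sample string from the question. The class ReplyPrefixStripper and the method strip are names invented for this example.

```java
// Hypothetical wrapper around the replaceAll call from the answer.
class ReplyPrefixStripper {
    // Removes one or more leading "re:" prefixes, ignoring case and surrounding whitespace.
    static String strip(String title) {
        return title.replaceAll("(?i)^(\\s*re\\s*:\\s*)+", "");
    }

    public static void main(String[] args) {
        String title = "Re: Re : Re : re : RE: This is a Re: sample string.";
        // Only the run of prefixes at the start is removed; the later "Re:" stays.
        System.out.println(strip(title)); // prints This is a Re: sample string.
    }
}
```

Because the whole group is repeated with + and anchored with ^, a single replacement is enough, so no find() loop is needed.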
in your regex:
String regex = "^(Re*\\p{Z}*:?|re*\\p{Z}*:?|\\p{Z}Re*\\p{Z}*:?)"
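Here is what it does: each alternative matches an R (or r) followed by zero or more e characters, zero or more \p{Z} separators, and an optional colon. So at the start of the string it matches things like:
Reee :
or even a lone
R
which makes no sense for what you are trying to do.
You'd be better off with a regex like the following: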
yourString.replaceAll("(?i)^(\\s*re\\s*:\\s*)+", "");
or, to make @Doorknob happy, here's another way to achieve this, using a Matcher:
Pattern p = Pattern.compile("(?i)^(\\s*re\\s*:\\s*)+");
Matcher m = p.matcher(yourString);
if (m.find())
yourString = m.replaceAll("");
(which, as the documentation says, is exactly the same thing as yourString.replaceAll())
(I had the same regex as @Doorknob, but thanks to @jlordo for the replaceAll and @Doorknob for thinking about the (?i) case-insensitivity part ;-) )

Regular expression to get characters before brackets or comma

I'm pulling my hair out a bit with this.
Say I have a string 7f8hd::;;8843fdj fls "": ] fjisla;vofje]]} fd)fds,f,f
I now want to extract 7f8hd::;;8843fdj fls "": from the string, on the premise that the extracted part ends at either a } or ] or , or ). All of those characters could be present; I only need the text up to the first one.
I have tried without success to create a regular expression with a Matcher and Pattern class but I just can't seem to get it right.
The best I could come up with is below but my reg exp just doesn't seem to work like I think it should.
String line = "7f8hd::;;8843fdj fls \"\": ] fjisla;vofje]]} fd)fds,f,f";
Matcher m = Pattern.compile("(.*?)\\}|(.*?)\\]|(.*?)\\)|(.*?),").matcher(line);
while (m.find()) {
System.out.println(m.group());
}
I'm clearly not understanding reg exp correctly. Any help would be great.
^[^\]}),]*
matches from the start of the string until (but excluding) the first ], }, ) or ,.
In Java:
Pattern regex = Pattern.compile("^[^\\]}),]*");
Matcher regexMatcher = regex.matcher(line);
if (regexMatcher.find()) {
System.out.println(regexMatcher.group());
}
(You can actually remove the backslashes ([^]}),]), but I like to keep them there for clarity and for compatibility since not all regex engines recognize that idiom.)
Explanation:
^ # Match the start of the string
[^\]}),]* # Match zero or more characters except ], }, ) or ,
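For completeness, here is a self-contained sketch that runs the pattern against the string from the question. The class PrefixBeforeDelimiter and the method prefix are illustrative names; note that the double quotes inside the sample have to be escaped in a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper built around the negated character class from the answer.
class PrefixBeforeDelimiter {
    // Everything from the start of the string up to (not including) the first ], }, ) or ,
    private static final Pattern PREFIX = Pattern.compile("^[^\\]}),]*");

    static String prefix(String line) {
        Matcher m = PREFIX.matcher(line);
        // The * quantifier allows an empty match, so find() succeeds on any input.
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String line = "7f8hd::;;8843fdj fls \"\": ] fjisla;vofje]]} fd)fds,f,f";
        // The result keeps the space that precedes the first ].
        System.out.println("[" + prefix(line) + "]"); // prints [7f8hd::;;8843fdj fls "": ]
    }
}
```

Calling trim() on the result would be one way to drop the trailing space if it is not wanted.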
you could just cut the rest part by replaceAll:
String newStr = yourStr.replaceAll("[\\])},].*", "");
or by split() and get the first element.
String newStr = yourStr.split("[\\])},]")[0];
You can use this (as java string):
"(.+?)[\\]},)].*"
here is a fiddle
Could you try the regular expression (.*?)[}\]),](.*?)? I tested it on Rubular and it worked against your example.

regex pattern - extract a string only if separated by a hyphen

I've looked at other questions, but they didn't lead me to an answer.
I've got this code:
Pattern p = Pattern.compile("exp_(\\d{1}-\\d)-(\\d+)");
The string I want to be matched is: exp_5-22-718
I would like to extract 5-22 and 718. I'm not sure why it's not working. What am I missing? Many thanks.
Try this one:
Pattern p = Pattern.compile("exp_(\\d-\\d+)-(\\d+)");
In your original pattern you specified that the second number should contain exactly one digit, so I used \d+ to match as many digits as possible.
Also I removed {1} from the first number definition as it does not add value to regexp.
If the string is always prefixed with exp_ I wouldn't use a regular expression.
I would:
replaceFirst() exp_
split() the resulting string on -
Note: This answer is based on that assumption. I offer it as a more robust option if you have multiple hyphens. However, if you need to validate the format of the digits, then a regular expression may be better.
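The two steps above might look like the following sketch. The class ExpSplitter and the method parts are hypothetical names; the approach assumes, as the answer does, that the input always starts with exp_.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the replaceFirst() + split() approach.
class ExpSplitter {
    // Drops the leading "exp_" and splits what is left on hyphens.
    static String[] parts(String s) {
        return s.replaceFirst("^exp_", "").split("-");
    }

    public static void main(String[] args) {
        String[] p = parts("exp_5-22-718");
        System.out.println(Arrays.toString(p)); // prints [5, 22, 718]
        // The question wants "5-22" and "718", so the first two pieces are rejoined here.
        System.out.println(p[0] + "-" + p[1]);  // prints 5-22
        System.out.println(p[2]);               // prints 718
    }
}
```

Unlike the regex with capture groups, this performs no validation: any hyphen-separated text after the prefix is accepted.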
In your regexp you are missing the required quantifier on the second \\d. That quantifier should be + or {2}.
String yourString = "exp_5-22-718";
Matcher matcher = Pattern.compile("exp_(\\d-\\d+)-(\\d+)").matcher(yourString);
if (matcher.find()) {
System.out.println(matcher.group(1)); //prints 5-22
System.out.println(matcher.group(2)); //prints 718
}
You can use the String.split method to do this. Check the following code.
I assume that your strings start with "exp_".
String str = "exp_5-22-718";
if (str.contains("-")){
String newStr = str.substring(4, str.length());
String[] strings = newStr.split("-");
for (String string : strings) {
System.out.println(string);
}
}
