How to extract values in between { } through regex? [duplicate] - java

This question already has answers here:
Java Regex matching between curly braces
(5 answers)
Closed 6 years ago.
I am trying to extract value between { } using
"\\(\\{[^}]+\\}\\)"
regex in java. My input is
String text = "Hi this is {text to be extracted}.";
I want output as
"text to be extracted"
but that regex isn't working.

Try this:
"\\{([^}]*)\\}"
Group 1 ($1) then contains the text to be extracted.
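A minimal sketch of how that pattern could be used from Java, assuming you want every {...} value in the input rather than only the first. The class and method names (BraceExtractor, extractAll) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BraceExtractor {
    // Literal '{', then group 1 = everything up to the next '}', then literal '}'.
    private static final Pattern BRACES = Pattern.compile("\\{([^}]*)\\}");

    // Returns the contents of every {...} pair found in the input, in order.
    static List<String> extractAll(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = BRACES.matcher(input);
        while (m.find()) {          // find() scans for the next partial match
            values.add(m.group(1)); // group 1 excludes the braces themselves
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [text to be extracted]
        System.out.println(extractAll("Hi this is {text to be extracted}."));
    }
}
```

find() is used instead of matches() here because the braces are only a part of the input string.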

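The regex is malformed: the escaped parentheses \\( and \\) match literal ( and ) characters, which the input does not contain.
Leave the parentheses unescaped so that they form a group. Because matches() must match the entire input, the pattern also needs .* before and after the group.
You can also use a named group to extract exactly the text you care about.
Here is working code: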
String text = "Hi this is {text to be extracted}.";
Pattern p = Pattern.compile(".*\\{(?<t>[^}]+)\\}.*");
Matcher m = p.matcher(text);
if (m.matches()) {
    System.out.println(m.group("t"));
}
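To see the effect of the escaped parentheses in isolation, here is a small sketch that compares the pattern from the question with a version whose parentheses are left unescaped. The class name EscapedParenDemo and the helper found() are hypothetical.

```java
import java.util.regex.Pattern;

class EscapedParenDemo {
    // Pattern from the question: \( and \) demand literal parentheses in the input.
    static final Pattern ORIGINAL = Pattern.compile("\\(\\{[^}]+\\}\\)");
    // Same pattern with unescaped parentheses, which now form a capturing group.
    static final Pattern FIXED = Pattern.compile("(\\{[^}]+\\})");

    // True if the pattern matches anywhere inside the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "Hi this is {text to be extracted}.";
        System.out.println(found(ORIGINAL, text));                 // false
        System.out.println(found(FIXED, text));                    // true
        System.out.println(found(ORIGINAL, "Hi this is ({x}).")); // true
    }
}
```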

Related

Regex to find substring and wrap it in tags - Java [duplicate]

This question already has answers here:
Can I replace groups in Java regex?
(6 answers)
Closed 1 year ago.
I'm trying to write code which finds all words that are in the following format:
"some text":
That is, alphanumeric characters which are enclosed in " characters and followed by a : character. Once I find these, my goal is to remove the " characters and wrap the entire string in a <strong> tag. So after the code has run, the above text would look like this:
<strong>some text:</strong>
So I believe the regex to find such text for wrapping is the following (formatted for Java),
(\".*?\"):{1}
And by using the following code, I should be able to iterate through all the matches.
Pattern pattern = Pattern.compile("(\".*?\"):{1}");
Matcher matcher = pattern.matcher(stringToSearch);
while (matcher.find()) {
    String matchedGroup = matcher.group();
    matchedGroup = matchedGroup.replaceAll("\"", "");
    matchedGroup = "<strong>" + matchedGroup + "</strong>";
    // now what?
}
So I probably went about that all wrong.
Now that I've wrapped the word I wanted in a strong tag, how do I "put it back" where it was?
Assuming you expect your double quotes to always be balanced, you may just use String#replaceAll here for a more terse solution:
String input = "Here is \"some text\": and also \"some other text\":";
String output = input.replaceAll("\"(.*?)\":", "<strong>$1:</strong>");
System.out.println(output);
This prints:
Here is <strong>some text:</strong> and also <strong>some other text:</strong>
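If you would rather keep the find() loop from the question (for example because each replacement needs extra logic), one way to "put it back" is Matcher.appendReplacement together with Matcher.appendTail. The following is only a sketch; the class name StrongWrapper and the method wrap() are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StrongWrapper {
    // A quoted run of text followed by ':'; group 1 is the text without the quotes.
    private static final Pattern QUOTED_LABEL = Pattern.compile("\"(.*?)\":");

    static String wrap(String input) {
        Matcher m = QUOTED_LABEL.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = "<strong>" + m.group(1) + ":</strong>";
            // Copies the text before this match, then the replacement.
            // quoteReplacement keeps any '$' or '\' in the captured text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copies whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrap("Here is \"some text\": and also \"some other text\":"));
    }
}
```

For a plain substitution like this one, replaceAll is shorter; the loop is mainly useful when the replacement has to be computed per match.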

Using or '|' in regex [duplicate]

This question already has answers here:
Difference between matches() and find() in Java Regex
(5 answers)
Closed 5 years ago.
I am stuck on a simple issue: I want to check whether any of the words he, be, or de is present in my text.
So I created the pattern (shown in the code) using '|' to symbolize OR
and then matched it against my text. But the match gives me a false result (in the print statement).
I tried the same match in Notepad++ using regex search and it worked there, but it gives false (no match) in Java.
public class Del {
    public static void main(String[] args) {
        String pattern = "he|be|de";
        String text = "he is ";
        System.out.println(text.matches(pattern));
    }
}
Can anyone suggest what I am doing wrong?
Thanks
That is because String.matches() tries to match the entire string rather than looking for a matching part of it. For example, this code finds the part of the string that conforms to the regex:
Matcher m = Pattern.compile("he|be|de").matcher("he is ");
m.find(); //true
If you want to use matches(), which tests the entire string, and check whether that string contains he, be, or de, use the regex .*(he|be|de).*
Here . means any character, and * means the preceding element may occur zero or more times.
Example:
"he is ".matches(".*(he|be|de).*"); //true
String regExp="he|be|de";
Pattern pattern = Pattern.compile(regExp);
String text = "he is ";
Matcher matcher = pattern.matcher(text);
System.out.println(matcher.find());
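To make the difference concrete, here is a small side-by-side sketch of matches() and find() on the same pattern. The class name MatchesVsFind and its two helper methods are hypothetical.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    static final Pattern WORDS = Pattern.compile("he|be|de");

    // matches(): the pattern has to cover the whole input.
    static boolean wholeInputMatches(String text) {
        return WORDS.matcher(text).matches();
    }

    // find(): the pattern only has to occur somewhere in the input.
    static boolean containsMatch(String text) {
        return WORDS.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("he is ")); // false
        System.out.println(containsMatch("he is "));     // true
        System.out.println(wholeInputMatches("he"));     // true
    }
}
```

Note that find() looks for substrings, so it also reports a match for a text such as "the", which contains "he".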

java split by bracket and keep the delimiter - RegEx [duplicate]

This question already has answers here:
How do I split a string in Java?
(39 answers)
Closed 6 years ago.
I am trying to split a string using a regex with the closing bracket as the delimiter, and I have to keep the bracket.
Input string: (GROUP=test1)(GROUP=test2)(GROUP=test3)(GROUP=test4)
Required output:
(GROUP=test1)
(GROUP=test2)
(GROUP=test3)
(GROUP=test4)
I am using the Java regex "\([^)]*?\)" and it throws an error. Below is the code I am using; the error is thrown when I try to get the group.
Pattern splitDelRegex = Pattern.compile("\\([^)]*?\\)");
Matcher regexMatcher = splitDelRegex.matcher("(GROUP=test1)(GROUP=test2) (GROUP=test3)(GROUP=test4)");
List<String> matcherList = new ArrayList<String>();
while (regexMatcher.find()) {
    String perm = regexMatcher.group(1);
    matcherList.add(perm);
}
Any help is appreciated. Thanks.
You simply forgot to put capturing parentheses around the entire regex. You are not capturing anything at all. Just change the regex to
Pattern splitDelRegex = Pattern.compile("(\\([^)]*?\\))");
                                         ^            ^
I tested this in Eclipse and got your desired output.
You could use
str.split("\\)")
Note that split() takes a regex, so the closing parenthesis has to be escaped. That returns an array of strings which you know are lacking the closing parentheses, so you can add them back in afterwards. That seems much easier and less error-prone to me.
You could try changing this line :
String perm = regexMatcher.group(1);
To this :
String perm = regexMatcher.group();
group() with no argument returns the entire match (group 0), so it works even though the regex has no capturing group.
I'm not sure why you need to split the string at all. You can capture each of the bracketed groups with a regex.
Try this regex (\\([a-zA-Z0-9=]*\\)). I have a capturing group () that looks for text that starts with a literal \\(, contains [a-zA-Z0-9=] zero or many times * and ends with a literal \\). This is a pretty loose regex, you could tighten up the match if the text inside the brackets will be predictable.
String input = "(GROUP=test1)(GROUP=test2)(GROUP=test3)(GROUP=test4)";
String regex = "(\\([a-zA-Z0-9=]*\\))";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) { // find the next match
    System.out.println(matcher.group()); // print the match
}
Output:
(GROUP=test1)
(GROUP=test2)
(GROUP=test3)
(GROUP=test4)
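If you really do want String.split and want to keep the delimiter, another option is to split on a zero-width lookbehind, so that the closing bracket is not consumed. This is just a sketch; the class name BracketSplitter and the method splitKeepingBracket() are made up for illustration.

```java
import java.util.Arrays;

class BracketSplitter {
    static String[] splitKeepingBracket(String input) {
        // (?<=\)) is a zero-width lookbehind: it matches the position right
        // after a ')' without consuming it, so the bracket stays in the piece.
        return input.split("(?<=\\))");
    }

    public static void main(String[] args) {
        String input = "(GROUP=test1)(GROUP=test2)(GROUP=test3)(GROUP=test4)";
        // prints [(GROUP=test1), (GROUP=test2), (GROUP=test3), (GROUP=test4)]
        System.out.println(Arrays.toString(splitKeepingBracket(input)));
    }
}
```

Any text between two groups (such as a stray space) ends up at the start of the following piece, so the Matcher approach above is stricter about what counts as a group.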

Java Regex does not match - groups [duplicate]

This question already has answers here:
Java Regex does not match
(6 answers)
Closed 9 years ago.
I know that this kind of question is asked very often, but
I can't figure out why this regex does not match.
I want to check whether there is an "M" at the beginning of the line or not.
Finally, I want the path at the end of the line.
This is why startsWith() doesn't fit my needs.
String line = "M 72208 70779 aab src\\com\\aut\\testproject\\TestDomainf1.java";
if (line.matches("^(M?)(.*)$")) {}
I've also tried it the other way:
Pattern p = Pattern.compile("(M?)");
Matcher m = p.matcher(line);
if (m.matches()) {
    System.out.println("yay!");
}
if (line.matches("(M?)(.*)")) {}
Thanks
Seems to be simple:
if (line.startsWith("M")) {
    String[] tokens = line.split("\\s+");
    String path = tokens[tokens.length - 1];
}
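If you prefer a single regex that answers both questions at once, something like the sketch below might work. It assumes the path is the last whitespace-separated token on the line and contains no spaces (the same assumption the split above makes). The class name StatusLineParser and its methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatusLineParser {
    // Group 1: optional leading "M" marker followed by whitespace.
    // Group 2: the last run of non-whitespace characters (assumed to be the path).
    private static final Pattern LINE = Pattern.compile("^(M\\s+)?.*?(\\S+)$");

    // True if the line matches and starts with the "M" marker.
    static boolean isModified(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() && m.group(1) != null;
    }

    // The last token of the line, or null if the line has no tokens.
    static String path(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "M 72208 70779 aab src\\com\\aut\\testproject\\TestDomainf1.java";
        System.out.println(isModified(line)); // true
        System.out.println(path(line));
    }
}
```

Because group 1 is optional, m.group(1) is null when the marker is absent, which is how the two cases are told apart.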

Java regex split string by comma but ignore quotes and also parentheses [duplicate]

This question already has answers here:
Java: splitting a comma-separated string but ignoring commas in quotes
(12 answers)
Closed 9 years ago.
I'm stuck with this regex.
So, I have input as:
"Crane device, (physical object)"(X1,x2,x4), not "Seen by research nurse (finding)", EntirePatellaBodyStructure(X1,X8), "Besnoitia wallacei (organism)", "Catatropis (organism)"(X1,x2,x4), not IntracerebralRouteQualifierValue, "Diospyros virginiana (organism)"(X1,x2,x4), not SuturingOfHandProcedure(X1)
and in the end I would like to get:
"Crane device, (physical object)"(X1,x2,x4)
not "Seen by research nurse (finding)"
EntirePatellaBodyStructure(X1,X8)
"Besnoitia wallacei (organism)"
"Catatropis (organism)"(X1,x2,x4)
not IntracerebralRouteQualifierValue
"Diospyros virginiana (organism)"(X1,x2,x4)
not SuturingOfHandProcedure(X1)
I've tried regex
(\'[^\']*\')|(\"[^\"]*\")|([^,]+)|\\s*,\\s*
It works if I don't have a comma inside parentheses.
RegEx
(\w+\s)?("[^"]+"|\w+)(\(\w\d(,\w\d)*\))?
Java Code
String input = ... ;
Matcher matcher = Pattern.compile(
        "(\\w+\\s)?(\"[^\"]+\"|\\w+)(\\(\\w\\d(,\\w\\d)*\\))?").matcher(input);
while (matcher.find()) {
    System.out.println(matcher.group());
}
Output
"Crane device, (physical object)"(X1,x2,x4)
not "Seen by research nurse (finding)"
EntirePatellaBodyStructure(X1,X8)
"Besnoitia wallacei (organism)"
"Catatropis (organism)"(X1,x2,x4)
not IntracerebralRouteQualifierValue
"Diospyros virginiana (organism)"(X1,x2,x4)
not SuturingOfHandProcedure(X1)
Don't use regexes for this. Write a simple parser that keeps track of the number of parentheses encountered, and whether or not you are inside quotes. For more information, see: RegEx match open tags except XHTML self-contained tags
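A rough sketch of what such a parser could look like, assuming quotes are never escaped inside quoted text and that a comma only separates items when it is outside both quotes and parentheses. The class name TopLevelCommaSplitter and the method split() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelCommaSplitter {
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // are we between two '"' characters?
        int depth = 0;            // number of currently open '(' outside quotes

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes && c == '(') {
                depth++;
            } else if (!inQuotes && c == ')' && depth > 0) {
                depth--;
            }
            if (c == ',' && !inQuotes && depth == 0) {
                // Top-level comma: finish the current item and start a new one.
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString().trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "\"Crane device, (physical object)\"(X1,x2,x4), "
                + "not \"Seen by research nurse (finding)\", EntirePatellaBodyStructure(X1,X8)";
        for (String part : split(input)) {
            System.out.println(part);
        }
    }
}
```

Unlike a single regex, the depth counter also copes with nested parentheses.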
Would this do what you need?
System.out.println(yourString.replaceAll(", not", "\nnot"));
Assuming that there is no possibility of nesting () within (), and no possibility of (say) \" within "", you can write something like:
// An item is a run of quoted strings, parenthesized groups, and other
// characters; the last alternative excludes ',' so an item ends at a top-level comma.
private static final Pattern CUSTOM_SPLIT_PATTERN =
        Pattern.compile("\\s*((?:\"[^\"]*\"|[(][^)]*[)]|[^\"(,]+)+)");

private static String[] customSplit(final String input) {
    final List<String> ret = new ArrayList<String>();
    final Matcher m = CUSTOM_SPLIT_PATTERN.matcher(input);
    while (m.find()) {
        ret.add(m.group(1));
    }
    return ret.toArray(new String[ret.size()]);
}
(disclaimer: not tested).
