%etd(msg01) regular expression? - java

I am trying to write a regular expression for a string like %etd(msg01).
String string = "My name is %etd(msg01) and %etd(msg02)";
Pattern pattern = Pattern.compile("%etd(.+)");
Matcher matcher = pattern.matcher(string);
while(matcher.find()) {
System.out.println(matcher.group());
}
It prints %etd(msg01) and %etd(msg02) as a single match. However, I want it to print %etd(msg01) and %etd(msg02) separately; that is, I am looking for a non-greedy match.
How should the regular expression be changed to make it non-greedy in this situation?

You should use this regex:
Pattern pattern = Pattern.compile("%etd\\([^)]+\\)");

Place a question mark after .* or .+ to make it non-greedy. This should work for you:
Pattern pattern = Pattern.compile("%etd\\(.+?\\)");
The double backslashes in front of the opening and closing parentheses are also necessary, because parentheses carry a special meaning in regular expressions.
Another way is shown below, if you are sure that your names don't contain an opening parenthesis after the first one.
Pattern pattern = Pattern.compile("%etd\\([^(]+\\)");
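To see the difference between the variants discussed above, here is a minimal sketch that runs the greedy, reluctant, and negated-class patterns against the string from the question. The class name EtdMatchDemo and the helper findAll are made up for illustration; only java.util.regex is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EtdMatchDemo {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "My name is %etd(msg01) and %etd(msg02)";
        // Greedy: .+ runs to the last ')' in the input, giving one long match.
        System.out.println(findAll("%etd\\(.+\\)", input));
        // Reluctant: .+? stops at the first ')' after each %etd(.
        System.out.println(findAll("%etd\\(.+?\\)", input));
        // Negated class: [^)]+ can never cross a ')'.
        System.out.println(findAll("%etd\\([^)]+\\)", input));
    }
}
```

The negated class states the intent directly (anything except a closing parenthesis), so it does not depend on backtracking to find the right end point.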

Related

Regex in java to extract specific pattern

I want to match the pattern (including the square brackets, equals, quotes)
[fixedtext="sometext"]
What would be a correct regular expression?
Anything can occur inside quotes. 'fixedtext' is fixed.
Your basic solution (although I'd be skeptical of this, per the comments) is essentially:
"\\[fixedtext=\\\"(.*)\\\"\\]"
which resolves to:
"\[fixedtext=\"(.*)\"\]"
Simple escaping of [] and quotes. The (.*) says capture everything in quotes as a capture group (matcher.group(1)).
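Note that .* is greedy, so it runs to the last "] on the line: for '[fixedtext="abc\"]def"]' you get abc\"]def, but a line holding two such tags is captured as one long match. A non-greedy (.*?) would instead stop at the first "] and return abc\ rather than abc\"]def.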
If you know the ending bracket ends the line, then use:
"\\[fixedtext=\\\"(.*)\\\"\\]$"
(add the $ at the end to mark end of line) and that should be fairly reliable.
My suggestion is using named-capturing groups.
You can find more details here:
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
Here's an example for your input:
String input = "[fixedtext=\"sometext\"]";
Pattern pattern = Pattern.compile("\\[(?<field>.*)=\"(?<value>.*)\"]");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
    System.out.println(matcher.group("field"));
    System.out.println(matcher.group("value"));
} else {
    System.err.println(input + " doesn't match " + pattern);
}
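The answers above point out that (.*) is fragile when the quoted value can contain anything. As a rough sketch, and assuming that a quote inside the value is escaped with a backslash (the question does not actually say so), the value can be matched as a run of "non-quote, non-backslash characters or backslash-escaped characters". The class FixedTextDemo and method values are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedTextDemo {
    // Regex: \[fixedtext="((?:[^"\\]|\\.)*)"\]
    // Group 1 is the quoted value; \\. consumes a backslash plus the character it escapes.
    // Assumption: embedded quotes are written as \" inside the value.
    private static final Pattern TAG =
            Pattern.compile("\\[fixedtext=\"((?:[^\"\\\\]|\\\\.)*)\"\\]");

    // Returns the raw (still escaped) value of every [fixedtext="..."] tag in the input.
    static List<String> values(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TAG.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two tags on one line are reported separately.
        System.out.println(values("[fixedtext=\"a\"] [fixedtext=\"b\"]"));
        // An escaped quote followed by ] does not end the value early.
        System.out.println(values("[fixedtext=\"abc\\\"]def\"]"));
    }
}
```

If values never contain escaped quotes, the simpler [^"]* in place of the alternation would be enough.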

How to get just nested bracket in regex

I'm using Java and I would like to write code that outputs PRP I when the input is (NP (PRP I)).
My current implementation is like the following:
Pattern pattern = Pattern.compile("\\((.+?)\\)");
Matcher matcher = pattern.matcher(noun_phrase);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
and its output is NP (PRP I.
I know that one possibility would be to count the parentheses, but I'm wondering if there is any way to get just the string inside the nested parentheses using regex.
This should work
Pattern pattern = Pattern.compile("\\(.*?\\((.*?)\\)\\)");
Matcher matcher = pattern.matcher("(NP (PRP I))");
while (matcher.find()) {
System.out.println(matcher.group(1));
}
You can use following sites to experiment with Regular expressions.
https://regex101.com/r/cE0dM7/1
http://leaverou.github.io/regexplained/
https://www.debuggex.com/r/gfVglXkY1Cw5D6Mb
You need to add another pair of parentheses around the group. Also, you need to make sure that you don't match parentheses between the fixed ones:
String noun_phrase = "(NP (PRP I))";
Pattern pattern = Pattern.compile("\\([^(]*\\(([^)]*)\\)[^)]*\\)");
Matcher matcher = pattern.matcher(noun_phrase);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
The negated character classes [^(] and [^)] make sure you don't match parentheses too eagerly.
Since I don't know how deeply your parentheses can nest, I will suggest two possible solutions.
Solution 1: Assuming the depth is exactly as in your question.
This regex will work: Pattern pattern = Pattern.compile("\\(([^()]*)\\)").
Solution 2: Assuming the depth is arbitrary (but the innermost string is at least surrounded by parentheses).
In this case, you will have to make some more changes. First, your pattern will look like this: Pattern pattern = Pattern.compile("(\\(.*)*\\(([^)]*)\\)"). See the difference? You now have two groups: the first matches everything but the innermost parenthesized part, and the second is exactly the one you want. That means that in your loop you have to change matcher.group(1) to matcher.group(2). Furthermore, [^)] makes sure you don't have any closing parentheses in your group.
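As a small sketch of Solution 1, the pattern \(([^()]*)\) matches any pair of parentheses that contains no other parentheses, so with find() it reports each innermost group wherever it occurs. The class InnermostParens, the method innermost, and the second sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InnermostParens {
    // A '(' followed by characters that are neither '(' nor ')', then ')'.
    private static final Pattern INNERMOST = Pattern.compile("\\(([^()]*)\\)");

    // Returns the contents of every innermost parenthesized group, left to right.
    static List<String> innermost(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = INNERMOST.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(innermost("(NP (PRP I))"));
        // Hypothetical deeper input: both leaf groups are found.
        System.out.println(innermost("(S (NP (PRP I)) (VP (VBP run)))"));
    }
}
```

Because the character class excludes both kinds of parentheses, the nesting depth around the leaf does not matter for this particular task.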

java regex - not able to retrieve contents from first square brackets

I am pretty new to regular expressions. I need to pick up the contents of the first pair of square brackets. For example, if I have a string like "PORT-OTEF_RA2/6 [Eh0001/001-06] [ignore, test port]",
I need the result as "Eh0001/001-06".
I am using the following regular expression.
Pattern pattern = Pattern.compile("^PORT.+\\[(.*?)\\]");
Matcher matcher = pattern.matcher("PORT-OTEF_RA2/6 [Eh0001/001-06] [ignore, test port]");
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
but I always get the contents of the second pair of square brackets.
However, if I give the regular expression as
Pattern.compile("\\[(.*?)\\]");
I get the required answer. But I need to make sure the string starts with "PORT". Can someone enlighten me on where I am going wrong?
Use a non-greedy quantifier after PORT:
^PORT.+?\\[(.*?)\\]
Otherwise .+ will be greedy and match up to the last [...] in the string.
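The effect of the greedy .+ can be checked with a short sketch that runs both patterns on the string from the question. The class FirstBracketDemo and the helper firstGroup are hypothetical names; the helper returns null when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstBracketDemo {
    // Returns capture group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "PORT-OTEF_RA2/6 [Eh0001/001-06] [ignore, test port]";
        // Greedy .+ consumes as much as possible, so \[ lands on the last '['.
        System.out.println(firstGroup("^PORT.+\\[(.*?)\\]", input));
        // Reluctant .+? consumes as little as possible, so \[ lands on the first '['.
        System.out.println(firstGroup("^PORT.+?\\[(.*?)\\]", input));
        // A string that does not start with PORT is rejected by the ^PORT anchor.
        System.out.println(firstGroup("^PORT.+?\\[(.*?)\\]", "LINK [x]"));
    }
}
```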

Java Regular expressions for filename

I want to check the filenames sent to me against two patterns.
The first regular expression is ~*~, which should match names like ~263~. I put this in online regular expression testers and it matches. The code doesn't work, though; it reports no match.
List<FTPFile> ret = new ArrayList<FTPFile>();
Pattern pattern = Pattern.compile("~*~");
Matcher matcher;
for (FTPFile file : files)
{
    matcher = pattern.matcher(file.getName());
    if (matcher.matches())
    {
        ret.add(file);
    }
}
return ret;
Also, the second pattern I need is ##*, which should match strings like abc#ere#sss.
Please tell me the proper patterns in Java for this.
You need to define your pattern like this:
Pattern pattern = Pattern.compile("~.*~");
~* in your regex ~*~ repeats the first ~ zero or more times, so it won't match the number following the first ~. Because the matches method tries to match the whole input string, this regex fails to match. You need to add .* in between to match strings like ~66~ or ~kjk~. To match only strings that have digits between the ~ characters, use ~\d+~ (written "~\\d+~" in a Java string literal).
Try the regex:
\~.*\~
instead of:
~*~
Example:
Pattern pattern = Pattern.compile("\\~.*\\~");
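The reason online testers report a match while the code does not is likely that testers search for the pattern anywhere in the text (like find()), whereas matches() requires the whole name to match. A minimal sketch with plain strings instead of FTPFile objects; the class TildeNameDemo and its method are hypothetical names.

```java
import java.util.regex.Pattern;

class TildeNameDemo {
    // True only if the entire name matches the regex (same semantics as Matcher.matches()).
    static boolean wholeNameMatches(String regex, String name) {
        return Pattern.compile(regex).matcher(name).matches();
    }

    public static void main(String[] args) {
        String name = "~263~";
        // ~*~ means "zero or more ~, then one ~": it cannot cover the digits.
        System.out.println(wholeNameMatches("~*~", name));
        // find() still succeeds because it accepts a match on just the leading "~".
        System.out.println(Pattern.compile("~*~").matcher(name).find());
        // ~.*~ allows any characters between the two tildes.
        System.out.println(wholeNameMatches("~.*~", name));
        // ~\d+~ allows only digits between the tildes.
        System.out.println(wholeNameMatches("~\\d+~", "~kjk~"));
    }
}
```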

Regular expression to find substring in text

I have a text file that contains some strings I want to extract with a Java regex.
Those strings are in the format:
$numbers,numbers,numbers....,numbers##
(they start with $, followed by groups of numbers separated by commas, and end with ##)
Here is my pattern:
Pattern pattern = Pattern.compile("$*##");
Matcher matcher = pattern.matcher(text);
if (matcher.find())
{
}
It turns out that nothing matches my pattern.
Can anyone tell me what's wrong with it?
You need to do:
Pattern pattern = Pattern.compile("\\$\\$\\d+(,\\d+)*##$");
Thanks to @Pshemo for his valuable input in reaching the solution.
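In a regex an unescaped $ is an end-of-line anchor, not a literal dollar sign, which is why it has to be written as \\$ in the Java string. The pattern in the answer expects two leading dollar signs and a match that ends at the end of the input; the sketch below instead assumes the format exactly as the question describes it (a single leading $, records anywhere in the text). The class DollarRecordDemo, the method extract, and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarRecordDemo {
    // Literal '$', one number, any count of ",number" groups, then "##".
    // Assumption: a single leading '$' and no trailing end-of-input anchor.
    private static final Pattern RECORD = Pattern.compile("\\$\\d+(,\\d+)*##");

    // Returns every record found in the text, in order of appearance.
    static List<String> extract(String text) {
        List<String> records = new ArrayList<>();
        Matcher matcher = RECORD.matcher(text);
        while (matcher.find()) {
            records.add(matcher.group());
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "abc $1,22,333## def $4## ghi";
        System.out.println(extract(text));
    }
}
```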
