Need help in Pattern matching in java - java

I have this input string:
String str = "IN Param - {Parameter|String}{Parameter|String} Out Param - {Parameter Label|String}{Parameter Label2|String}";
I should be able to get
{Parameter|String}{Parameter|String}
from IN Param and
{Parameter Label|String}{Parameter Label2|String}
from Out Param.
Then, within IN Param, I should be able to get Parameter and String. How can I do this with regular expression matching in Java?

This is possible with capturing groups.
The regex is:
"\\{(.*?)\\|(.*?)\\}"
Group 1 captures Parameter
Group 2 captures String
In this regex, \\{(.*?)\\| matches a {, then lazily matches zero or more characters up to the next |, and stores those characters in group 1, excluding the { and the |. \\|(.*?)\\} works the same way, but stores the text between the | and the } in group 2.

Pattern p = Pattern.compile("\\{([^|]+)\\|([^}]+)\\}");
Matcher m = p.matcher(str);
while (m.find()) {
String label = m.group(1);
String value = m.group(2);
// do what you need with them
}
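The question also asks for the IN Param and Out Param parts separately before each is split into pairs. Here is a minimal sketch of that two-step approach, assuming the input always contains the literal labels "IN Param - " and "Out Param - " in that order. The names ParamSections, sections and pairs are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParamSections {
    // Assumption: both labels are present, IN first, Out second.
    private static final Pattern SECTIONS =
            Pattern.compile("IN Param - (.*?)\\s*Out Param - (.*)");
    // One {name|type} pair, as in the answer above.
    private static final Pattern PAIR =
            Pattern.compile("\\{([^|]+)\\|([^}]+)\\}");

    // Returns { inSection, outSection }.
    static String[] sections(String input) {
        Matcher m = SECTIONS.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + input);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Returns every { name, type } pair found in one section.
    static List<String[]> pairs(String section) {
        List<String[]> result = new ArrayList<>();
        Matcher m = PAIR.matcher(section);
        while (m.find()) {
            result.add(new String[] { m.group(1), m.group(2) });
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "IN Param - {Parameter|String}{Parameter|String} "
                + "Out Param - {Parameter Label|String}{Parameter Label2|String}";
        for (String section : sections(str)) {
            System.out.println(section);
            for (String[] pair : pairs(section)) {
                System.out.println("  " + pair[0] + " / " + pair[1]);
            }
        }
    }
}
```

The lazy (.*?) in SECTIONS stops at the first place where "Out Param - " can follow, so the IN section does not swallow the Out section.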

Related

No match for Java Regular Expression

I am running into an issue where my code is unable to find regex occurrences. Code:
String content = "This\\ is\\ an\\ example.=This is an example\nThis\\ is\\ second\\:=This is second";
String regex = "\"^.*(?=\\=)\"gm";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(content);
List<String> mKeys = new ArrayList<>();
while (m.find()) {
mKeys.add(m.group());
}
mKeys turns out to be empty. I have already validated my regex here https://regex101.com/r/YResRc/3. I am expecting the list to contain two keys from the content.
Your content contains no " quotes, and no text gm, so why would you expect that regex to match?
FYI: Syntaxes like "foo"gm or /foo/gm are something other languages do for regex literals. Java doesn't do that.
The g flag is implied by the fact that you're using a find() loop. The m flag is the MULTILINE flag, which affects ^ and $; you can specify it with the inline (?m) flag or by passing a second argument to compile(), i.e. one of these ways:
Pattern p = Pattern.compile("foo", Pattern.MULTILINE);
Pattern p = Pattern.compile("(?m)foo");
Your regex should simply be:
(?m)^.*(?==)
which means: Match everything from the beginning of a line up to the last = sign on the line.
Test
String content = "This is an example.=This is an example\nThis is second:=This is second";
String regex = "(?m)^.*(?==)";
Matcher m = Pattern.compile(regex).matcher(content);
List<String> mKeys = new ArrayList<>();
while (m.find()) {
mKeys.add(m.group());
}
System.out.println(mKeys);
Output
[This is an example., This is second:]
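To see what the MULTILINE flag changes, the following sketch runs the same pattern without the flag, with Pattern.MULTILINE, and with the inline (?m) form. The class name MultilineDemo and the helper findAll are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineDemo {
    // Collects every match of p in input, in order.
    static List<String> findAll(Pattern p, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String content = "This is an example.=This is an example\nThis is second:=This is second";

        // Without MULTILINE, ^ matches only at the start of the whole input,
        // so only the key on the first line is found.
        System.out.println(findAll(Pattern.compile("^.*(?==)"), content));

        // With MULTILINE, ^ also matches after each line terminator.
        System.out.println(findAll(Pattern.compile("^.*(?==)", Pattern.MULTILINE), content));

        // The inline flag (?m) is equivalent to passing Pattern.MULTILINE.
        System.out.println(findAll(Pattern.compile("(?m)^.*(?==)"), content));
    }
}
```

The first call finds one key; the second and third find both.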

A sample regular expression

I have a sample content string, repeated in a file, from which I want to retrieve a double value. The string content is "(AIC)|234.654 |" and I want to retrieve the 234.654 from it. The "(AIC)|" part is always fixed, but the number changes from one occurrence to the next, so I am using a regular expression as follows. However, it reports no match with the expression below. Any help would be appreciated.
String contents="(AIC)|234.654 |";
Pattern p = Pattern.compile("AIC\\u0029{1}\\u007C{1}\\d+u002E{1}\\d+");
Matcher m = p.matcher(contents);
boolean b = m.find();
String t=m.group();
The above expression doesn't find any match and throws an exception.
Thanks for any help
Your code has several typos, but besides them, you say you need to match the number that follows the fixed "(AIC)|" prefix, yet you are referring to the whole match with .group(). You need to set a capturing group to access that number with .group(1).
Here is the fixed code:
String content="(AIC)|234.654 |";
Pattern p = Pattern.compile("AIC\\)\\|(\\d+\\.\\d+)");
Matcher m = p.matcher(content);
if (m.find())
{
System.out.println(m.group(1));
}
If the number can be integer, just use an optional non-capturing group around the decimal part: Pattern.compile("AIC\\)\\|(\\d+(?:\\.\\d+)?)");
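Since the goal is a double value, the captured text still has to be converted. Here is a small sketch that uses the optional-decimal pattern above and assumes that a line without a match should yield an empty result rather than an exception. AicParser and parseAic are illustrative names, not part of the original code.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AicParser {
    // Group 1 holds the digits, with an optional decimal part.
    private static final Pattern AIC =
            Pattern.compile("AIC\\)\\|(\\d+(?:\\.\\d+)?)");

    // Returns the number after "(AIC)|", or empty if the line has none.
    static OptionalDouble parseAic(String line) {
        Matcher m = AIC.matcher(line);
        if (!m.find()) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(Double.parseDouble(m.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(parseAic("(AIC)|234.654 |"));
        System.out.println(parseAic("(AIC)|42 |"));
        System.out.println(parseAic("no value here"));
    }
}
```

Checking find() before calling group(1) avoids the IllegalStateException that the original code ran into.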
I think this regex should do the work:
(?<=\|)[\d\.]*(?=\s*\|)
It matches only digits and dots that come after a | and before optional whitespace followed by another |.
And the complete code:
String content="(AIC)|234.654 |";
Pattern p = Pattern.compile("(?<=\\|)[\\d\\.]*(?=\\s*\\|)");
Matcher m = p.matcher(content);
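// Call group() only after find() has succeeded; otherwise it throws IllegalStateException.
if (m.find()) {
String t = m.group(); // "234.654"
}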

Text wrongly matches a substring of words in group

I want to check whether the text starts with what or who and is a question, so I wrote the following code:
private static void startWithQOrIf(String commentstr){
String urlPattern = "(|who|what).*\\?.*$";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.find()) {
System.out.println("yes");
}
}
Everything works well, but for example when I try:
whooooooooo is the follower?
it matches as well, but it should not, because I am looking for who, not whooooooooo.
Any idea?
You can ensure a whole word using a word boundary \b:
(|who|what)\\b.*\\?.*$
           ^^^
If the words in the alternation group are supposed to appear at the start of the string, you can just use matches() and remove the $ anchor:
String urlPattern = "(|who|what)\\b.*\\?.*";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.matches()) { // < - Here, matches is used
System.out.println("yes");
}
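Note that (|who|what) matches either an empty string, or who, or what. Because the empty alternative followed by \b succeeds at the start of any text that begins with a word character, whooooooooo is the follower? still matches while that alternative is present. If you do not plan to allow an empty string, use just (who|what).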
You must use word boundaries.
String urlPattern = "\\b(who|what)\\b.*\\?.*$";
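Putting the two answers together, here is a sketch of a check that requires who or what at the start of the text, a word boundary right after it, and a question mark later on the same line. Anchoring with ^ and the names QuestionCheck and isWhoWhatQuestion are assumptions made for this example.

```java
import java.util.regex.Pattern;

class QuestionCheck {
    // ^           start of the text
    // (who|what)  one of the two question words, no empty alternative
    // \\b         the word must end here, so "whooo" is rejected
    // .*\\?       a question mark somewhere later on the same line
    private static final Pattern QUESTION =
            Pattern.compile("^(who|what)\\b.*\\?", Pattern.CASE_INSENSITIVE);

    static boolean isWhoWhatQuestion(String text) {
        return QUESTION.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isWhoWhatQuestion("who is the follower?"));
        System.out.println(isWhoWhatQuestion("whooooooooo is the follower?"));
        System.out.println(isWhoWhatQuestion("What time is it?"));
        System.out.println(isWhoWhatQuestion("who is there"));
    }
}
```

The second and fourth calls return false: the second because there is no word boundary inside whooooooooo, the fourth because there is no question mark.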

get the pattern before a given word

Below is my text:
12,7 C84921797-6 Provisoirement, 848,80 smth
I want to extract the value 848,80 with the float pattern: [-+]?[0-9]*\\,?[0-9]+
but the code I am using extracts only the first value matching the pattern, which is 12,7.
This is my method:
String display(String pattern , String result){
String value= null
Pattern p = Pattern.compile(pattern);//compiles the pattern
Matcher matcher = p.matcher(result);//check if the result contains the pattern
if(matcher.find()) {
//get the first value found corresponding to the pattern
value = matcher.group(0)
}
return value
}
When I call this method:
String val=display("[-+]?[0-9]*\\,?[0-9]+" ," 12,7 C84921797-6 Provisoirement, 848,80 smth" )
println("val---"+val)
OUTPUT :
val---12,7
I want to use the word smth after the value to extract the correct value. How can I proceed?
You can add smth to your regex after the part you are interested in. Just place the interesting part in parentheses to create a group, and refer to the text matched by this group via Matcher's group(id) method, like this:
Pattern p = Pattern.compile("([-+]?[0-9]*\\,?[0-9]+)\\s+smth");
Matcher matcher = p.matcher(result);
if(matcher.find())
{
value = matcher.group(1); //get the first value found corresponding to the pattern
}
Another method would be to use a look-ahead to test whether smth follows the part you are interested in. So your regex could look like this:
Pattern p = Pattern.compile("[-+]?[0-9]*\\,?[0-9]+(?=\\s+smth)");
Because a look-ahead is zero-length, it is not included in the match, so you can use group(0), or the simpler group(), from Matcher to get the result you want.
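The two variants can be compared side by side. This is a sketch using the input from the question; the class name SmthExtract and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SmthExtract {
    // Variant 1: smth is part of the match, the number sits in group 1.
    private static final Pattern WITH_GROUP =
            Pattern.compile("([-+]?[0-9]*\\,?[0-9]+)\\s+smth");
    // Variant 2: smth is only checked by a look-ahead, so the whole
    // match is the number itself.
    private static final Pattern WITH_LOOKAHEAD =
            Pattern.compile("[-+]?[0-9]*\\,?[0-9]+(?=\\s+smth)");

    static String withGroup(String text) {
        Matcher m = WITH_GROUP.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    static String withLookahead(String text) {
        Matcher m = WITH_LOOKAHEAD.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = " 12,7 C84921797-6 Provisoirement, 848,80 smth";
        System.out.println(withGroup(text));
        System.out.println(withLookahead(text));
    }
}
```

Both methods return the same number for this input; they differ only in whether smth is consumed by the match.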
([\\d\\,]+) smth
With this, $1 (group 1) holds the float number you wanted.
If you always have smth (note one whitespace) after your number representation, try this:
String input = "12,7 C84921797-6 Provisoirement, 848,80 smth";
// [-+]?        optional sign
// \\d+         first part of the number
// (,\\d+)*     optional comma followed by more digits, repeated
// (?=\\ssmth)  lookahead for " smth"
Pattern p = Pattern.compile("[-+]?\\d+(,\\d+)*(?=\\ssmth)");
Matcher m = p.matcher(input);
if (m.find()) {
System.out.println("Found --> " + m.group());
}
Output
Found --> 848,80
Short and simple:
Pattern p = Pattern.compile("\\s+\\d+,\\d+");
http://fiddle.re/n17np

Java Regex: how to capture multiple matches in the same line

I am trying to match a regex pattern in Java, and I have two questions:
1. Inside the pattern I'm looking for, there is a known beginning and then an unknown string that I want to get up until the first occurrence of an &.
2. There are multiple occurrences of these patterns in the line, and I would like to get each occurrence separately.
For example I have this input line:
1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate%7C120HZ&sName=View+All&subCatView=true 0 2819357575609397706
And I am interested in these strings:
Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.
Screen+Refresh+Rate%7C120HZ
Assuming the known beginning is filter=**, the regular expression pattern (?:filter=\\*\\*)(.*?)(?:&) should get you what you need. Use Matcher.find() to get all occurrences of the pattern in a given string. Using the test string you provided, the following:
final Pattern p = Pattern.compile("(?:filter=\\*\\*)(.*?)(?:&)");
final Matcher m = p.matcher(testString);
int cnt = 0;
while (m.find()) {
System.out.println(++cnt + ": G1: " + m.group(1));
}
Will output:
1: G1: Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.
2: G1: Screen+Refresh+Rate%7C120HZ**
If I know that I might need other query parameters in the future, I think it would be more prudent to decode and parse the URL.
String url = URLDecoder.decode("http://www.gold.com/shc/s/c_10153_12605_" +
"Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate" +
"%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true"
,"utf-8");
Pattern amp = Pattern.compile("&");
Pattern eq = Pattern.compile("=");
Map<String, String> params = new HashMap<String, String>();
String queryString = url.substring(url.indexOf('?') + 1);
for(String param : amp.split(queryString)) {
String[] pair = eq.split(param);
params.put(pair[0], pair[1]);
}
for(Entry<String, String> param : params.entrySet()) {
System.out.format("%s = %s\n", param.getKey(), param.getValue());
}
Output
subCatView = true
viewItems = 25
sName = View All
filter = Screen Refresh Rate|120HZ^Screen Size|37 in. to 42 in.
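A variation on the same idea, offered as a sketch rather than a drop-in replacement: it splits the raw query string on & first and decodes each key and value afterwards, so an encoded %26 inside a value is not mistaken for a separator, and it tolerates parameters that have no = at all. The names QueryParams and parse are made up for this example, and URL fragments (#...) are not handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParams {
    // Parses the query part of url into a map that keeps parameter order.
    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = url.indexOf('?');
        if (q < 0) {
            return params; // no query string
        }
        for (String param : url.substring(q + 1).split("&")) {
            if (param.isEmpty()) {
                continue;
            }
            // Limit 2: any further '=' stays inside the value.
            String[] pair = param.split("=", 2);
            String value = pair.length > 1 ? decode(pair[1]) : "";
            params.put(decode(pair[0]), value);
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String url = "http://www.gold.com/shc/s/c_10153_12605_"
                + "Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate"
                + "%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true";
        parse(url).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

A LinkedHashMap is used here so the parameters print in the order they appear in the URL; a HashMap gives no such guarantee.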
In your example, there is sometimes a "**" at the end before the "&". But basically (assuming "filter=" is the start pattern you are looking for), you want something like:
"filter=([^&]+)&"
Using the regular expression (?<=filter=\*{0,2})[^&]*[^&*]+ in Java:
Pattern p = Pattern.compile("(?<=filter=\\*{0,2})[^&]*[^&*]+");
String s = "1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All**&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ**&sName=View+All&subCatView=true 0 2819357575609397706";
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group());
}
EDIT:
Added [^&*]+ to the end of the regex to prevent the ** from being included in the second match.
EDIT2:
Changed regular expression to use lookbehind.
The regex you're looking for is
Screen\+Refresh\+Rate[^&]*
You could use Matcher.find() to find all matches.
Are you looking for a string that follows "filter=", skips any leading "*", and ends at the first "&"?
You can try the following:
String str = "1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All**&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ**&sName=View+All&subCatView=true 0 2819357575609397706";
Pattern p = Pattern.compile("filter=(?:\\**)([^&]+?)(?:\\**)&");
Matcher matcher = p.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
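For comparison, the simpler pattern filter=([^&]+)& suggested in an earlier answer can be wrapped in the same kind of find() loop. This sketch assumes an input line without the asterisks; the sample line is a shortened, hypothetical version of the one in the question, and FilterValues and extract are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FilterValues {
    // Group 1 is everything between "filter=" and the next '&'.
    private static final Pattern FILTER = Pattern.compile("filter=([^&]+)&");

    // Returns the filter value of every URL found in the line.
    static List<String> extract(String line) {
        List<String> values = new ArrayList<>();
        Matcher m = FILTER.matcher(line);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical sample with two URLs on one line.
        String line = "1234567 http://www.gold.com/a?filter=Screen+Refresh+Rate%7C120HZ"
                + "%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25 "
                + "ISx20070515x00001a http://www.gold.com/b?filter=Screen+Refresh+Rate"
                + "%7C120HZ&sName=View+All 0";
        for (String value : extract(line)) {
            System.out.println(value);
        }
    }
}
```

Note that this pattern requires an & after the value, so a filter parameter that is the last one in its query string would be missed.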
