get the pattern before a given word - java

Below is my text:
12,7 C84921797-6 Provisoirement, 848,80 smth
I want to extract the value 848,80 with the float pattern [-+]?[0-9]*\\,?[0-9]+
but the code I am using extracts only the first value matching the pattern, which is 12,7.
This is my method:
String display(String pattern, String result) {
    String value = null;
    Pattern p = Pattern.compile(pattern); // compiles the pattern
    Matcher matcher = p.matcher(result);  // creates a matcher for the input
    if (matcher.find()) {
        // get the first value found corresponding to the pattern
        value = matcher.group(0);
    }
    return value;
}
When I call this method:
String val = display("[-+]?[0-9]*\\,?[0-9]+", " 12,7 C84921797-6 Provisoirement, 848,80 smth");
System.out.println("val---" + val);
OUTPUT:
val---12,7
I want to use the word smth that follows the value to extract the correct value. How can I proceed?

You can add smth to your regex after the part you are interested in. Place that part in parentheses to create a capturing group, then retrieve what the group matched via the Matcher's group(int) method, like this:
Pattern p = Pattern.compile("([-+]?[0-9]*\\,?[0-9]+)\\s+smth");
Matcher matcher = p.matcher(result);
if(matcher.find())
{
value = matcher.group(1); //get the first value found corresponding to the pattern
}
Another method is to use a look-ahead to test whether smth follows the part you are interested in. Your regex could then look like this:
Pattern p = Pattern.compile("[-+]?[0-9]*\\,?[0-9]+(?=\\s+smth)");
Because a look-ahead is zero-length, it is not included in the match, so you can use group(0), or the simpler group(), from Matcher to get the result you want.
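To compare the two approaches side by side, here is a minimal sketch. The class name ValueBeforeWord and the helper names byGroup and byLookahead are made up for illustration; the patterns are the ones from this answer (the backslash before the comma is dropped, since a comma needs no escaping).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class ValueBeforeWord {
    // Variant 1: capture the number in group 1 and consume the trailing word.
    private static final Pattern WITH_GROUP =
            Pattern.compile("([-+]?[0-9]*,?[0-9]+)\\s+smth");
    // Variant 2: assert the trailing word with a look-ahead, so it stays out of the match.
    private static final Pattern WITH_LOOKAHEAD =
            Pattern.compile("[-+]?[0-9]*,?[0-9]+(?=\\s+smth)");

    // Returns the number in front of "smth", or null if there is none.
    static String byGroup(String text) {
        Matcher m = WITH_GROUP.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Same result, but the whole match is already the number.
    static String byLookahead(String text) {
        Matcher m = WITH_LOOKAHEAD.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "12,7 C84921797-6 Provisoirement, 848,80 smth";
        System.out.println(byGroup(text));     // prints 848,80
        System.out.println(byLookahead(text)); // prints 848,80
    }
}
```

Both variants skip 12,7 because nothing after it satisfies the "whitespace then smth" condition, so find() keeps scanning until it reaches 848,80.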

([\\d\\,]+) smth
With this, group 1 ($1) holds the float number you wanted.

If you always have smth (note one whitespace) after your number representation, try this:
String input = "12,7 C84921797-6 Provisoirement, 848,80 smth";
// [-+]?        optional sign
// \\d+         first part of the number
// (,\\d+)*     optional comma followed by more digits
// (?=\\ssmth)  lookahead for " smth"
Pattern p = Pattern.compile("[-+]?\\d+(,\\d+)*(?=\\ssmth)");
Matcher m = p.matcher(input);
if (m.find()) {
    System.out.println("Found --> " + m.group());
}
Output
Found --> 848,80

Short and simple:
Pattern p = Pattern.compile("\\s+\\d+,\\d+");
http://fiddle.re/n17np

Related

Regular Expression in Java. Splitting a string using pattern and matcher

I am trying to get all the matching groups in my string.
My regular expression is "(?<!')/|/(?!')". I am trying to split the string using a regular expression with Pattern and Matcher. The string needs to be split on /, but a / surrounded by single quotes ('/') must be skipped. For example, "One/Two/Three'/'3/Four" needs to be split as ["One", "Two", "Three'/'3", "Four"], but without using the .split method.
I am currently using the code below:
// String to be scanned to find the pattern.
String line = "Test1/Test2/Tt";
String pattern = "(?<!')/|/(?!')";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.matches()) {
System.out.println("Found value: " + m.group(0) );
} else {
System.out.println("NO MATCH");
}
But it always says "NO MATCH". What am I doing wrong, and how do I fix it?
Thanks in advance
To get the matches without using split, you might use
[^'/]+(?:'/'[^'/]*)*
Explanation
[^'/]+ Match 1+ times any char except ' or /
(?: Non capture group
'/'[^'/]* Match '/' followed by optionally matching any char except ' or /
)* Close group and optionally repeat it
String regex = "[^'/]+(?:'/'[^'/]*)*";
String string = "One/Two/Three'/'3/Four";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(0));
}
Output
One
Two
Three'/'3
Four
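As a rough sketch of why the original code printed "NO MATCH" and how the find() loop above can be wrapped into a reusable method: matches() succeeds only when the entire input is one match, whereas find() looks for the next match anywhere in the input. The class name QuoteAwareTokens and the method name tokens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class QuoteAwareTokens {
    // Token pattern from the answer: text without ' or /, optionally followed by '/' sequences.
    private static final Pattern TOKEN = Pattern.compile("[^'/]+(?:'/'[^'/]*)*");

    // Collects every token; each find() call advances to the next match.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "One/Two/Three'/'3/Four";
        // The separator pattern can never cover the whole line, so matches() is false.
        System.out.println(Pattern.compile("(?<!')/|/(?!')").matcher(line).matches()); // prints false
        System.out.println(tokens(line)); // prints [One, Two, Three'/'3, Four]
    }
}
```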
Edit
If you do not want to split, you might also use a pattern whose matches may contain a / only when it is surrounded by single quotes:
[^/]+(?:(?<=')/(?=')[^/]*)*
Try this.
String line = "One/Two/Three'/'3/Four";
Pattern pattern = Pattern.compile("('/'|[^/])+");
Matcher m = pattern.matcher(line);
while (m.find())
System.out.println(m.group());
output:
One
Two
Three'/'3
Four
Here is a simple pattern matching all the desired / characters, so you can split by them:
(?<=[^'])\/(?=')|(?<=')\/(?=[^'])|(?<=[^'])\/(?=[^'])
The logic is as follows. There are 4 cases:
1. / is surrounded by ', i.e. '/'
2. / is preceded by ', i.e. '/
3. / is followed by ', i.e. /'
4. / is surrounded by characters other than '
You want to exclude only case 1, so we need a regex for the other three cases. I have written three similar regexes and combined them with alternation.
Explanation of the first part (the other two are analogous):
(?<=[^']) - positive lookbehind, asserts that what precedes is different from ' (negated character class [^'])
\/ - matches / literally
(?=') - positive lookahead, asserts that what follows is '
Demo with some more edge cases
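For completeness, a small sketch of how this separator pattern might be used from Java. It relies on String.split, which the question wanted to avoid, so treat it only as a way to check the pattern. The class name SplitOutsideQuotes is hypothetical, and the / is written without a backslash because it needs no escaping in a Java regex.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class SplitOutsideQuotes {
    // The three alternatives from the answer, written as a Java string literal.
    private static final String SEPARATOR =
            "(?<=[^'])/(?=')|(?<=')/(?=[^'])|(?<=[^'])/(?=[^'])";

    // Splits on every / that is not wrapped in single quotes on both sides.
    static List<String> split(String input) {
        return Arrays.asList(input.split(SEPARATOR));
    }

    public static void main(String[] args) {
        System.out.println(split("One/Two/Three'/'3/Four")); // prints [One, Two, Three'/'3, Four]
        System.out.println(split("a'/b/'c"));                // prints [a', b, 'c]
    }
}
```

Note that each alternative needs a character on both sides of the /, so a / at the very start or end of the input is not treated as a separator by this pattern.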
Try something like this:
String line = "One/Two/Three'/'3/Four";
String pattern = "([^/]+'/'\\d)|[^/]+";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
boolean found = false;
while(m.find()) {
System.out.println("Found value: " + m.group() );
found = true;
}
if(!found) {
System.out.println("NO MATCH");
}
Output:
Found value: One
Found value: Two
Found value: Three'/'3
Found value: Four

A sample regular expression

I have a sample content string repeated in a file, and I want to retrieve its double value. The string content is "(AIC)|234.654 |", from which I want to retrieve 234.654. The "(AIC)|" part is always fixed, but the number changes between occurrences, so I am using the regular expression below. However, it reports no match. Any help would be appreciated.
String contents="(AIC)|234.654 |";
Pattern p = Pattern.compile("AIC\\u0029{1}\\u007C{1}\\d+u002E{1}\\d+");
Matcher m = p.matcher(contents);
boolean b = m.find();
String t=m.group();
The above expression doesn't find any match and throws an exception.
Thanks for any help
Your code has several typos. Besides those, you say you need only the number, but you are referring to the whole match with .group(). You need to add a capturing group around the number and access it with .group(1).
Here is a fixed code:
String content="(AIC)|234.654 |";
Pattern p = Pattern.compile("AIC\\)\\|(\\d+\\.\\d+)");
Matcher m = p.matcher(content);
if (m.find())
{
System.out.println(m.group(1));
}
See IDEONE demo
If the number can be integer, just use an optional non-capturing group around the decimal part: Pattern.compile("AIC\\)\\|(\\d+(?:\\.\\d+)?)");
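If the goal is an actual double rather than a string, the captured group can be passed to Double.parseDouble. The following is only a sketch under that assumption; the class name AicValue and the method name parse are invented for the example, and the pattern is the optional-decimal variant shown above.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class AicValue {
    // Group 1 captures an integer or decimal number directly after "AIC)|".
    private static final Pattern AIC = Pattern.compile("AIC\\)\\|(\\d+(?:\\.\\d+)?)");

    // Returns the number after "(AIC)|", or an empty OptionalDouble if the marker is absent.
    static OptionalDouble parse(String content) {
        Matcher m = AIC.matcher(content);
        if (!m.find()) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(Double.parseDouble(m.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(parse("(AIC)|234.654 |").getAsDouble());
        System.out.println(parse("(AIC)|42 |").getAsDouble());
        System.out.println(parse("(BIC)|1.5 |").isPresent());
    }
}
```

Checking find() before calling group(1) avoids the IllegalStateException that the original code ran into when nothing matched.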
I think this regex should do the work:
(?<=\|)[\d\.]*(?=\s*\|)
It will only match digits and dots after a | and before an optional space and another |
And the complete code:
String content="(AIC)|234.654 |";
Pattern p = Pattern.compile("(?<=\\|)[\\d\\.]*(?=\\s*\\|)");
Matcher m = p.matcher(content);
boolean b = m.find();
String t=m.group();

find the start of a matched region (java regex)

Suppose the string I am interested is similar to these num3.a, num4.b, etc.
(but I don't want it to match these foo.num3.a, whatever.num2.b)
I have this regex to match them: Pattern p = Pattern.compile("[^\\.]\\bnum(\\d*)(?=\\.)");
Given this input string : (num3.a)
Matcher m = p.matcher("(num3.a)");
if (m.find())
System.out.println(m.start()); // This would print 0 rather than 1 WHY?
How do I change the code so it prints 1 instead? (because 1 is the index of n, which is the start of my interested pattern)
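If you're interested in num3.a, you should expand your group. The parentheses define a group, which can be used to address a part of your match.
[^\\.]\\b(num\\d*)(?=\\.)
Then you can access the group with start(1) and end(1); start(0) and end(0) refer to the whole match, which with this pattern still includes the character consumed by [^\\.]. In the following example the group spans the whole match, so start(0) and end(0) give the same result: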
Pattern p = Pattern.compile("\\b(num\\d*\\.a)");
String input = "fffffffffffff(num3.a)fffffffffffffffffsdfsdf";
Matcher m = p.matcher(input);
if (m.find())
{
System.out.println(m.start(0));
System.out.println(input.substring(m.start(0), m.end(0)));
}
will output
14
num3.a
The method Matcher#regionStart()
Reports the start index of this matcher's region.
This doesn't indicate the start of the match, only the start of the region that is checked for a match.
Use find() and start() to find the start of a match.
Now you changed your pattern. [^\\.] matches any character that is not a dot. A ( is not a dot, so it is matched and becomes part of the match. The ( is at index 0 in the given String, which is why start() returns 0.
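To make the difference concrete, here is a small sketch with two ways to get index 1 for "(num3.a)": reading the start of a capturing group, or replacing [^\\.] with a negative lookbehind, which checks the preceding character without consuming it. The lookbehind variant is my own suggestion rather than something from the answers above, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class MatchStart {
    // [^\\.] consumes one character, so the overall match starts one position early.
    private static final Pattern CONSUMING = Pattern.compile("[^\\.]\\b(num\\d*)(?=\\.)");
    // (?<!\\.) only asserts that the previous character is not a dot; it consumes nothing.
    private static final Pattern LOOKBEHIND = Pattern.compile("(?<!\\.)\\bnum\\d*(?=\\.)");

    // Start index of group 1, or -1 if there is no match.
    static int startOfGroup(String input) {
        Matcher m = CONSUMING.matcher(input);
        return m.find() ? m.start(1) : -1;
    }

    // Start index of the whole match, or -1 if there is no match.
    static int startWithLookbehind(String input) {
        Matcher m = LOOKBEHIND.matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(startOfGroup("(num3.a)"));          // prints 1
        System.out.println(startWithLookbehind("(num3.a)"));   // prints 1
        System.out.println(startWithLookbehind("foo.num3.a")); // prints -1
        System.out.println(startWithLookbehind("num3.a"));     // prints 0
    }
}
```

One practical difference: the consuming pattern needs a character in front of num, so it cannot match when num3.a sits at the very start of the input, while the lookbehind variant can.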
Pattern p = Pattern.compile("\\d+,\\d*");
String inpu = "Hotel Class : 5,106936 ";
Matcher m = p.matcher(inpu);
if (m.find())
{
System.out.println(inpu.substring(m.start(0), m.end(0)));
}
The output is "5,106936"

How do I build a regex to match these `long` values?

How do I build a regular expression for a long data type in Java? I currently have a regular expression for 3 double values as my pattern:
String pattern = "(max=[0-9]+\\.?[0-9]*) *(total=[0-9]+\\.?[0-9]*) *(free=[0-9]+\\.?[0-9]*)";
I am constructing the pattern using the line:
Pattern a = Pattern.compile("control.avgo:", Pattern.CASE_INSENSITIVE);
I want to match the numbers following the equals signs in the example text below, from the file control.avgo.
max=259522560, total=39325696, free=17979640
What do I need to do to correct my code to match them?
Could it be that you actually need
Pattern a = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
instead of
Pattern a = Pattern.compile("control.avgo:", Pattern.CASE_INSENSITIVE);
because your current code uses "control.avgo:" as the regex, and not the pattern you have defined.
You need to address several errors, including:
Your pattern specifies real numbers, but your question asks for long integers.
Your pattern omits the commas in the string being searched.
The first argument to Pattern.compile() is the regular expression, not the string being searched.
This will work:
String sPattern = "max=([0-9]+), total=([0-9]+), free=([0-9]+)";
Pattern pattern = Pattern.compile( sPattern, Pattern.CASE_INSENSITIVE );
String source = "control.avgo: max=259522560, total=39325696, free=17979640";
Matcher matcher = pattern.matcher( source );
if ( matcher.find()) {
System.out.println("max=" + matcher.group(1));
System.out.println("total=" + matcher.group(2));
System.out.println("free=" + matcher.group(3));
}
If you want to convert the numbers you find to a numeric type, use Long.valueOf( String ).
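A minimal sketch of that conversion step, assuming the same line format as above. It uses Long.parseLong, which returns a primitive long (Long.valueOf returns a boxed Long). The class name MemoryStats and the method name parse are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class MemoryStats {
    private static final Pattern STATS = Pattern.compile(
            "max=([0-9]+), total=([0-9]+), free=([0-9]+)", Pattern.CASE_INSENSITIVE);

    // Returns {max, total, free}; throws IllegalArgumentException if the line does not match.
    static long[] parse(String line) {
        Matcher m = STATS.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no max/total/free values in: " + line);
        }
        // Long.parseLong throws NumberFormatException if the digits exceed the long range.
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        long[] stats = parse("control.avgo: max=259522560, total=39325696, free=17979640");
        System.out.println(stats[0] - stats[1]); // arithmetic now works on real numbers
    }
}
```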
In case you only need to find any number preceded by "=":
String test = "3.control.avgo: max=259522560, total=39325696, free=17979640";
// looks for the "=" sign preceding any numerical sequence of any length
Pattern pattern = Pattern.compile("(?<=\\=)\\d+");
Matcher matcher = pattern.matcher(test);
// keeps on searching until cannot find anymore
while (matcher.find()) {
// prints out whatever found
System.out.println(matcher.group());
}
Output:
259522560
39325696
17979640

Need help in Pattern matching in java

I have input string as
String str = "IN Param - {Parameter|String}{Parameter|String} Out Param - {Parameter Label|String}{Parameter Label2|String}";
I should be able to get
{Parameter|String}{Parameter|String}
from In Param and
{Parameter Label|String}{Parameter Label2|String}
from Out Param.
And again, within In Param, I should be able to get Parameter and String. How is this possible with regular expression matching in Java?
It is possible through groups
So the regex is:
"\\{(.*?)\\|(.*?)\\}"
Group1 captures Parameter
Group2 captures String
In this regex, {(.*?)| matches zero or more characters that begin after { and end before |, and stores the result in group 1, excluding the { and | themselves. The same happens with |(.*?)}, which stores its result in group 2.
try it here
Pattern p = Pattern.compile("\\{([^|]+)\\|([^}]+)\\}");
Matcher m = p.matcher(str);
while (m.find()) {
String label = m.group(1);
String value = m.group(2);
// do what you need with them
}
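The loop above extracts the pairs but does not separate the In Param part from the Out Param part. One possible way to do both, sketched under the assumption that the input always has the shape "IN Param - ... Out Param - ...", is a two-step match. The class name ParamParser, the method names and the SECTIONS pattern are my own additions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class ParamParser {
    // Step 1: group 1 is the In Param block, group 2 is the Out Param block.
    private static final Pattern SECTIONS =
            Pattern.compile("IN Param - (.*?)\\s*Out Param - (.*)");
    // Step 2: one {label|type} pair, as in the answer above.
    private static final Pattern PAIR = Pattern.compile("\\{([^|]+)\\|([^}]+)\\}");

    // Returns {inBlock, outBlock}, or null if the input has a different shape.
    static String[] sections(String input) {
        Matcher m = SECTIONS.matcher(input);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    // Returns every {label, type} pair found in one block.
    static List<String[]> pairs(String block) {
        List<String[]> result = new ArrayList<>();
        Matcher m = PAIR.matcher(block);
        while (m.find()) {
            result.add(new String[] { m.group(1), m.group(2) });
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "IN Param - {Parameter|String}{Parameter|String} "
                + "Out Param - {Parameter Label|String}{Parameter Label2|String}";
        String[] blocks = sections(str);
        System.out.println(blocks[0]); // prints {Parameter|String}{Parameter|String}
        for (String[] pair : pairs(blocks[1])) {
            System.out.println(pair[0] + " : " + pair[1]);
        }
    }
}
```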
