Using regex get particular value - java

I have a response like below
2020 Aug 05 09:31:25.646515 arrisxg1v4 WPEWebProcess[22024]: [AAMP-PLAYER]NotifyBitRateChangeEvent :: bitrate:2800000 desc:BitrateChanged - Network adaptation width:1280 height:720 fps:25.000000 position:256.000000
From this using regex how I can retrieve position value only as Integer. Using java code I can get it using .split method But How I can get this value 256 using Regex?

Try this one-liner:
String positionStr = str.replaceAll("(?:(?!position:).)*(?:position:(\\d+))?.*", "$1");
Integer position = positionStr.isEmpty() ? null : new Integer(positionStr);
This regex matches the whole string, capturing the target position value in group 1 ((?:...) is a non-capturing group), and replaces the match (ie everything) with the captured group. This effectively deletes everything you don't want.
Conveniently, because the capture is optional (has a quantifier of ?), if the input does not have a position: value, the result is a blank string.
The negative lookahead (?!position:). prevents the dot running past our target. Without the negative lookahead, the first dot would consume the entire input.

You could try something like this:
Pattern pattern = Pattern.compile(".*position:(\\d+(\\.\\d+))");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
String position = matcher.group(1);
}
Basically what this expression says is:
Accept anything up until the word position
Accept a colon :
Accept 1 or more digits (0-9), optionally followed by a dot and 1 or more digits
Then by using matcher.group you take everything between parentheses starting at the 1st parenthesis from the left.

Related

can deal with the first line space when i use regex for polynomials

here is my code
String a = "X^5+2X^2+3X^3+4X^4";
String exp[]=a.split("(|\\+\\d)[xX]\\^");
for(int i=0;i<exp.length;i++) {
System.out.println("exp: "+exp[i]+" ");
}
im try to find the output which is 5,2,3,4
but instead i got this answer
exp:
exp:5
exp:2
exp:3
exp:4
i dont know where is the first line space come from, and i cannot find a will to get rid of that, i try to use others regex for this and also use compile,still can get rid of the first line, i try to use new string "X+X^5+2X^2+3X^3+4X^4";the first line shows exp:X.
and i also use online regex compiler to try my problem, but their answer is 5,2,3,4, buy eclipse give a space ,and then 5,2,3,4 ,need a help to figure this out
Try to use regex, e.g:
String input = "X^5+2X^2+3X^3+4X^4";
Pattern pattern = Pattern.compile("\\^([0-9]+)");
Matcher matcher = pattern.matcher(input);
for (int i = 1; matcher.find(); i++) {
System.out.println("exp: " + matcher.group(1));
}
It gives output:
exp: 5
exp: 2
exp: 3
exp: 4
How does it work:
Pattern used: \^([0-9]+)
Which matches any strings starting with ^ followed by 1 or more digits (note the + sign). Dash (^) is prefixed with backslash (\) because it has a special meaning in regular expressions - beginning of a string - but in Your case You just want an exact match of a ^ character.
We want to wrap our matches in a groups to refer to them late during matching process. It means we need to mark them using parenthesis ( and ).
Then we want to pu our pattern into Java String. In String literal, \character has a special meaning - it is used as a control character, eg "\n" represents a new line. It means that if we put our pattern into String literal, we need to escape a \ so our pattern becomes: "\\^([0-9]+)". Note double \.
Next we iterate through all matches getting group 1 which is our number match. Note that a ^.character is not covered in our match even if it is a part of our pattern. It is so because wr used parenthesis to mark our searched group, which in our case are only digits
Because you are using the split method which looks for the occurrence of the regex and, well.. splits the string at this position. Your string starts with X^ so it very much matches your regex.

Regex does not work will tailing quote

I have regex to remove the match the string that MAY start and end with quotes. So I created a regex to do this.
String str = "#TEST_ENV_TEST_VAR=\"value\"";
Pattern p = Pattern.compile("#TEST_ENV_(.*)=\"?(.*)\"?");
Matcher matcher = p.matcher(str);
matcher.find()
String key = matcher.group(2);
But when I check the key and string is value". It should be value right because we have added ? at the end.
I try using []? regex and also try with * but none work.
You do not need the last ? because it stops the greediness of (.*) and stops at the first "
#TEST_ENV_(.*)=\"?(.*)\"
Demo
Otherwise, if the goal is only to match the string between quotes, you could simply use positive lookaheads and lookbehinds
(?<=\").*(?=\")
Demo
This one will match your stuff correctly:
#TEST_ENV_([^=]*)=(\"?)([^\"]*)\2
This captures:
your #TEST_ENV_ string
anything not an equal sign, in the first group
the equal sign itself
an optional quote, as group #2
the contents of the quote, defined as "anything not a quote"
A back-reference to group 2, meaning it requires a bracket if the front has one.
Of course, the data will be in group #1 and group #3, since #2 is used for the brackets.
Test it yourself.

Regex to find all possible occurrences of text starting and ending with ~

I would like to find all possible occurrences of text enclosed between two ~s.
For example: For the text ~*_abc~xyz~ ~123~, I want the following expressions as matching patterns:
~*_abc~
~xyz~
~123~
Note it can be an alphabet or a digit.
I tried with the regex ~[\w]+?~ but it is not giving me ~xyz~. I want ~ to be reconsidered. But I don't want just ~~ as a possible match.
Use capturing inside a positive lookahead with the following regex:
Sometimes, you need several matches within the same word. For instance, suppose that from a string such as ABCD you want to extract ABCD, BCD, CD and D. You can do it with this single regex:
(?=(\w+))
At the first position in the string (before the A), the engine starts the first match attempt. The lookahead asserts that what immediately follows the current position is one or more word characters, and captures these characters to Group 1. The lookahead succeeds, and so does the match attempt. Since the pattern didn't match any actual characters (the lookahead only looks), the engine returns a zero-width match (the empty string). It also returns what was captured by Group 1: ABCD
The engine then moves to the next position in the string and starts the next match attempt. Again, the lookahead asserts that what immediately follows that position is word characters, and captures these characters to Group 1. The match succeeds, and Group 1 contains BCD.
The engine moves to the next position in the string, and the process repeats itself for CD then D.
So, use
(?=(~[^\s~]+~))
See the regex demo
The pattern (?=(~[^\s~]+~)) checks each position inside a string and searches for ~ followed with 1+ characters other than whitespace and ~ and then followed with another ~. Since the index is moved only after a position is checked, and not when the value is captured, overlapping substrings get extracted.
Java demo:
String text = " ~*_abc~xyz~ ~123~";
Pattern p = Pattern.compile("(?=(~[^\\s~]+~))");
Matcher m = p.matcher(text);
List<String> res = new ArrayList<>();
while(m.find()) {
res.add(m.group(1));
}
System.out.println(res); // => [~*_abc~, ~xyz~, ~123~]
Just in case someone needs a Python demo:
import re
p = re.compile(r'(?=(~[^\s~]+~))')
test_str = " ~*_abc~xyz~ ~123~"
print(p.findall(test_str))
# => ['~*_abc~', '~xyz~', '~123~']
Try this [^~\s]*
This pattern doesn't consider the characters ~ and space (refered to as \s).
I have tested it, it works on your string, here's the demo.

Match regex but only replace the first section - Java

I'm trying to take a phone number which can be in the format either +44 or +4 followed by any number of digits or hyphens, and replace the +44 or +4 with +44 or +4 followed by a space.
I believe I need a look around to match the full number but only replace the initial prefix, what I'm trying atm is
^[+]\d[0-9](?:([0-9]+))?
which matches the number (without hyphens) however I thought the lookahead would only match the number and not capture the extra digits however it seems to capture the whole thing.
Can anyone point me in the right direction as to what I've done wrong?
EDIT:
To be clearer my Java code is
Pattern pattern = Pattern.compile("^[+]\\d[0-9](?:([0-9]+))?");
if(pattern.matcher("+441234567890").matches())
String num = pattern.matcher(title).replaceFirst("$0 $1");
Thanks.
If you want to match whole number, but replace only part of it, you should not use positive lookahead, but just gruping, like in:
(^\+\d\d)([\d-]+)?
prefix will be in group 1, and the rest of number in group 2, so to add a space between these parts, just use something like group1 + space + group2.
In your example it should look like this:
Pattern pattern = Pattern.compile("(^\\+\\d\\d)([\\d-]+)?");
if(pattern.matcher("+441234567890").matches()) {
num = pattern.matcher(title).replaceFirst("$1 $2");
}
However this regex will always capture two digits in prefix, if you want to match +44 or +4 you should use:
(^\+(44|4))([\d-]+)?
so if you have more possible prefixes, you need to change this regex also.
You regex didn't work as you expected because (?:([0-9]+))? is a non capturing group, so the fragment matched by this part of regex was not captured, but it was still matched by whole regex. So $0 returned whole regex, and $1 should not return anything.

Java - Extract strings with Regex

I've this string
String myString ="A~BC~FGH~~zuzy|XX~ 1234~ ~~ABC~01/01/2010 06:30~BCD~01/01/2011 07:45";
and I need to extract these 3 substrings
1234
06:30
07:45
If I use this regex \\d{2}\:\\d{2} I'm only able to extract the first hour 06:30
Pattern depArrHours = Pattern.compile("\\d{2}\\:\\d{2}");
Matcher matcher = depArrHours.matcher(myString);
String firstHour = matcher.group(0);
String secondHour = matcher.group(1); (IndexOutOfBoundException no Group 1)
matcher.group(1) throws an exception.
Also I don't know how to extract 1234. This string can change but it always comes after 'XX~ '
Do you have any idea on how to match these strings with regex expressions?
UPDATE
Thanks to Adam suggestion I've now this regex that match my string
Pattern p = Pattern.compile(".*XX~ (\\d{3,4}).*(\\d{1,2}:\\d{2}).*(\\d{1,2}:\\d{2})";
I match the number, and the 2 hours with matcher.group(1); matcher.group(2); matcher.group(3);
The matcher.group() function expects to take a single integer argument: The capturing group index, starting from 1. The index 0 is special, which means "the entire match". A capturing group is created using a pair of parenthesis "(...)". Anything within the parenthesis is captures. Groups are numbered from left to right (again, starting from 1), by opening parenthesis (which means that groups can overlap). Since there are no parenthesis in your regular expression, there can be no group 1.
The javadoc on the Pattern class covers the regular expression syntax.
If you are looking for a pattern that might recur some number of times, you can use Matcher.find() repeatedly until it returns false. Matcher.group(0) once on each iteration will then return what matched that time.
If you want to build one big regular expression that matches everything all at once (which I believe is what you want) then around each of the three sets of things that you want to capture, put a set of capturing parenthesis, use Matcher.match() and then Matcher.group(n) where n is 1, 2 and 3 respectively. Of course Matcher.match() might also return false, in which case the pattern did not match, and you can't retrieve any of the groups.
In your example, what you probably want to do is have it match some preceding text, then start a capturing group, match for digits, end the capturing group, etc...I don't know enough about your exact input format, but here is an example.
Lets say I had strings of the form:
Eat 12 carrots at 12:30
Take 3 pills at 01:15
And I wanted to extract the quantity and times. My regular expression would look something like:
"\w+ (\d+) [\w ]+ (\d{1,2}:\d{2})"
The code would look something like:
Pattern p = Pattern.compile("\\w+ (\\d+) [\\w ]+ (\\d{2}:\\d{2})");
Matcher m = p.matcher(oneline);
if(m.matches()) {
System.out.println("The quantity is " + m.group(1));
System.out.println("The time is " + m.group(2));
}
The regular expression means "a string containing a word, a space, one or more digits (which are captured in group 1), a space, a set of words and spaces ending with a space, followed by a time (captured in group 2, and the time assumes that hour is always 0-padded out to 2 digits). I would give a closer example to what you are looking for, but the description of the possible input is a little vague.

Categories