regex find string between 2 characters, separated by comma - java

I am new to regular expressions and I want to find a string between two characters.
I tried the code below, but it never finds a match. May I know what's wrong with it?
public static void main(String[] args) {
String input = "myFunction(hello ,world, test)";
String patternString = "\\(([^]]+)\\)";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
Input:
myFunction(hello,world,test), where myFunction can be any characters, and any characters may come before the opening (.
Output:
hello
world
test

You could make use of the \G anchor, which asserts the position at the end of the previous match, and capture your values in a group:
(?:\bmyFunction\(|\G(?!^))([^,]+)(?:\h*,\h*)?(?=[^)]*\))
In Java:
String regex = "(?:\\bmyFunction\\(|\\G(?!^))([^,]+)(?:\\h*,\\h*)?(?=[^)]*\\))";
Explanation
(?: Non capturing group
\bmyFunction\( Word boundary to prevent the match being part of a larger word, then match myFunction and an opening parenthesis (
| Or
\G(?!^) Assert position at the end of previous match, not at the start of the string
) Close non capturing group
([^,]+) Capture group 1, match 1+ times any char except a comma
(?:\h*,\h*)? Optionally match a comma surrounded by 0+ horizontal whitespace chars
(?=[^)]*\)) Positive lookahead, assert what is on the right is a closing parenthesis )
For example:
String patternString = "(?:\\bmyFunction\\(|\\G(?!^))([^,]+)(?:\\h*,\\h*)?(?=[^)]*\\))";
String input = "myFunction(hello ,world, test)";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Result
hello
world
test

I'd suggest doing this in two steps:
Step 1: Capture all the content between ( and )
Use the regex: ^\S+\((.*)\)$
The first and the only capturing group will contain the required text.
Step 2: Split the captured string on , to get each comma-separated parameter on its own.
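The two steps can be sketched in Java roughly as follows. The class and method names (TwoStepSplit, extractArgs) are made up for illustration, and trimming the whitespace around each comma is an added assumption, since the sample input contains "hello ,world, test".

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoStepSplit {
    // Step 1: capture everything between ( and ), as in the regex above.
    private static final Pattern CALL = Pattern.compile("^\\S+\\((.*)\\)$");

    // Hypothetical helper: returns the comma-separated parameters, or an empty array if nothing matches.
    static String[] extractArgs(String input) {
        Matcher m = CALL.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        // Step 2: split on commas; dropping the surrounding whitespace is an assumption.
        return m.group(1).trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        // prints [hello, world, test]
        System.out.println(Arrays.toString(extractArgs("myFunction(hello ,world, test)")));
    }
}
```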

See this, it may give you an idea:
([\w]+),([\w]+),([\w]+)
DEMO: https://rubular.com/r/9HDIwBTacxTy2O

Related

Extract sub-string from given String value using regex

I have a requirement where a string needs to be matched and a further value then extracted from that string.
I will receive a header in the request whose value is a DN from an SSL certificate. I need to match a specific string, 1.2.3.47, in the header and extract the text that follows it.
Sample String passed to method:
O=ABC Bank Plc/1.2.3.47=ABC12-PQR-121878, CN=7ltM2wQ3bqlDJdBEURGAMq, L=INDIA, C=INDIA, E=xyz#gmail.com
Here is my code:
private String extractDN(String dnHeader) {
// declared outside the if blocks so it is in scope for the return
String id = null;
if(!ValidatorUtil.isEmpty(dnHeader)){
String tokens[]=dnHeader.split(",");
if(tokens[0].contains("1.2.3.47")){
int index=tokens[0].lastIndexOf("1.2.3.47");
// skip the 8 characters of "1.2.3.47" plus the "="
id=tokens[0].substring(index+9);
System.out.println(id);
}
}
return id;
}
Can a regex pattern be used here to match and extract value? Is there any better way to achieve this? Please help.

If you want to use a pattern, and you know that the value always starts with a forward slash followed by digits separated by dots and then an equals sign, you could use a capturing group:
/[0-9]+(?:\\.[0-9]+)+=([^,]+)
/ Match /
[0-9]+ Match 1+ digit 0-9
(?: Non capturing group
\\.[0-9]+ match . and 1+ digits 0-9
)+ Close non capturing group and repeat 1+ times
= Match =
([^,]+) Capture group 1, match 1+ times any char except a ,
For example
final String regex = "/[0-9]+(?:\\.[0-9]+)+=([^,]+)";
final String string = "O=ABC Bank Plc/1.2.3.47=ABC12-PQR-121878, CN=7ltM2wQ3bqlDJdBEURGAMq, L=INDIA, C=INDIA, E=xyz#gmail.com";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
ABC12-PQR-121878
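Wrapped as a method like the extractDN from the question, this could look roughly like the sketch below. The class name DnExtractor and the choice to return null when nothing matches are assumptions, and the plain null/empty check stands in for the question's ValidatorUtil, which is not shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DnExtractor {
    // Same pattern as above: "/", a dotted number, "=", then capture up to the next comma.
    private static final Pattern OID_VALUE = Pattern.compile("/[0-9]+(?:\\.[0-9]+)+=([^,]+)");

    // Returns the value after the dotted-number key, or null if there is none (assumed convention).
    static String extractDN(String dnHeader) {
        if (dnHeader == null || dnHeader.isEmpty()) {
            return null;
        }
        Matcher m = OID_VALUE.matcher(dnHeader);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String header = "O=ABC Bank Plc/1.2.3.47=ABC12-PQR-121878, CN=7ltM2wQ3bqlDJdBEURGAMq, L=INDIA, C=INDIA, E=xyz#gmail.com";
        // prints ABC12-PQR-121878
        System.out.println(extractDN(header));
    }
}
```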
If you want a more precise match, you could also specify the start of the pattern:
\\bO=\\w+(?:\\h+\\w+)*/[0-9]+(?:\\.[0-9]+)+=([^,]+)

How to write a regex capture group which matches a character 3 or 4 times before a delimiter?

I'm trying to write a regex that splits elements out according to a delimiter. The regex also needs to ensure there are ideally 4, but at least 3 colons : in each match.
Here's an example string:
"Checkers, etc:Blue::C, Backgammon, I say:Green::Pepsi:P, Chess, misc:White:Coke:Florida:A, :::U"
From this, there should be 4 matches:
Checkers, etc:Blue::C
Backgammon, I say:Green::Pepsi:P
Chess, misc:White:Coke:Florida:A
:::U
Here's what I've tried so far:
([^:]*:[^:]*){3,4}(?:, )
Regex 101 at: https://regex101.com/r/O8iacP/8
I tried setting up a non-capturing group for ,
Then I tried matching a group of any character that's not a :, a :, and any character that's not a : 3 or 4 times.
The code I'm using to iterate over these groups is:
String line = "Checkers, etc:Blue::C, Backgammon, I say::Pepsi:P, Chess:White:Coke:Florida:A, :::U";
String pattern = "([^:]*:[^:]*){3,4}(?:, )";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher matcher = r.matcher(line);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Any help is appreciated!
Edit
Using @Casimir's regex, it's working. I had to change the above code to use group(0) like this:
String line = "Checkers, etc:Blue::C, Backgammon, I say::Pepsi:P, Chess:White:Coke:Florida:A, :::U";
String pattern = "(?![\\s,])(?:[^:]*:){3}\\S*(?![^,])";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher matcher = r.matcher(line);
while (matcher.find()) {
System.out.println(matcher.group(0));
}
Now prints:
Checkers, etc:Blue::C
Backgammon, I say::Pepsi:P
Chess:White:Coke:Florida:A
:::U
Thanks again!

I suggest this pattern:
(?![\\s,])(?:[^:]*:){3}\\S*(?![^,])
The negative lookaheads avoid matching leading or trailing delimiters. The second one in particular forces the match to be followed by the delimiter or the end of the string (that is, not followed by a character that isn't a comma).
Note that the pattern doesn't have capture groups, so the result is the whole match (or group 0).
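As a minimal sketch, the pattern can be applied to the example string from the question and the whole matches collected into a list. The class and method names (ColonRecords, findRecords) are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColonRecords {
    // The pattern suggested above, written as a Java string literal.
    private static final Pattern RECORD = Pattern.compile("(?![\\s,])(?:[^:]*:){3}\\S*(?![^,])");

    // Hypothetical helper: collects every whole match (group 0), since the pattern has no capture groups.
    static List<String> findRecords(String line) {
        List<String> records = new ArrayList<>();
        Matcher m = RECORD.matcher(line);
        while (m.find()) {
            records.add(m.group());
        }
        return records;
    }

    public static void main(String[] args) {
        String line = "Checkers, etc:Blue::C, Backgammon, I say:Green::Pepsi:P, Chess, misc:White:Coke:Florida:A, :::U";
        // prints the four records, one per line
        findRecords(line).forEach(System.out::println);
    }
}
```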

You might use
(?:[^,:]+, )?[^:,]*(?::+[^:,]+)+
(?:[^,:]+, )? Optionally match 1+ any char except a , or : followed by , and space
[^:,]* Match 0+ any char except : or ,
(?: Non Capturing group
:+[^:,]+ Match 1+ : and 1+ times any char except : and ,
)+ Close group and repeat 1+ times

You seem to be making it harder than it needs to be with the lookahead (which won't be satisfied at end-of-line anyway).
([^:]*:){3}[^:,]*:?[^:,]*
Find the first 3 :'s, then start including , in the negative groupings, with an optional 4th :.

JAVA split with regex doesn't work

I have the following String, 46MTS007, and I have to split the numbers from the letters, so the result should be an array like {"46", "MTS", "007"}
String s = "46MTS007";
String[] spl = s.split("\\d+|\\D+");
But spl remains empty. What's wrong with the regex? I've tested it in regex101 and it works as expected (with the global flag).

If you want to use split you can use this lookaround based regex:
(?<=\d)(?=\D)|(?<=\D)(?=\d)
This splits at every position where the previous character is a digit and the next is a non-digit, or where the previous character is a non-digit and the next is a digit.
In Java:
String s = "46MTS007";
String[] spl = s.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
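A small runnable sketch comparing the two calls may help. With \\d+|\\D+ every character of the input is consumed as a delimiter, so only empty strings remain, and split discards trailing empty strings, which leaves an empty array. The lookaround version matches zero-width positions, so nothing is consumed. The class and method names here are placeholders.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits at the zero-width boundaries between digits and non-digits.
    static String[] splitDigitsAndLetters(String s) {
        return s.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
    }

    public static void main(String[] args) {
        String s = "46MTS007";
        // The original attempt: the whole input is delimiter, so nothing is left. Prints 0.
        System.out.println(s.split("\\d+|\\D+").length);
        // The lookaround version keeps every character. Prints [46, MTS, 007].
        System.out.println(Arrays.toString(splitDigitsAndLetters(s)));
    }
}
```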

The regex you're using will not split the string. split() treats the regex you provide as the delimiter, but the regex used here matches the whole string rather than a delimiter between the parts. You can use Pattern and Matcher to find the individual groups in the string instead.
public static void main(String[] args) {
String line = "46MTS007";
String regex = "\\D+|\\d+";
Pattern pattern = Pattern.compile(regex);
Matcher m = pattern.matcher(line);
while(m.find())
System.out.println(m.group());
}
Output:
46
MTS
007
Note: Don't forget to call m.find() after reading each group, otherwise the matcher will not move on to the next one.

Regular expression java to extract the balance from a string

I have a String which contains " Dear user BAL= 1,234/ ".
I want to extract 1,234 from the String using the regular expression. It can be 1,23, 1,2345, 5,213 or 500
final Pattern p=Pattern.compile("((BAL)=*(\\s{1}\\w+))");
final Matcher m = p.matcher(text);
if(m.find())
return m.group(3);
else
return "";
This returns 3.
What regular expression should I make? I am new to regular expressions.

You search in your regex for word characters \w+ but you should search for digits with \d+.
Additionally there is the comma, so you need to match that as well.
I'd use
/.*BAL=\s([\d,]+(?=/)).*/
as pattern and get only the number in the resulting group.
Explanation:
.* match anything before
BAL= match the string "BAL="
\s match a whitespace
( start matching group
[\d,]+ matches any digit or comma one or more times
(?=/) match the former only if followed by a slash
) end matching group
.* matches anything thereafter
This is untested, but it should work like this:
final Pattern p=Pattern.compile(".*BAL=\\s([\\d,]+(?=/)).*");
final Matcher m = p.matcher(text);
if(m.find())
return m.group(1);
else
return "";
According to an online tester, the pattern above matches the text:
BAL= 1,234/
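Put together as a self-contained method, the pattern above can be tried against the sample values from the question. This is only a sketch: the names BalanceExtractor and extractBalance are invented, and returning "" when nothing matches follows the question's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BalanceExtractor {
    // Same pattern as above: digits and commas after "BAL= ", required to be followed by a slash.
    private static final Pattern BALANCE = Pattern.compile(".*BAL=\\s([\\d,]+(?=/)).*");

    // Returns the balance text, or "" when the input does not contain one.
    static String extractBalance(String text) {
        Matcher m = BALANCE.matcher(text);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        // prints 1,234
        System.out.println(extractBalance(" Dear user BAL= 1,234/ "));
        // prints 500
        System.out.println(extractBalance(" Dear user BAL= 500/ "));
    }
}
```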

If it didn't have to be extracted by the regular expression you could simply do:
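// trim first so the leading space does not produce an empty first element,
// then split on any whitespace into a 4-element array
String[] foo = text.trim().split("\\s+");
// the last token is "1,234/", so drop the trailing slash
return foo[3].replace("/", "");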

regex delete leading and trailing punctuation

I am trying to write a regex in Java to get rid of all leading and trailing punctuation characters except for "-" in a String, while keeping the punctuation within words intact.
I tried replacing the punctuation with "" using String regex = "[\\p{Punct}+&&[^-]]";, but that deletes the punctuation within the word too.
I also tried matching with String regex = "[(\\w+\\p{Punct}+\\w+)]"; and Matcher.matches() to capture a group, but it gives me null for the input String word = "#(*&wor(&d#)(".
I am wondering what the right way to handle regex group matching is in this case.
Examples:
Input: #)($&word#)($& Output: word
Input: #)($)word#google.com#)(*$&$ Output: word#google.com

Pattern p = Pattern.compile("^\\p{Punct}*(.*?)\\p{Punct}*$");
Matcher m = p.matcher("#)($)word#google.com#)(*$&$");
if (m.matches()) {
System.out.println(m.group(1));
}
To give some more info: the key is to anchor the regex at the beginning and end of the string (^ and $) and to make the middle part match non-greedily (using *? instead of just *).
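The question also asks to keep "-" at the edges. One possible way to do that, as a sketch, is to combine the anchored, non-greedy pattern above with the character-class intersection the asker already tried, [\\p{Punct}&&[^-]]. The class and method names below are invented for illustration, and returning the input unchanged when the pattern does not match (for example when it contains a line break) is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PunctStripper {
    // Leading and trailing runs of punctuation other than "-", with a non-greedy middle part.
    private static final Pattern EDGES =
            Pattern.compile("^[\\p{Punct}&&[^-]]*(.*?)[\\p{Punct}&&[^-]]*$");

    // Hypothetical helper: strips edge punctuation but keeps hyphens and inner punctuation.
    static String stripEdgePunctuation(String word) {
        Matcher m = EDGES.matcher(word);
        return m.matches() ? m.group(1) : word;
    }

    public static void main(String[] args) {
        // prints word
        System.out.println(stripEdgePunctuation("#)($&word#)($&"));
        // prints word#google.com
        System.out.println(stripEdgePunctuation("#)($)word#google.com#)(*$&$"));
        // prints -word-
        System.out.println(stripEdgePunctuation("!-word-!"));
    }
}
```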
