RegEx for capturing special chars - java

I am trying to replace a string using a regular expression. Basically, I need to convert a compound assignment like:
k*=i
into
k=k+i
In my example:
jregex.Pattern p=new jregex.Pattern("([a-z]|[A-Z])([a-z]|[A-Z]|\\d)*[\\+|\\*|\\-|\\/][=]([a-z]|[A-Z])*([a-z]|[A-Z]|\\d)");
Replacer r= new Replacer(p,"1=$1,2=$2,3=$3,4=$4,5=$5,6=$6,7=$7,8=$8");
String result=r.replace("k*=i");
The regex seems to not extract the special chars.
(in this example: +, -, *, /, =)
So what I get as result is:
1=k,2=,3=,4=i,5=,6=,7=,8=
(I can extract only the k & i)
How do I solve this problem?

Here, we can design an expression similar to:
(.+)[*+/-]=(.+)
where we are capturing our k and i using these two capturing groups at the start and end:
(.+)
The hyphen is placed last in the character class so that it is a literal - rather than part of a range (+-/ would also match , and .).
We can add more boundaries, if we wish, such as start and end anchors:
^(.+)[*+/-]=(.+)$
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    public static void main(String[] args) {
        final String regex = "(.+)[*+/-]=(.+)";
        final String string = "k*=i\n"
                + "apple*=orange";
        // $1 is the left-hand variable, $2 the right-hand operand
        final String subst = "$1=$1+$2";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        // The substituted value will be contained in the result variable
        final String result = matcher.replaceAll(subst);
        System.out.println("Substitution result: " + result);
    }
}

You could use three capturing groups, capturing the operator (* + / -) with a character class.
([a-zA-Z])([*+/-])=([a-zA-Z])
That will match:
([a-zA-Z]) Capture group 1, match a-z A-Z
([*+/-]) Capture group 2, match * + / -
= Match literally
([a-zA-Z]) Capture group 3, match a-z A-Z
And replace with:
$1=$1$2$3
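As a minimal sketch of the three-group approach above: the operator only appears in the replacement because it sits inside its own capturing group. The class name CompoundAssignmentDemo and the method expand are hypothetical names chosen for illustration, and the pattern, as given, only handles single-letter operands.

```java
import java.util.regex.Pattern;

class CompoundAssignmentDemo {
    // Group 1: target variable, group 2: operator, group 3: right-hand operand.
    private static final Pattern COMPOUND =
            Pattern.compile("([a-zA-Z])([*+/-])=([a-zA-Z])");

    // Expands e.g. "k*=i" to "k=k*i" by reusing the captured operator ($2).
    static String expand(String statement) {
        return COMPOUND.matcher(statement).replaceAll("$1=$1$2$3");
    }

    public static void main(String[] args) {
        System.out.println(expand("k*=i")); // k=k*i
        System.out.println(expand("a-=b")); // a=a-b
    }
}
```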

Related

How to divide a string with a regex through characters that are outside square brackets?

I have this string:
PARTNER6;PARTNER7[PORTAL4;PORTAL5];PARTNER1[PARTNER1WEB] -> ∞
I want to divide it like this:
PARTNER6
PARTNER7[PORTAL4;PORTAL5]
PARTNER1[PARTNER1WEB]
I tried to use this expression, but it splits everything, including what is inside the square brackets:
[\s,;]+
I can't figure out how to split only on what is outside the brackets.
You may use this regex to get all of your matches:
\w+(?:\[[^]]*\]\w*)*\w*
RegEx Details:
\w+: Match 1+ word characters
(?:\[[^]]*\]\w*)*: Match [...] string followed by 0 or more word characters. Repeat this group 0 or more times
\w*: Match 0 or more word characters
Code:
jshell> String regex = "\\w+(?:\\[[^]]*\\]\\w*)*\\w*";
regex ==> "\\w+(?:\\[[^]]*\\]\\w*)*\\w*"
jshell> String string = "PARTNER6;PARTNER7[PORTAL4;PORTAL5];PARTNER1[PARTNER1WEB] -> ∞";
string ==> "PARTNER6;PARTNER7[PORTAL4;PORTAL5];PARTNER1[PARTNER1WEB] -> ∞"
jshell> Pattern.compile(regex).matcher(string).results().map(MatchResult::group).collect(Collectors.toList());
$3 ==> [PARTNER6, PARTNER7[PORTAL4;PORTAL5], PARTNER1[PARTNER1WEB]]
I would use a regex find all approach here:
String input = "PARTNER6;PARTNER7[PORTAL4;PORTAL5];PARTNER1[PARTNER1WEB]";
String pattern = "(\\w+(?:\\[.*?\\])?);?";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
while (m.find()) {
    System.out.println(m.group(1));
}
This prints:
PARTNER6
PARTNER7[PORTAL4;PORTAL5]
PARTNER1[PARTNER1WEB]
The regex pattern used above matches a word with \w+, followed by an optional term in square brackets, followed by optional semicolon.
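If an actual split is preferred over a find-all loop, one possible alternative (not taken from the answers above) is to split on a semicolon only when it is not followed by a closing bracket before the next opening bracket. This sketch assumes the brackets are balanced and not nested; the class name BracketAwareSplit and the method split are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class BracketAwareSplit {
    // Split on ';' unless a ']' follows before any '[' does,
    // which would mean the semicolon sits inside a [...] group.
    static List<String> split(String input) {
        return Arrays.asList(input.split(";(?![^\\[]*\\])"));
    }

    public static void main(String[] args) {
        String input = "PARTNER6;PARTNER7[PORTAL4;PORTAL5];PARTNER1[PARTNER1WEB]";
        // Prints one entry per line:
        // PARTNER6
        // PARTNER7[PORTAL4;PORTAL5]
        // PARTNER1[PARTNER1WEB]
        split(input).forEach(System.out::println);
    }
}
```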

How to match forward slashes or periods at end of String but Not Capture Using Java Regular Expression

I am having trouble understanding how a regular expression can match text without including part of the matched text in the result. Perhaps I need to be working with groups, which I'm not doing; I often see the term "non-capturing groups" used.
The goal: say I have a ticket in a log file as follows:
TICKET/A/ADMIN/05MAR2020// should return only A/ADMIN/05MAR2020
or
TICKET/A/ENGINEERING/05MAR2020. should return only A/ENGINEERING/05MAR2020
where the "//" or "." has been removed.
Lastly, lines like the following should be ignored:
TICKET HAS BEEN COMPLETED
using regex = "(?<=^TICKET\\s{0,2}/).*(?://|\\.)?"
So I am telling the parser to look for TICKET at the start of the string followed by a forward slash, but not to return TICKET, and to look for either a double forward slash "//" or a period "." at the end of the string, but make this optional.
My Java 1.8.x code follows:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
private static void testRegex() {
    String ticket1 = "TICKET/A/ITSUPPORT/05MAR2020//";
    String ticket2 = "TICKET /B/ADMIN/06MAR2020.";
    String ticket3 = "TICKET/C/GENERAL/07MAR2020";
    // https://www.regular-expressions.info/brackets.html
    String regex = "(?<=^TICKET\\s{0,2}/).*(?://|\\.)?";
    Pattern pat = Pattern.compile(regex);
    Matcher mat = pat.matcher(ticket1);
    if (mat.find()) {
        String myticket = ticket1.substring(mat.start(), mat.end());
        System.out.println(myticket + ", Expect 'A/ITSUPPORT/05MAR2020'");
    }
    mat = pat.matcher(ticket2);
    if (mat.find()) {
        String myticket = ticket2.substring(mat.start(), mat.end());
        System.out.println(myticket + ", Expect 'B/ADMIN/06MAR2020'");
    }
    mat = pat.matcher(ticket3);
    if (mat.find()) {
        String myticket = ticket3.substring(mat.start(), mat.end());
        System.out.println(myticket + ", Expect 'C/GENERAL/07MAR2020'");
    }
    regex = "(//|\\.)";
    pat = Pattern.compile(regex);
    mat = pat.matcher(ticket1);
    if (mat.find()) {
        String myticket = ticket1.substring(mat.start(), mat.end());
        System.out.println(myticket + ", " + mat.start() + ", " + mat.end() + ", " + mat.groupCount());
    }
}
My actual results follow:
A/ITSUPPORT/05MAR2020//, Expect 'A/ITSUPPORT/05MAR2020'
B/ADMIN/06MAR2020., Expect 'B/ADMIN/06MAR2020'
C/GENERAL/07MAR2020, Expect 'C/GENERAL/07MAR2020'
//, 28, 30, 1
Any suggestions would be appreciated. Please note that I have been learning from StackOverflow for a long time, but this is my first post; I hope the question is asked appropriately. Thank you.
You could use a positive lookahead at the end of the pattern instead of a match.
The lookahead asserts what is at the end of the string is an optional // or .
As the dot and the double forward slash are optional, you have to make the quantifier non-greedy (.*?).
(?<=^TICKET\s{0,2}/).*?(?=(?://|\.)?$)
In parts
(?<= Positive lookbehind, assert what is on the left is
^ Start of the string
TICKET\s{0,2}/ Match TICKET and 0-2 whitespace chars followed by /
) Close lookbehind
.*? Match any char except a newline 0+ times, as few as possible (non-greedy)
(?= Positive lookahead, assert what is on the right is
(?: Non-capturing group for the alternation | because both alternatives can be followed by $
// Match 2 forward slashes
| Or
\. Match a dot
)? Close the non capture group and make it optional
$ Assert the end of the string
) Close the positive lookahead
In Java
String regex = "(?<=^TICKET\\s{0,2}/).*?(?=(?://|\\.)?$)";
Output of the updated pattern with your code:
A/ITSUPPORT/05MAR2020, Expect 'A/ITSUPPORT/05MAR2020'
B/ADMIN/06MAR2020, Expect 'B/ADMIN/06MAR2020'
C/GENERAL/07MAR2020, Expect 'C/GENERAL/07MAR2020'
//, 28, 30, 1
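A self-contained sketch of the lookbehind/lookahead pattern above, wrapped in a small helper. The class name TicketExtractor and the method extract are hypothetical; returning an Optional is just one way to signal that lines such as "TICKET HAS BEEN COMPLETED" produce no match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TicketExtractor {
    // Lookbehind skips the "TICKET/" prefix; lookahead leaves an optional
    // trailing "//" or "." out of the match.
    private static final Pattern TICKET =
            Pattern.compile("(?<=^TICKET\\s{0,2}/).*?(?=(?://|\\.)?$)");

    // Returns the ticket body, or an empty Optional if the line is not a ticket.
    static Optional<String> extract(String line) {
        Matcher m = TICKET.matcher(line);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("TICKET/A/ITSUPPORT/05MAR2020//")); // Optional[A/ITSUPPORT/05MAR2020]
        System.out.println(extract("TICKET /B/ADMIN/06MAR2020."));     // Optional[B/ADMIN/06MAR2020]
        System.out.println(extract("TICKET HAS BEEN COMPLETED"));      // Optional.empty
    }
}
```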

regex find string between 2 characters, separated by comma

I am new to regular expressions and I want to find the strings between two characters.
I tried the code below, but it always returns false. What is wrong with it?
public static void main(String[] args) {
    String input = "myFunction(hello ,world, test)";
    String patternString = "\\(([^]]+)\\)";
    Pattern pattern = Pattern.compile(patternString);
    Matcher matcher = pattern.matcher(input);
    while (matcher.find()) {
        System.out.println(matcher.group());
    }
}
Input:
myFunction(hello,world,test), where myFunction can be any characters; any characters may appear before the opening (.
Output:
hello
world
test
You could make use of the \G anchor, which asserts the position at the end of the previous match, and capture your values in a group:
(?:\bmyFunction\(|\G(?!^))([^,]+)(?:\h*,\h*)?(?=[^)]*\))
In Java:
String regex = "(?:\\bmyFunction\\(|\\G(?!^))([^,]+)(?:\\h*,\\h*)?(?=[^)]*\\))";
Explanation
(?: Non capturing group
\bmyFunction\( Word boundary to prevent the match being part of a larger word, match myFunction and an opening parenthesis (
| Or
\G(?!^) Assert position at the end of previous match, not at the start of the string
) Close non capturing group
([^,]+) Capture in a group matching 1+ times not a comma
(?:\h*,\h*)? Optionally match a comma surrounded by 0+ horizontal whitespace chars
(?=[^)]*\)) Positive lookahead, assert what is on the right is a closing parenthesis )
For example:
String patternString = "(?:\\bmyFunction\\(|\\G(?!^))([^,]+)(?:\\h*,\\h*)?(?=[^)]*\\))";
String input = "myFunction(hello ,world, test)";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Result
hello
world
test
I'd suggest achieving this in a two-step process:
Step 1: Capture all the content between ( and )
Use the regex: ^\S+\((.*)\)$
The first and the only capturing group will contain the required text.
Step 2: Split the captured string above on ,, thus yielding all the comma-separated parameters independently.
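The two steps can be sketched roughly as follows. The class name ArgumentSplitter and the method arguments are hypothetical, and splitting on a comma with surrounding whitespace (so that "hello ,world, test" yields trimmed values) is an assumption about the desired output rather than part of the answer above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArgumentSplitter {
    // Step 1: capture everything between the first '(' and the final ')'.
    private static final Pattern CALL = Pattern.compile("^\\S+\\((.*)\\)$");

    static List<String> arguments(String call) {
        Matcher m = CALL.matcher(call);
        if (!m.matches()) {
            return Collections.emptyList(); // not of the form name(...)
        }
        String inner = m.group(1).trim();
        if (inner.isEmpty()) {
            return Collections.emptyList(); // name() has no arguments
        }
        // Step 2: split on commas, dropping whitespace around each comma.
        return Arrays.asList(inner.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        // Prints hello, world and test on separate lines.
        arguments("myFunction(hello ,world, test)").forEach(System.out::println);
    }
}
```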
See this pattern; it may give you an idea:
([\w]+),([\w]+),([\w]+)
DEMO: https://rubular.com/r/9HDIwBTacxTy2O

Regular expression to split between pipes except in brackets

I have the following text line:
|random|[abc|www.abc.org]|1024|
I would like to split these into 3 parts with a regular expression
random
[abc|www.abc.org]
1024
Currently the following result is achieved with expression \|
random
[abc
www.abc.org]
1024
My problem is that I cannot exclude the pipe symbol in the middle column surrounded by the brackets [].
If you have to use split, you can use the regex
\|(?=$|[^]]+\||\[[^]]+\]\|)
https://regex101.com/r/7OxmiY/1
It will match a pipe, then lookahead for either:
$, the end of the string, so that the final | is split on, or
[^]]+\|, non-] characters until a pipe is reached, ensuring that pipes inside []s will not be split upon, or
\[[^]]+\]\| - Same as above, except with literal [ and ]s surrounding the pattern
In Java:
String input = "|random|[abc|www.abc.org]|[test]|1024|";
String[] output = input.split("\\|(?=$|[^]]+\\||\\[[^]]+\\]\\|)");
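A runnable sketch of the split approach, using the three-alternative pattern given above with the ] inside the character classes escaped for readability. Because the line starts with a pipe, split returns a leading empty string; filtering it out is an added assumption about the desired result. The class name PipeSplit and the method columns are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipeSplit {
    // Split on a pipe only if it is followed by: end of input, a run of
    // non-']' characters up to a pipe, or a complete [...] group and a pipe.
    static List<String> columns(String line) {
        return Arrays.stream(line.split("\\|(?=$|[^\\]]+\\||\\[[^\\]]+\\]\\|)"))
                .filter(s -> !s.isEmpty()) // drop the empty string before the first pipe
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prints [random, [abc|www.abc.org], [test], 1024]
        System.out.println(columns("|random|[abc|www.abc.org]|[test]|1024|"));
    }
}
```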
You can use the following code:
// The pipes inside the lookarounds must be escaped to be matched literally
final String regex = "(?<=\\|)\\[?[\\w.]+\\|?[\\w.]+\\]?(?=\\|)";
final String string = "|random|[abc|www.abc.org]|[test]|1024|";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
}
Output:
Full match: random
Full match: [abc|www.abc.org]
Full match: [test]
Full match: 1024
See here at regex101: https://regex101.com/r/Fcb3Wx/1

JAVA regex for "String, String."

Given the string "Neil, Gogte., Satyam, B.: Introduction to Java",
I need to extract only "Neil, Gogte." and "Satyam, B." using a regex. How can I do it?
You can use a Matcher and iterate over the matches:
String str = "Neil, Gogte., Satyam, B.: Introduction to Java";
Pattern pattern = Pattern.compile("([a-zA-Z]+, [a-zA-Z]+\\.)");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    String result = matcher.group();
    System.out.println(result);
}
You can use the following regex to split the string. It matches every location where a dot is followed by a comma:
(?<=\.),\s*
(?<=\.) Positive lookbehind ensuring what precedes is a literal dot character .
,\s* Matches , followed by any number of whitespace characters
import java.util.*;
import java.util.regex.Pattern;
class Main {
    public static void main(String[] args) {
        final String s = "Neil, Gogte., Satyam, B.: Introduction to Java";
        final Pattern r = Pattern.compile("(?<=\\.),\\s*");
        String[] result = r.split(s);
        Arrays.stream(result).forEach(System.out::println);
    }
}
Result:
Neil, Gogte.
Satyam, B.: Introduction to Java
You might use this regex to match your names:
[A-Z][a-z]+, [A-Z][a-z]*\.
In Java:
[A-Z][a-z]+, [A-Z][a-z]*\\.
That would match
[A-Z] Match an uppercase character
[a-z]+ Match one or more lowercase characters
, Match comma and a whitespace
[A-Z] Match an uppercase character
[a-z]* Match zero or more lowercase characters
\. Match a dot
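A short sketch of how this pattern might be applied with a Matcher to collect every name. The class name AuthorNames and the method find are hypothetical; the pattern is the one shown above and assumes each name part starts with an uppercase ASCII letter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AuthorNames {
    // Capitalized word, comma, space, capitalized word or initial, dot.
    private static final Pattern NAME =
            Pattern.compile("[A-Z][a-z]+, [A-Z][a-z]*\\.");

    // Collects every "Name, Name." or "Name, X." occurrence in order.
    static List<String> find(String citation) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(citation);
        while (m.find()) {
            names.add(m.group());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints [Neil, Gogte., Satyam, B.]
        System.out.println(find("Neil, Gogte., Satyam, B.: Introduction to Java"));
    }
}
```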
