RegEx for matching special patterns - java

I'm trying to match a String like this: 62.00|LQ+2*2,FP,MD*3 "Description"
The two-digit decimal part is optional. Each user is identified by two characters and can be followed by
(\+[\d]+)? or (\*[\d]+)?, by neither, or by both in either order,
like:
LQ*2+4 | LQ+4*2 | LQ*2 | LQ+8 | LQ
The description is also optional.
What I have tried is this:
Pattern.compile("^(?<number>[\\d]+(\\.[\\d]{2})?)\\|(?<users>([A-Z]{2}){1}(((\\+[\\d]+)?(\\*[\\d]+)?)|((\\+[\\d]+)?(\\*[\\d]+)?))((,[A-Z]{2})(((\\+[\\d]+)?(\\*[\\d]+)?)|((\\+[\\d]+)?(\\*[\\d]+)?)))*)(\\s\\\"(?<message>.+)\\\")?$");
I need to get all the users so I can split them by ',' and then parse each one with a further regex. But I cannot grab anything out of it. The desired output from
62.00|LQ+2*2,FP,MD*3 "Description"
Should be:
62.00
LQ+2*2,FP,MD*3
Description
Accepted inputs should be of these kind:
62.00|LQ+2*2,FP,MD*3
30|LQ "Burgers"
35.15|LQ*2,FP+2*4,MD*3+4 "Potatoes"
35.15|LQ,FP,MD

The inputs you described should be matched by this regex:
^(\d+(?:\.\d{1,2})?)\|([a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?(?:,[a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?)*)(?: +(.+))?$
Group1 will contain the number, which can have an optional decimal part of up to two digits; group2 will contain the comma-separated users as described in your post; and group3 will contain the optional description, if present (including its surrounding quotes).
Explanation of regex:
^ - Start of string
(\d+(?:\.\d{1,2})?) - Matches the number, which can have an optional decimal part of one or two digits, and captures it in group1
\| - Matches the literal | present in your input after the number
([a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?(?:,[a-zA-Z]{2}(?:(?:\+\d+(?:\*\d+)?)|(?:\*\d+(?:\+\d+)?))?)*) - Matches two letters, optionally followed by either + and a number (optionally followed by * and a number) or * and a number (optionally followed by + and a number). Further users of the same form, each preceded by a comma, may follow. The whole list is captured in group2
(?: +(.+))? - Matches the optional description after one or more spaces and captures it, quotes included, in group3
$ - Marks end of input
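As a rough illustration of how the three groups could be read in Java, here is a minimal sketch. The class name OrderLineParser and the method parse are hypothetical, and stripping the surrounding quotes from the description is an assumption based on the desired output in the question, not something the regex itself does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderLineParser {
    // Same pattern as above, written as a Java string literal (backslashes doubled).
    private static final Pattern LINE = Pattern.compile(
        "^(\\d+(?:\\.\\d{1,2})?)\\|"
        + "([a-zA-Z]{2}(?:(?:\\+\\d+(?:\\*\\d+)?)|(?:\\*\\d+(?:\\+\\d+)?))?"
        + "(?:,[a-zA-Z]{2}(?:(?:\\+\\d+(?:\\*\\d+)?)|(?:\\*\\d+(?:\\+\\d+)?))?)*)"
        + "(?: +(.+))?$");

    // Returns {number, users, description}, or null when the line does not match.
    // The description is null when absent; surrounding quotes are stripped (assumption).
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String description = m.group(3);
        if (description != null) {
            description = description.replaceAll("^\"|\"$", "");
        }
        return new String[] { m.group(1), m.group(2), description };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "62.00|LQ+2*2,FP,MD*3 \"Description\"",
            "30|LQ \"Burgers\"",
            "35.15|LQ*2,FP+2*4,MD*3+4 \"Potatoes\"",
            "35.15|LQ,FP,MD"
        };
        for (String input : inputs) {
            String[] parts = parse(input);
            if (parts == null) {
                System.out.println("no match: " + input);
            } else {
                System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
            }
        }
    }
}
```

For the first input this prints 62.00 | LQ+2*2,FP,MD*3 | Description; for the last input the third field is null because no description is present.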

We have several optional groups here, which should not be a problem. What I'm not quite sure about is the range of possible inputs and the desired outputs.
RegEx 1
If we are just matching everything, we might start with something similar to:
[0-9]+(\.[0-9]{2})?\|[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?(\s+"Description")?
Here, we simply add a ? after every sub-expression that we wish to make optional, then use character classes and quantifiers, working from left to right to cover all inputs.
If we want to capture something, we simply wrap that part in a capturing group ().
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "[0-9]+(\\.[0-9]{2})?\\|[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?(\\s+\"Description\")?";
final String string = "62.00|LQ+2*2,FP,MD*3 \"Description\"\n"
+ "62|LQ+2*2,FP,MD*3 \"Description\"\n"
+ "62|LQ+2*2,FP,MD*3\n"
+ "62|LQ*2,FP,MD*3\n"
+ "62|LQ+8,FP,MD*3\n"
+ "62|LQ,FP,MD";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
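The question also mentions splitting the users on ',' and parsing each entry further. The following is only a sketch of that step using capturing groups: the class UserEntryParser, the method parseUsers, and the "plus=/times=" output format are all hypothetical, and "-" is used as a placeholder for an absent modifier. Unlike the stricter patterns above, it does not reject a repeated modifier such as LQ+2+3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserEntryParser {
    // Two uppercase letters, then up to two modifiers ('+' or '*' followed by digits).
    private static final Pattern ENTRY = Pattern.compile("([A-Z]{2})((?:[+*]\\d+){0,2})");
    // A single modifier: group 1 is the operator, group 2 is the number.
    private static final Pattern MODIFIER = Pattern.compile("([+*])(\\d+)");

    // Splits a users string such as "LQ+2*2,FP,MD*3" and describes each entry.
    static List<String> parseUsers(String users) {
        List<String> result = new ArrayList<>();
        for (String entry : users.split(",")) {
            Matcher m = ENTRY.matcher(entry);
            if (!m.matches()) {
                throw new IllegalArgumentException("Bad user entry: " + entry);
            }
            String plus = "-";   // placeholder when no +N is present (assumption)
            String times = "-";  // placeholder when no *N is present (assumption)
            Matcher mod = MODIFIER.matcher(m.group(2));
            while (mod.find()) {
                if (mod.group(1).equals("+")) {
                    plus = mod.group(2);
                } else {
                    times = mod.group(2);
                }
            }
            result.add(m.group(1) + " plus=" + plus + " times=" + times);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : parseUsers("LQ+2*2,FP,MD*3")) {
            System.out.println(line);
        }
    }
}
```

Because the modifiers are read in a loop, the order (+ before * or the reverse) does not matter.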
RegEx 2
If we wish to output the three groups that are listed:
([0-9]+(\.[0-9]{2})?)\|([A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?)(\s+"Description")?
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "([0-9]+(\\.[0-9]{2})?)\\|([A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,[A-Z]{2},[A-Z]{2}[+*]?([0-9]+)?)(\\s+\"Description\")?";
final String string = "62.00|LQ+2*2,FP,MD*3 \"Description\"\n"
+ "62|LQ+2*2,FP,MD*3 \"Description\"\n"
+ "62|LQ+2*2,FP,MD*3\n"
+ "62|LQ*2,FP,MD*3\n"
+ "62|LQ+8,FP,MD*3\n"
+ "62|LQ,FP,MD";
// Java replacement strings reference groups with $n, not \n
final String subst = "$1\n$3\n$7";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
RegEx 3
Based on updated desired output, this might work:
([0-9]+(\.[0-9]{2})?)\|((?:[A-Z]{2}[+*]?([0-9]+)?[+*]?([0-9]+)?,?)(?:[A-Z]{2}[+*]?([0-9]+)?[*+]?([0-9]+)?,?[A-Z]{2}?[*+]?([0-9]+)?[+*]?([0-9]+)?)?)(\s+"(.+?)")?

Related

regex to capture the string between a word and first occurrence of a character

I want to capture the string after the last slash and before the first occurrence of a backward slash (\).
sample data:
sessionId=30a793b1-ed7e-464a-a630; Url=https://www.example.com/mybook/order/newbooking/itemSummary; sid=KJ4dgQGdhg7dDn1h0TLsqhsdfhsfhjhsdjfhjshdjfhjsfddscg139bjXZQdkbHpzf9l6wy1GdK5XZp; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemSummary/amex","Accept":"application/json, application/javascript","sessionId":"ggh76734",
targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=122;
sessionId=sfdsdfsd-ba57-4e21-a39f-34; Url=https://www.example.com/mybook/order/newbooking/itemList?id=76734&para=jhjdfhj&type=new&ordertype=kjkf&memberid=273647632&iSearch=true; sid=Q4hWgR1GpQb8xWTLpQB2yyyzmYRgXgFlJLGTc0QJyZbW; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemList/basket","Accept":"application/json, application/javascript","sessionId":"ggh76734", targetUrl=https://www.example.com/ mybook/order/newbooking/page1?id=123;
sessionId=0e1acab1-45b8-sdf3454fds-afc1-sdf435sdfds; Url=https://www.example.com/mybook/order/newbooking/; sid=hkm2gRSL2t5ScKSJKSJn3vg2sfdsfdsfdsfdsfdfdsfdsfdsfvJZkDD3ng0kYTjhNQw8mFZMn; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemList/","Accept":"application/json, application/javascript","sessionId":"ggh76734",targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=343;
sessionId=sfdsdfsd-ba57-4e21-a39f-34; Url=https://www.example.com/mybook/order/newbooking/itemList?id=76734&para=jhjdfhj&type=new&ordertype=kjkf&memberid=273647632&iSearch=true; sid=Q4hWgR1GpQb8xWTLpQB2yyyzmYRgXgFlJLGTc0QJyZbW; ,"myreferer":"https://www.example.com/mybook/order/newbooking/itemList/basket?id=76734&para=jhjdfhj&type=new&ordertype=kjkf", "Accept":"application/json, application/javascript","sessionId":"ggh76734", targetUrl=https://www.example.com/ mybook/order/newbooking/page1?id=123;
Expecting the below output:
amex
basket
''(empty string)
basket
I have built the regex below to capture it, but it's not 100% accurate: it captures some additional text.
Regex
\bmyreferer\\\":\\\"\S+\/(.*?)\\\",
Could you please help me to improve the regex to get desired output?
You could use a negated character class with a capture group:
\bmyreferer":"[^"]+/([^/"]*)"
\bmyreferer":" Match literally preceded by a word boundary
[^"]+/ Match 1+ times any char except ", followed by a /
( Capture group 1
[^/"]* Optionally match (to also match an empty string) any char except / and "
)" Close group 1 and match "
Example code
String regex = "\\bmyreferer\":\"[^\"]+/([^/\"]*)\"";
String string = "sessionId=30a793b1-ed7e-464a-a630; Url=https://www.example.com/mybook/order/newbooking/itemSummary; sid=KJ4dgQGdhg7dDn1h0TLsqhsdfhsfhjhsdjfhjshdjfhjsfddscg139bjXZQdkbHpzf9l6wy1GdK5XZp; ,\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemSummary/amex\",\"Accept\":\"application/json, application/javascript\",\"sessionId\":\"ggh76734\", targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=122;\n\n"
+ "sessionId=sfdsdfsd-ba57-4e21-a39f-34; Url=https://www.example.com/mybook/order/newbooking/itemList?id=76734&para=jhjdfhj&type=new&ordertype=kjkf&memberid=273647632&iSearch=true; sid=Q4hWgR1GpQb8xWTLpQB2yyyzmYRgXgFlJLGTc0QJyZbW; ,\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemList/basket\",\"Accept\":\"application/json, application/javascript\",\"sessionId\":\"ggh76734\", targetUrl=https://www.example.com/ mybook/order/newbooking/page1?id=123;\n\n"
+ "sessionId=0e1acab1-45b8-sdf3454fds-afc1-sdf435sdfds; Url=https://www.example.com/mybook/order/newbooking/; sid=hkm2gRSL2t5ScKSJKSJn3vg2sfdsfdsfdsfdsfdfdsfdsfdsfvJZkDD3ng0kYTjhNQw8mFZMn; ,\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemList/\",\"Accept\":\"application/json, application/javascript\",\"sessionId\":\"ggh76734\",targetUrl=https://www.example.com/mybook/order/newbooking/page1?id=343;List item";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Group 1 value: " + matcher.group(1));
}
Output
Group 1 value: amex
Group 1 value: basket
Group 1 value:
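The fourth sample in the question has a query string after the last path segment (basket?id=...), and the expected output for it is still basket. A possible variation, offered only as a sketch, is to exclude ? from both character classes and stop at either " or ?. The class RefererSegment and the method lastSegment are hypothetical names, and the inputs below are shortened versions of the question's samples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RefererSegment {
    // Like the pattern above, but '?' also ends the URL path, so a query string is ignored.
    private static final Pattern LAST_SEGMENT =
        Pattern.compile("\\bmyreferer\":\"[^\"?]+/([^/\"?]*)(?=[\"?])");

    // Returns the last path segment of the myreferer URL (possibly empty), or null if absent.
    static String lastSegment(String line) {
        Matcher m = LAST_SEGMENT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            ",\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemSummary/amex\",\"Accept\":\"application/json\"",
            ",\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemList/\",\"Accept\":\"application/json\"",
            ",\"myreferer\":\"https://www.example.com/mybook/order/newbooking/itemList/basket?id=76734&para=jhjdfhj\", \"Accept\":\"application/json\""
        };
        for (String line : lines) {
            System.out.println("[" + lastSegment(line) + "]");
        }
    }
}
```

This prints [amex], [] and [basket] for the three shortened inputs.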

How to match forward slashes or periods at end of String but Not Capture Using Java Regular Expression

I am having problems understanding how a regular expression can match text without including that text in the result. Perhaps I need to be working with groups, which I'm not doing, because I usually see the term non-capturing groups being used.
The goal is, say I have a ticket in a log file as follows:
TICKET/A/ADMIN/05MAR2020// to return only A/ADMIN/05MAR2020
or if
TICKET/A/ENGINEERING/05MAR2020. to return only A/ENGINEERING/05MAR2020
where the "//" or "." has been removed
Lastly, to ignore lines like:
TICKET HAS BEEN COMPLETED
using regex = "(?<=^TICKET\\s{0,2}/).*(?://|\\.)?"
So I am telling the parser to look for TICKET at the start of the string followed by a forward slash, but not to return TICKET, and to look for either a double forward slash "//" or a period "." at the end of the string, but to make this optional.
My Java 1.8.x code follows:
// used in the import statement: import java.util.regex.Matcher;
// import java.util.regex.Pattern;
private static void testRegex() {
    String ticket1 = "TICKET/A/ITSUPPORT/05MAR2020//";
    String ticket2 = "TICKET /B/ADMIN/06MAR2020.";
    String ticket3 = "TICKET/C/GENERAL/07MAR2020";
    //https://www.regular-expressions.info/brackets.html
    String regex = "(?<=^TICKET\\s{0,2}/).*(?://|\\.)?";
    Pattern pat = Pattern.compile(regex);
    Matcher mat = pat.matcher(ticket1);
    if (mat.find()) {
        String myticket = ticket1.substring(mat.start(), mat.end());
        System.out.println(myticket+ ", Expect 'A/ITSUPPORT/05MAR2020'");
    }
    mat = pat.matcher(ticket2);
    if (mat.find()) {
        String myticket = ticket2.substring(mat.start(), mat.end());
        System.out.println(myticket+", Expect 'B/ADMIN/06MAR2020'");
    }
    mat = pat.matcher(ticket3);
    if (mat.find()) {
        String myticket = ticket3.substring(mat.start(), mat.end());
        System.out.println(myticket+", Expect 'C/GENERAL/07MAR2020'");
    }
    regex = "(//|\\.)";
    pat = Pattern.compile(regex);
    mat = pat.matcher(ticket1);
    if (mat.find()) {
        String myticket = ticket1.substring(mat.start(), mat.end());
        System.out.println(myticket+", "+mat.start() + ", " + mat.end() + ", " + mat.groupCount());
    }
}
My actual results follow:
A/ITSUPPORT/05MAR2020//, Expect 'A/ITSUPPORT/05MAR2020'
B/ADMIN/06MAR2020., Expect 'B/ADMIN/06MAR2020'
C/GENERAL/07MAR2020, Expect 'C/GENERAL/07MAR2020'
//, 28, 30, 1
Any suggestions would be appreciated. Please note, I have been learning from StackOverflow for a long time but this is my first post; I hope the question is asked appropriately. Thank you.
You could use a positive lookahead at the end of the pattern instead of a match.
The lookahead asserts what is at the end of the string is an optional // or .
As the dot and the double forward slash are optional, you have to make the preceding .* non-greedy (.*?).
(?<=^TICKET\s{0,2}/).*?(?=(?://|\.)?$)
In parts
(?<= Positive lookbehind, assert what is on the left is
^ Start of the string
TICKET\s{0,2}/ Match TICKET and 0-2 whitespace chars followed by /
) Close lookbehind
.*? Match any char except a newline 0+ times, as few as possible (non-greedy)
(?= Positive lookahead, assert what is on the right is
(?: Non capture group for the alternation | because both can be followed by $
// Match 2 forward slashes
| Or
\. Match a dot
)? Close the non capture group and make it optional
$ Assert the end of the string
) Close the positive lookahead
In Java
String regex = "(?<=^TICKET\\s{0,2}/).*?(?=(?://|\\.)?$)";
Output of the updated pattern with your code:
A/ITSUPPORT/05MAR2020, Expect 'A/ITSUPPORT/05MAR2020'
B/ADMIN/06MAR2020, Expect 'B/ADMIN/06MAR2020'
C/GENERAL/07MAR2020, Expect 'C/GENERAL/07MAR2020'
//, 28, 30, 1
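Since the question wonders whether groups are needed, here is a sketch that puts the lookaround pattern next to an alternative that uses a capturing group instead. The capture-group variant is a different technique from the answer above, not a restatement of it; the class TicketExtractor and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TicketExtractor {
    // Lookaround version from the answer above: the whole match is the ticket.
    private static final Pattern LOOKAROUND =
        Pattern.compile("(?<=^TICKET\\s{0,2}/).*?(?=(?://|\\.)?$)");
    // Capture-group version: prefix and suffix are consumed, only group 1 is returned.
    private static final Pattern GROUPED =
        Pattern.compile("^TICKET\\s{0,2}/(.*?)(?://|\\.)?$");

    // Returns the ticket without prefix and trailing "//" or ".", or null if the line is not a ticket.
    static String withLookaround(String line) {
        Matcher m = LOOKAROUND.matcher(line);
        return m.find() ? m.group() : null;
    }

    // Same result, but read from capturing group 1.
    static String withGroup(String line) {
        Matcher m = GROUPED.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "TICKET/A/ITSUPPORT/05MAR2020//",
            "TICKET /B/ADMIN/06MAR2020.",
            "TICKET/C/GENERAL/07MAR2020",
            "TICKET HAS BEEN COMPLETED"
        };
        for (String line : lines) {
            System.out.println(withLookaround(line) + " | " + withGroup(line));
        }
    }
}
```

Both methods return the same value for each line, and both return null for TICKET HAS BEEN COMPLETED because no forward slash follows TICKET.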

How do I create a regex for this text?

I need to create a regex that checks if the text follows this format:
The first two letters will always be 'AB', then there will be a number
between 1-9, then either A or B, then a dash ('-'), then a bunch of
random text followed by a colon (':'), and then an index position that is
a letter and a 2-digit number.
So like this:
AB8B-ANYLETTERS:H12
or
AB3B-ANYTHINGCANGOHERE:A77
I have done this to check the index position but cannot figure out the text before the colon.
"^.*:[A-H]\\d\\d"
So the general format is:
AB[1-9][A or B]-[ANYCHARACTERS]:[A-Z][01-99]
I am using Java.
I'm guessing that maybe this expression might validate that:
^AB[1-9][AB]-[^:]+:[A-Z][0-9]{2}$
The expression is explained on the top right panel of regex101.com, if you wish to explore/simplify/modify it, and in this link, you can watch how it would match against some sample inputs, if you like.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "^AB[1-9][AB]-[^:]+:[A-Z][0-9]{2}$";
final String string = "AB8B-ANYLETTERS:H12\n"
+ "AB3B-ANYTHINGCANGOHERE:A77";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
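Since the question asks to check whether a text follows the format, a boolean check with matches() may be more direct than looping over find(). The sketch below is hypothetical (the class IndexFormatValidator and the method isValid are made-up names). It also adds a (?!00) lookahead on the assumption that the 01-99 range in the question excludes 00; drop it if 00 should be accepted.

```java
import java.util.regex.Pattern;

class IndexFormatValidator {
    // matches() requires the whole input to match, so ^ and $ are not needed here.
    // (?!00) rejects an index of 00, assuming the 01-99 range from the question.
    private static final Pattern FORMAT =
        Pattern.compile("AB[1-9][AB]-[^:]+:[A-Z](?!00)[0-9]{2}");

    static boolean isValid(String text) {
        return FORMAT.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "AB8B-ANYLETTERS:H12",
            "AB3B-ANYTHINGCANGOHERE:A77",
            "AB0B-ANYLETTERS:H12",
            "AB8B-ANYLETTERS:H00"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```

The first two samples are reported as valid and the last two as invalid.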
Edit
For AC cases, we would try:
^AB[1-9][AC]-[^:]+:[A-Z][0-9]{2}$

Regex to retrieve any alphanumeric otp code

I am trying to write a regex to retrieve an alphanumeric OTP code. The length of the code may vary (i.e. it depends on the user's choice), and the code must contain at least one digit.
I have tried the following regex :
"[a-zA-z][0-9].[a-zA-z]"
But if the code contains a special character, the result should be null; instead, the regex retrieves the characters before and after the special character, which is not desired.
Some sample OTP-containing messages on which the regex is desired to work successfully:
OTP is **** for txn of INR 78.90.
**** is your one-time password.
Hi, Your OTP is ****.
Examples of Alphanumeric OTPs with at least one-digit:
78784
aZ837
987Ny
19hd35
fc82pl
It would be a bit difficult, maybe this expression might work with an i flag:
[a-z0-9]*\d[a-z0-9]*
or with word boundaries:
(?<=\b)[a-z0-9]*\d[a-z0-9]*(?=\b)
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "[a-z0-9]*\\d[a-z0-9]*";
final String string = "78784\n"
+ "aZ837\n"
+ "987Ny\n"
+ "19hd35\n"
+ "fc82pl";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}
The expression is explained on the top right panel of regex101.com, if you wish to explore/simplify/modify it, and in this link, you can watch how it would match against some sample inputs, if you like.
Try this one
[0-9a-zA-Z]*[0-9]+[0-9a-zA-Z]*
This produces your desired result; I have tested it online.
Your regex is almost correct.
It should be \b[a-zA-Z]*[0-9]+[a-zA-Z0-9]*\b (note A-Z rather than A-z, since the range A-z also covers the characters [ \ ] ^ _ and `).
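The question also asks that a code containing a special character yields no result rather than a partial match. One way to get that behaviour, sketched here as a different technique from the regexes above, is to split the message on whitespace, strip trailing punctuation from each token, and require the whole token to match. The class OtpExtractor and the method extract are hypothetical, and the set of trailing punctuation characters is an assumption.

```java
import java.util.Optional;
import java.util.regex.Pattern;

class OtpExtractor {
    // Alphanumeric token containing at least one digit.
    private static final Pattern OTP = Pattern.compile("[A-Za-z0-9]*\\d[A-Za-z0-9]*");

    // Returns the first whitespace-delimited token that is a valid OTP, if any.
    static Optional<String> extract(String message) {
        for (String token : message.trim().split("\\s+")) {
            // Drop sentence punctuation at the end of the token (assumed set of characters).
            String candidate = token.replaceAll("[.,;:!?]+$", "");
            // matches() checks the whole token, so "ab#12" or "78.90" is rejected.
            if (OTP.matcher(candidate).matches()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("OTP is aZ837 for txn of INR 78.90."));
        System.out.println(extract("987Ny is your one-time password."));
        System.out.println(extract("Hi, Your OTP is 19hd35."));
        System.out.println(extract("Hi, Your OTP is ab#12."));
    }
}
```

Note that this returns the first qualifying token, so a plain number appearing before the OTP in a message (for example an amount without a decimal point) would be picked up instead.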

RegEx for capturing special chars

I am trying to replace a string using a regular expression. Basically, I need to convert a compound assignment such as:
k*=i
into
k=k+i
In my example:
jregex.Pattern p=new jregex.Pattern("([a-z]|[A-Z])([a-z]|[A-Z]|\\d)*[\\+|\\*|\\-|\\/][=]([a-z]|[A-Z])*([a-z]|[A-Z]|\\d)");
Replacer r= new Replacer(p,"1=$1,2=$2,3=$3,4=$4,5=$5,6=$6,7=$7,8=$8");
String result=r.replace("k*=i");
The regex seems to not extract the special chars.
(in this example: +, -, *, /, =)
So what I get as result is:
1=k,2=,3=,4=i,5=,6=,7=,8=
(I can extract only the k & i)
How do I solve this problem?
Here, we can design an expression similar to:
(.+)[*+/-]=(.+)
where we are capturing our k and i using these two capturing groups at the start and end:
(.+)
The hyphen is placed last in the character class so that it is taken literally; written as [*+-/], the +-/ part would be a range that also matches , and .
We can add more boundaries, if we wish, such as start and end anchors:
^(.+)[*+/-]=(.+)$
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "(.+)[*+/-]=(.+)";
final String string = "k*=i\n"
+ "apple*=orange";
final String subst = "$1=$1+$2";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);
System.out.println("Substitution result: " + result);
You could use 3 capturing groups, capturing the operator (* + / -) in a character class.
([a-zA-Z])([*+/-])=([a-zA-Z])
That will match:
([a-zA-Z]) Capture group 1, match a-z A-Z
([*+/-]) Capture group 2, match * + / -
= Match literally
([a-zA-Z]) Capture group 3, match a-z A-Z
And replace with:
$1=$1$2$3
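A minimal sketch of this replacement with java.util.regex follows. It widens the single-letter groups to multi-character identifiers and alphanumeric operands, which is an assumption beyond the pattern above; the class CompoundAssignmentExpander and the method expand are hypothetical names.

```java
import java.util.regex.Pattern;

class CompoundAssignmentExpander {
    // Group 1: identifier, group 2: operator, group 3: right-hand operand.
    // Multi-character names and operands are an assumption; the hyphen is last in the class.
    private static final Pattern COMPOUND =
        Pattern.compile("([A-Za-z][A-Za-z0-9]*)([*+/-])=([A-Za-z0-9]+)");

    // Rewrites "x op= y" (without spaces) as "x=x op y", keeping the original operator.
    static String expand(String code) {
        return COMPOUND.matcher(code).replaceAll("$1=$1$2$3");
    }

    public static void main(String[] args) {
        System.out.println(expand("k*=i"));
        System.out.println(expand("apple+=orange"));
        System.out.println(expand("count-=1"));
    }
}
```

Because the operator is captured in group 2 and reused in the replacement, k*=i becomes k=k*i and count-=1 becomes count=count-1; a plain assignment such as k=i is left unchanged.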
