Java/Regex: Negate the same character repeated twice:

If I have the following string:
--l=Richmond-Hill, NYC --m=5-day --d=hourly
I want to match the groups:
--l=Richmond-Hill, NYC
--m=5-day
--d=hourly
I came up with the following regex:
(^--[a-zA-Z]=[^-]*)
This works only when the value after the equals sign contains no dash. Basically, how do I negate a double dash (--) rather than a single one?

One option is a lazy match that stops right before the next -- or at the end of the line:
--[a-zA-Z]=.*?(?=--|$)
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class re{
    public static void main(String[] args){
        final String regex = "--[a-zA-Z]=.*?(?=--|$)";
        final String string = "--l=Richmond-Hill, NYC --m=5-day --d=hourly\n"
            + "--l=Richmond-Hill, NYC\n"
            + "--m=5-day\n"
            + "--d=hourly";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
Output
Full match: --l=Richmond-Hill, NYC
Full match: --m=5-day
Full match: --d=hourly
Full match: --l=Richmond-Hill, NYC
Full match: --m=5-day
Full match: --d=hourly
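Note that on the single-line input the first two matches keep the space that precedes the next --. A minimal sketch of one way to drop it, assuming trailing whitespace is never part of a value, is to let the lookahead also accept optional whitespace. The class name OptionSplitter and the method parse are hypothetical names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionSplitter {
    // Same idea as the expression above, but the lookahead also accepts
    // whitespace before the next "--" (or the end of the input), so the
    // lazy part stops before any trailing spaces.
    private static final Pattern OPTION =
            Pattern.compile("--[a-zA-Z]=.*?(?=\\s*(?:--|$))");

    // Collects every "--x=value" option found in the input.
    static List<String> parse(String input) {
        List<String> options = new ArrayList<>();
        Matcher matcher = OPTION.matcher(input);
        while (matcher.find()) {
            options.add(matcher.group());
        }
        return options;
    }

    public static void main(String[] args) {
        // Brackets make any leftover whitespace visible.
        for (String option : parse("--l=Richmond-Hill, NYC --m=5-day --d=hourly")) {
            System.out.println("[" + option + "]");
        }
    }
}
```

A single dash inside a value (Richmond-Hill, 5-day) does not end the match, because the lookahead only succeeds in front of two consecutive dashes or at the end of the input.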
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also watch how it matches against sample inputs.

Alternatively, you can split the string in front of each -- with a lookahead, which keeps the delimiters, and then trim the spaces:
Pattern.compile("(?=--)")
.splitAsStream("--l=Richmond-Hill, NYC --m=5-day --d=hourly")
.map(String::trim)
.forEach(System.out::println);

Related

Looking for A Regular expression to match java regex (punct) pattern

I am looking for a regex that will match the studentIdMatch2 value in the class below. studentIdMatch1 matches fine. However, the studentId in studentIdMatch2 can contain any special character other than :, ^, and comma, so the pattern does not work for it. Thank you for your time; I appreciate your support.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TestRegEx {
    public static void main(String args[]){
        String studentIdMatch1 = "studentName:harry,^studentId:Id123";
        String studentIdMatch2 = "studentName:harry,^studentId:Id-H/MPU/L&T/OA+_T/(1490)/17#)123";
        Pattern pattern = Pattern
            .compile("(\\p{Punct}?)(\\w+?)(:)(\\p{Punct}?)(\\w+?)(\\p{Punct}?),");
        Matcher matcher = pattern.matcher(studentIdMatch1 + ","); // Works fine (matches student name and id)
        // No special characters in studentId
        //Matcher matcher = pattern.matcher(studentIdMatch2 + ","); // Won't work: special characters in studentId. Matches only the student name
        while (matcher.find()) {
            System.out.println("group1 = "+matcher.group(1)+ "group2 = "+matcher.group(2) +"group3 = "+matcher.group(3) +"group4 = "+matcher.group(4)+"group5 = "+matcher.group(5));
        }
        System.out.println("match ended");
    }
}

You may try:
^SutdentName:(\w+),\^StudenId:([^\s,^:]+)$
Explanation of the above regex:
^, $ - Match the start and end of a line, respectively.
SutdentName: - Matches SutdentName: literally. I think it should be StudentName, but I didn't change it.
(\w+) - First capturing group; matches only word characters, i.e. [A-Za-z0-9_], one or more times, greedily.
,\^StudenId: - Matches ,^StudenId: literally. Here too, I guess it should be StudentId.
([^\s,^:]+) - Second capturing group; matches anything other than whitespace, a comma, ^, and :, one or more times, greedily. You can exclude further characters according to your requirements.
Sample Implementation in java:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Main
{
    private static final Pattern pattern = Pattern.compile("^SutdentName:(\\w+),\\^StudenId:([^\\s,^:]+)$", Pattern.MULTILINE);
    public static void main(String[] args) {
        String string = "SutdentName:harry,^StudenId:Id123\n"
            + "SutdentName:harry,^StudenId:Id-H/MNK/U&T/BA+_T/(1490)/17#)123";
        Matcher matcher = pattern.matcher(string);
        while(matcher.find()){
            System.out.println(matcher.group(1) + " " + matcher.group(2));
        }
    }
}

The second (\\w+?) captures only word characters, so change it to capture what you want, i.e.
allow all the special characters other than : and ^ and comma
with something like ([^:^,]+?)
[^ - The leading ^ inside the brackets negates the character class
:^, - The excluded characters: :, ^, and comma
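As a rough sketch of that suggestion applied to the pattern from the question, only the fifth group changes. The class name StudentIdMatch and the helper lastValue are hypothetical; the input string is the one from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StudentIdMatch {
    // The question's pattern with the second (\w+?) replaced by ([^:^,]+?),
    // i.e. "one or more characters except ':', '^' and ','".
    private static final Pattern PAIR = Pattern.compile(
            "(\\p{Punct}?)(\\w+?)(:)(\\p{Punct}?)([^:^,]+?)(\\p{Punct}?),");

    // Returns group 5 (the value) of the last key:value pair found,
    // or null if nothing matched.
    static String lastValue(String input) {
        // The question appends a trailing comma before matching; so do we.
        Matcher matcher = PAIR.matcher(input + ",");
        String value = null;
        while (matcher.find()) {
            value = matcher.group(5);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(lastValue(
                "studentName:harry,^studentId:Id-H/MPU/L&T/OA+_T/(1490)/17#)123"));
    }
}
```

Because the id contains no comma, the lazy group has to keep growing until it reaches the comma appended at the end, so it captures the whole id.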

How do I create a regex for this text?

I need to create a regex that checks if the text follows this format:
The first two letters will always be 'AB', then a number
between 1-9, then either A or B, then a dash ('-'), then a bunch of
random text followed by a colon (':'), and then an index position that is
a letter and a 2-digit number.
So like this:
AB8B-ANYLETTERS:H12
or
AB3B-ANYTHINGCANGOHERE:A77
I have done this to check the index position but cannot figure out the text before the colon.
"^.*:[A-H]\\d\\d"
So the general format is:
AB[1-9][A or B]-[ANYCHARACTERS]:[A-Z][01-99]
I am using Java.

This expression should validate that format:
^AB[1-9][AB]-[^:]+:[A-Z][0-9]{2}$
The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it; there you can also watch how it matches against sample inputs.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FormatTest {
    public static void main(String[] args) {
        final String regex = "^AB[1-9][AB]-[^:]+:[A-Z][0-9]{2}$";
        final String string = "AB8B-ANYLETTERS:H12\n"
            + "AB3B-ANYTHINGCANGOHERE:A77";
        // MULTILINE lets ^ and $ match at every line of the sample input
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
Edit
If the fourth character should be A or C instead, we would try:
^AB[1-9][AC]-[^:]+:[A-Z][0-9]{2}$
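Since the goal is to check whether a whole string follows the format, a minimal sketch could use Matcher.matches() rather than find(). It uses the [AB] variant from the first expression; the class name IndexFormat and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

class IndexFormat {
    // matches() requires the entire input to match, so the ^ and $
    // anchors of the original expression are not needed here.
    private static final Pattern FORMAT =
            Pattern.compile("AB[1-9][AB]-[^:]+:[A-Z][0-9]{2}");

    static boolean isValid(String text) {
        return FORMAT.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("AB8B-ANYLETTERS:H12"));        // valid format
        System.out.println(isValid("AB3B-ANYTHINGCANGOHERE:A77")); // valid format
        System.out.println(isValid("AB0B-ANYLETTERS:H12"));        // 0 is outside 1-9
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call.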

How can i split address in streetname and streetnumber in nifi?

I'm trying to split a given address (Muster Straße 114 a) into street name and street number. I'm working with NiFi. The situation is the following: I have a FlowFile attribute (order_address) which contains, e.g., Muster Straße 114 a, and I need to split it into separate attributes.
I tried
/\A\s*(?:?:\s*)?(\pN+[a-zA-Z]?(?:\s*[-\/\pP]\s*\pN+[a-zA-Z]?)*)\s*,?\s*(?P(?:[a-zA-Z]\s*|\pN\pL{2,}\s\pL)\S[^,#]*?(?<!\s))s*(?:(?:[,\/]|(?=\#))\s*(?!\s*\.(?P(?!\s).*?))? | ?:(?P.*?),\s*(?=.*[,\/]))??!\s*\.)(?P[^0-9#]\s*\S(?:[^,#](?!\b\pN+\s))*?(?<!\s))\s*[\/,]?\s*(?:\sNo[.:])?\s*(?P\pN+\s*-?[a-zA-Z]?(?:\s*[-\/\pP]?\s*\pN+(?:\s*[\-a-zA-Z])?)*|[IVXLCDM]+(?!.*\b\pN+\b))(?<!\s)\s*(?:(?:[,\/]|(?=\#)|\s)\s*(?!\s*No\.)\s*(?P(?!\s).*?))?)\s*\Z/xu
but it's not working for me

If we'd like to just separate our addresses into two parts, one including the digits and one without, we could find several expressions that'd cover this rule, such as:
(.*?)([\d].*)
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class AddressTest {
    public static void main(String[] args) {
        final String regex = "(.*?)([\\d].*)";
        final String string = "Muster Straße 114 a";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            // Group 1 is the part before the first digit, group 2 the rest
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}

In NiFi, you can use the NiFi Expression Language to manipulate FlowFile attributes. So I used the UpdateAttribute processor to create the new FlowFile attributes street_name and street_number.
I used the replaceAll method with a simple regex for each, to get the street name and the street number.
^(\D*)(?:.*)
^\D*(.*)
These two regexes did it.
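NiFi's Expression Language replaceAll takes a Java regular expression, so the two expressions can be tried out in plain Java first. The sketch below assumes the replacement was $1 in both cases (the answer does not show it) and adds a trim() that is not part of the original; the class and method names are hypothetical.

```java
class AddressSplit {
    // Keeps everything before the first digit (group 1) and drops the rest.
    static String streetName(String address) {
        return address.replaceAll("^(\\D*)(?:.*)", "$1").trim();
    }

    // Skips the leading non-digits and keeps everything from the first digit on.
    static String streetNumber(String address) {
        return address.replaceAll("^\\D*(.*)", "$1").trim();
    }

    public static void main(String[] args) {
        // \u00df is the letter ß, escaped to stay independent of the source encoding.
        String address = "Muster Stra\u00dfe 114 a";
        System.out.println("street_name   = " + streetName(address));
        System.out.println("street_number = " + streetNumber(address));
    }
}
```

Both expressions split at the first digit, so a street name that itself contains a digit would be cut early.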

Regex to match /src/main/{any_package}/*.java

I need a regex for a file set that includes only files from /src/main/java/{any_package}/*.java, so that CheckStyle rules are applied only to these files in Eclipse.
All other files, e.g. non-*.java files or anything under src/test/, should be ignored.

This expression might also work here, even though your original expression is OK.
^\/src\/main\/java(\/[^\/]+)?\/\*\.java$
My guess is that here we wish to pass:
/src/main/java/{any_package}/*.java
/src/main/java/*.java
If the second one is undesired, then we would simply remove the optional group:
^\/src\/main\/java(\/[^\/]+)\/\*\.java$
and it might still work.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PathTest {
    public static void main(String[] args) {
        final String regex = "^\\/src\\/main\\/java(\\/[^\\/]+)\\/\\*\\.java$";
        // Only the first line matches; the second has no package segment
        final String string = "/src/main/java/{any_package}/*.java\n"
            + "/src/main/java/*.java";
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}

The regex I'm looking for is src[\/]main\/java\/?(?:[^\/]+\/?)*.java$
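To see how such an expression behaves on concrete file paths, here is a small sketch in plain Java. It uses the expression above with one assumed tightening that is not in the original: the dot before java is escaped so that it only matches a literal dot. The class name MainSourceFilter, the method isMainSource, and the sample paths are hypothetical.

```java
import java.util.regex.Pattern;

class MainSourceFilter {
    // The expression from above; the only change is the escaped dot (\.)
    // before "java", which is an assumption and not part of the original.
    private static final Pattern MAIN_JAVA =
            Pattern.compile("src[\\/]main\\/java\\/?(?:[^\\/]+\\/?)*\\.java$");

    // find() is used because the expression is anchored only at the end.
    static boolean isMainSource(String path) {
        return MAIN_JAVA.matcher(path).find();
    }

    public static void main(String[] args) {
        String[] paths = {
            "/src/main/java/com/example/Foo.java",
            "/src/test/java/com/example/FooTest.java",
            "/src/main/java/pkg/notes.txt",
            "/src/main/resources/app.properties"
        };
        for (String path : paths) {
            System.out.println(path + " -> " + isMainSource(path));
        }
    }
}
```

The nested quantifier in (?:[^\/]+\/?)* can cause a lot of backtracking on long paths that do not end in .java, which may be worth keeping in mind if the expression is applied to many files.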

Java Regex to remove all full stops apart from real numbers within a text

I am trying to write a simple regex to remove all . apart from the ones which occur in real numbers.
E.g.
The value was 0.19 psi. The water level has to be brought to normal. Mtl.temp is going to be high..
The below regex selects all real numbers.
((\+|-)?([0-9]+)(\.[0-9]+)?)|((\+|-)?\.?[0-9]+)
I could go the other way and select each . that is preceded by a word and followed by a space. However, the input text is not written in a grammatically proper manner.

You can use this regex, which matches every . that is not immediately followed by a digit:
\.(?!\d)
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FullStopTest {
    public static void main(String[] args) {
        final String regex = "\\.(?!\\d)";
        final String string = ".12 . 0.123 Hi.there I am .invalid.";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        // Each match is a single '.' that is not followed by a digit
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
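The snippet above only lists the matches. To actually remove them, a minimal sketch could pass the same expression to String.replaceAll with an empty replacement. The class name FullStopRemover and the method removeFullStops are hypothetical; the sample sentence is the one from the question.

```java
class FullStopRemover {
    // Deletes every '.' that is not immediately followed by a digit,
    // so the dots in numbers such as 0.19 or .12 are kept.
    static String removeFullStops(String text) {
        return text.replaceAll("\\.(?!\\d)", "");
    }

    public static void main(String[] args) {
        String input = "The value was 0.19 psi. The water level has to be "
                + "brought to normal. Mtl.temp is going to be high..";
        System.out.println(removeFullStops(input));
    }
}
```

Note that a dot between two words, as in Mtl.temp, is removed as well, and so is a dot that directly follows a number at the end of a sentence.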
