I need regex that will fail only for below patterns and pass for everything else.
RXXXXXXXXXX (X are digits)
XXX.XXX.XXX.XXX (IP address)
I have basic knowledge of regex but not sure how to achieve this one.
For the first part, I know how to write a regex that does not start with R, but I am not sure how to make it allow any number of digits except 10.
^[^R][0-9]{10}$ - it will do the !R thing but not sure how to pull off the not 10 digits part.
Well, simply define a regex:
Pattern p = Pattern.compile("R[0-9]{10} ((0|1|)[0-9]{1,2}|2([0-4][0-9]|5[0-5]))(\\.((0|1|)[0-9]{1,2}|2([0-4][0-9]|5[0-5]))){3}");
Matcher m = p.matcher(theStringToMatch);
if(!m.matches()) {
//do something, the test didn't pass thus ok
}
Or a jdoodle.
EDIT:
Since you actually wanted two possible patterns to filter out, change the pattern to:
Pattern p = Pattern.compile("(R[0-9]{10})|(((0|1|)[0-9]{1,2}|2([0-4][0-9]|5[0-5]))(\\.((0|1|)[0-9]{1,2}|2([0-4][0-9]|5[0-5]))){3})");
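If you want to match the entire string (so that the string must start and end with the pattern), place ^ in front of the pattern and $ at the end. Note that Matcher.matches() already requires the whole input to match, so the anchors only make a difference with find().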
This should work:
boolean passes = !string.matches("R\\d{10}|(\\d{3}\\.){3}\\d{3}");
The \d means any digit (written \\d inside a Java string literal), the braces say how many times the preceding element is repeated, and \. means a literal period character. Parentheses indicate a grouping.
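As a rough sketch of the "reject these two shapes, accept everything else" idea, the check can be wrapped in a small helper. The class name RejectFilter and the method passes are made up for illustration, and the sketch assumes an "IP address" means four dot-separated groups of 1 to 3 digits rather than exactly 3 digits; adjust the quantifier if that assumption is wrong.

```java
import java.util.regex.Pattern;

class RejectFilter {
    // Assumption: an IP-like string is four dot-separated groups of 1 to 3 digits.
    // The 0-255 range of each group is NOT checked here.
    private static final Pattern REJECTED =
            Pattern.compile("R\\d{10}|(\\d{1,3}\\.){3}\\d{1,3}");

    // Returns true when the input is NOT one of the two rejected shapes.
    static boolean passes(String input) {
        // matches() requires the whole input to match, so no ^ or $ is needed.
        return !REJECTED.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"R1234567890", "R123", "192.168.0.1", "hello"}) {
            System.out.println(s + " -> " + passes(s));
        }
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which is what String.matches() does internally.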
Here's a good reference on java regex with examples.
http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
Regex is not meant to validate every kind of input. You could, but sometimes it is not the right approach (similar to using a wrench as a hammer: it could do it but is not meant for it).
Split the string in two parts, by the space, then validate each:
String foo = "R1234567890 255.255.255.255";
String[] stringParts = foo.split(" ");
Pattern p = Pattern.compile("^[^R][0-9]{10}$");
Matcher m = p.matcher(stringParts[0]);
if (m.matches()) {
//the first part is valid
//start validating the IP
String[] ipParts = stringParts[1].split("\\.");
for (String ip : ipParts) {
int ipPartValue = Integer.parseInt(ip);
if (!(ipPartValue >= 0 && ipPartValue <= 255)) {
//error...
}
}
}
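The range check on each part of the address can be pulled into a standalone helper, which may be easier to test than a single large regex. This is only a sketch: the class name Ipv4Check and the method isIpv4 are hypothetical, and it assumes that leading zeros (such as 01) are acceptable.

```java
class Ipv4Check {
    // Returns true if s has exactly four dot-separated decimal parts, each in 0..255.
    static boolean isIpv4(String s) {
        // A limit of -1 keeps trailing empty parts, so "1.2.3." is seen as 4 parts.
        String[] parts = s.split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) {
                return false;
            }
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            // Safe to parse: the part is 1 to 3 ASCII digits.
            if (Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIpv4("255.255.255.255"));
        System.out.println(isIpv4("256.1.1.1"));
    }
}
```

Checking the characters before calling Integer.parseInt avoids having to catch NumberFormatException for non-numeric parts.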
As the title says, I have a string and I want to extract some data from it.
This is my String:
text = "|tab_PRO|1|1|#tRecordType#||0|tab_PRO|";
and I want to extract all the data between the pipes: tab_PRO, 1, 1... and so on.
I've tried:
Pattern p = Pattern.compile("\\|(.*?)\\|");
Matcher m = p.matcher(text);
while(m.find())
{
for(int i = 1; i< 10; i++) {
test = m.group(i);
System.out.println(test);
}
}
and with this I get the first group, which is tab_PRO. But I also get an error:
java.lang.IndexOutOfBoundsException: No group 2
Now, I probably didn't understand how groups work, but I thought that with this I could get the remaining data that I need. I can't figure out what I'm missing.
Thanks in advance
Use String.split(). Note that it expects a regex as its argument, and | is a regex metacharacter (alternation), so it must be escaped as \|. Inside a Java string literal the backslash itself must be escaped too, hence "\\|"; a single \| in a string literal would be an invalid escape sequence and would not compile:
String[] parts = text.split("\\|");
See it working here:
https://ideone.com/WibjUm
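One detail worth knowing when splitting this particular string: because it starts and ends with a pipe, split() keeps an empty string at the front of the array but drops trailing empty strings. A small sketch (the class name PipeSplit is made up for illustration):

```java
import java.util.Arrays;

class PipeSplit {
    // Splits on a literal pipe; "\\|" is the regex \| written as a Java string literal.
    static String[] fields(String text) {
        return text.split("\\|");
    }

    public static void main(String[] args) {
        String text = "|tab_PRO|1|1|#tRecordType#||0|tab_PRO|";
        String[] parts = fields(text);
        // parts[0] is the empty string before the first pipe;
        // the empty string after the last pipe is removed by split().
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```

If the leading empty element is unwanted, skipping index 0 or filtering out empty strings afterwards is usually enough; note that filtering would also drop the legitimately empty field between the two adjacent pipes.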
If you want to go with your regex approach, you'll need to group and capture every repetition of characters after every | and restrict them to be anything except |, possibly using a regex like \\|([^\\|]*).
In your loop, you iterate with m.find() and just use capture group 1, because it's the only group every match will have.
String text = "|tab_PRO|1|1|#tRecordType#||0|tab_PRO|";
Pattern p = Pattern.compile("\\|([^\\|]*)");
Matcher m = p.matcher(text);
while(m.find()){
System.out.println(m.group(1));
}
https://ideone.com/RNjZRQ
Try using .split() or .substring()
As mentioned in the comments, this is easier done with String.split.
As for your own code, you are unnecessarily using the inner loop, and that's leading to that exception. You only have one group, but the for loop will cause you to query more than one group. Your loop should be as simple as:
Pattern p = Pattern.compile("(?<=\\|)(.*?)\\|");
Matcher m = p.matcher(text);
while (m.find()) {
String test = m.group(1);
System.out.println(test);
}
And that prints
tab_PRO
1
1
#tRecordType#
0
tab_PRO
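Note that I had to use a look-behind assertion in your regex: each match consumes its closing |, so the next match cannot start by consuming that same pipe and can only check that one precedes it. The empty field between the two adjacent pipes is matched as well and comes out as an empty string.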
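To see why asking for group 2 fails, it can help to print how many capturing groups the pattern actually defines. The following sketch collects the fields into a list; the class name PipeFields is hypothetical, and the pattern is the look-behind version from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PipeFields {
    // One capturing group only; the look-behind (?<=\\|) does not capture anything.
    static final Pattern FIELD = Pattern.compile("(?<=\\|)(.*?)\\|");

    static List<String> fields(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(text);
        // Each successful find() is one match; group(1) is that match's only group.
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "|tab_PRO|1|1|#tRecordType#||0|tab_PRO|";
        System.out.println("groups in pattern: " + FIELD.matcher(text).groupCount());
        System.out.println(fields(text));
    }
}
```

The number of groups is a property of the pattern, not of how many times find() succeeds, which is why group(2) throws no matter how many fields the input contains.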
I have written a snippet, but it doesn't work correctly.
I have an input in this format:
Arg2+res=(s11_19,s11_20,s11_21,s11_22),Arg4-res=()
It can contain multiple Args (e.g. Arg1, Arg2, ...).
What I want is to return the +res instances. For example, in the above example, I need this part:
Arg2+res=(s11_19,s11_20,s11_21,s11_22)
My Regex is like the following:
Pattern p = Pattern.compile("Arg\\d+\\+res=\\(\\S+\\)");
Matcher m = p.matcher(ove_imp_roles);
while (m.find()) {
System.out.println(m.group());
}
The code has two problems:
1) It returns the whole string as a single match. For example, in the above sentence it returns Arg2+res=(s11_19,s11_20,s11_21,s11_22),Arg4-res=() as the matching instance.
Even if both instances include Arg1+res, it returns the whole string as a single match, while I expect it to be returned as two different matches.
2) The code counts instances with -res, too, while I don't need them.
Can anyone help me with this problem?
Update: I checked the code again and updated the above question correspondingly. The problem with -res occurs when it includes empty brackets (for example, Arg1-res=()).
Thanks in advance,
You're calling m.find() inside while(m.find()), make it like this:
Pattern p = Pattern.compile("Arg\\d+\\+res=\\(\\S+\\)");
Matcher m = p.matcher(ove_imp_roles);
while (m.find()) {
System.out.println(m.group());
}
btw your regex is matching 2nd Arg correctly
Based on the edited question and new input OP can use this regex:
Pattern p = Pattern.compile("Arg\\d+\\+res=\\([^)]+\\)");
[^)]+ will match 1 or more characters that are not ).
The problem is \\S+\\). If you have the following input:
String s = "Arg2+res=(s1355_19,s1355_20);Arg3-res=(s1355_19,s1355_20)";
Arg\\d+\\+res=\\( matches Arg2+res=( and then \\S+ will match (because the + is greedy):
s1355_19,s1355_20);Arg3-res=(s1355_19,s1355_20
So you can make it lazy, so that it stops as soon as it finds the first right parenthesis in the input:
Pattern p = Pattern.compile("Arg\\d+\\+res=\\(\\S+?\\)");
Alternatively, you can split the input by ';' and see if each String matches "^Arg\\d+\\+.*$"
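The difference between the greedy, lazy and negated-class variants can be checked side by side with a small helper. This is a sketch only; the class name GreedyVsLazy and the method firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Returns the first substring matched by regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "Arg2+res=(s1355_19,s1355_20);Arg3-res=(s1355_19,s1355_20)";
        // Greedy \S+ runs to the end of the input and backs off to the LAST ')'.
        System.out.println(firstMatch("Arg\\d+\\+res=\\(\\S+\\)", s));
        // Lazy \S+? stops at the FIRST ')'.
        System.out.println(firstMatch("Arg\\d+\\+res=\\(\\S+?\\)", s));
        // [^)]+ cannot cross a ')' at all, so it also stops at the first one.
        System.out.println(firstMatch("Arg\\d+\\+res=\\([^)]+\\)", s));
    }
}
```

The negated class is often the more robust choice because it states the intent directly (anything up to the closing parenthesis) instead of relying on backtracking behaviour.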
I have a query about Java regular expressions. Actually, I am new to regular expressions.
So I need help to form a regex for the statement below:
Statement: a-alphanumeric&b-digits&c-digits
Possible matching Examples: 1) a-90485jlkerj&b-34534534&c-643546
2) A-RT7456ffgt&B-86763454&C-684241
Use case: First of all I have to validate input string against the regular expression. If the input string matches then I have to extract a value, b value and c value like
90485jlkerj, 34534534 and 643546 respectively.
Could someone please share how I can achieve this in the best possible way?
I really appreciate your help on this.
you can use this pattern :
^(?i)a-([0-9a-z]++)&b-([0-9]++)&c-([0-9]++)$
In the case what you try to match is not the whole string, just remove the anchors:
(?i)a-([0-9a-z]++)&b-([0-9]++)&c-([0-9]++)
explanations:
(?i) make the pattern case-insensitive
[0-9]++ digit one or more times (possessive)
[0-9a-z]++ the same with letters
^ anchor for the string start
$ anchor for the string end
Parentheses in the two patterns are capture groups (to catch what you want)
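Putting the anchored pattern to work, validation and extraction can happen in one step: matches() validates the whole input, and the three capture groups carry the values. A minimal sketch, with the class name AbcParser and the method parse invented for illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AbcParser {
    private static final Pattern FORMAT =
            Pattern.compile("^(?i)a-([0-9a-z]++)&b-([0-9]++)&c-([0-9]++)$");

    // Returns the a, b and c values in that order, or empty if the input is invalid.
    static Optional<String[]> parse(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {m.group(1), m.group(2), m.group(3)});
    }

    public static void main(String[] args) {
        parse("a-90485jlkerj&b-34534534&c-643546")
                .ifPresent(v -> System.out.println(String.join(", ", v)));
    }
}
```

Returning an Optional makes the "input did not validate" case explicit for the caller instead of relying on an exception from group().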
Given a string with the format a-XXX&b-XXX&c-XXX, you can extract all XXX parts in one simple line:
String[] parts = str.replaceAll("[abc]-", "").split("&");
parts will be an array with 3 elements, being the target strings you want.
The simplest regex that matches your string is:
^(?i)a-([\\da-z]+)&b-(\\d+)&c-(\\d+)
With your target strings in groups 1, 2 and 3, but you need a lot of code around that to get the strings, which as shown above is not necessary.
Following code will help you:
String[] texts = new String[]{"a-90485jlkerj&b-34534534&c-643546", "A-RT7456ffgt&B-86763454&C-684241"};
Pattern full = Pattern.compile("^(?i)a-([\\da-z]+)&b-(\\d+)&c-(\\d+)");
Pattern patternA = Pattern.compile("(?i)([\\da-z]+)&[bc]");
Pattern patternB = Pattern.compile("(\\d+)");
for (String text : texts) {
if (full.matcher(text).matches()) {
for (String part : text.split("-")) {
Matcher m = patternA.matcher(part);
if (m.matches()) {
System.out.println(part.substring(m.start(), m.end()).split("&")[0]);
}
m = patternB.matcher(part);
if (m.matches()) {
System.out.println(part.substring(m.start(), m.end()));
}
}
}
}
Can anyone please help me do the following in a java regular expression?
I need to read 3 characters from the 5th position from a given String ignoring whatever is found before and after.
Example : testXXXtest
Expected result : XXX
You don't need regex at all.
Just use substring: yourString.substring(4,7)
Since you do need to use regex, you can do it like this:
Pattern pattern = Pattern.compile(".{4}(.{3}).*");
Matcher matcher = pattern.matcher("testXXXtest");
matcher.matches();
String whatYouNeed = matcher.group(1);
What does it mean, step by step:
.{4} - any four characters
( - start capturing group, i.e. what you need
.{3} - any three characters
) - end capturing group, you got it now
.* followed by 0 or more arbitrary characters.
matcher.group(1) - get the 1st (only) capturing group.
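One caveat with the steps above: group(1) throws an IllegalStateException if the match did not succeed, so the result of matches() is worth checking. A hedged sketch (the class name FifthToSeventh and the method extract are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FifthToSeventh {
    private static final Pattern P = Pattern.compile(".{4}(.{3}).*");

    // Returns characters 5 to 7 of s, or null if s has fewer than 7 characters.
    // Note: '.' does not match line terminators by default, so input containing
    // a newline may also yield null unless Pattern.DOTALL is used.
    static String extract(String s) {
        Matcher m = P.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("testXXXtest"));
        System.out.println(extract("short"));
    }
}
```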
You should be able to use the substring() method to accomplish this:
String example = "testXXXtest";
String result = example.substring(4,7);
This might help: Groups and capturing in java.util.regex.Pattern.
Here is an example:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Example {
public static void main(String[] args) {
String text = "This is a testWithSomeDataInBetweentest.";
Pattern p = Pattern.compile("test([A-Za-z0-9]*)test");
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println("Matched: " + m.group(1));
} else {
System.out.println("No match.");
}
}
}
This prints:
Matched: WithSomeDataInBetween
If you want to match the pattern against the entire input string (rather than seek a substring that matches), you can use matches() instead of find(). You can continue searching for further matching substrings with subsequent calls to find().
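The difference between matches() and find() can be seen with the same pattern on two inputs. This is an illustrative sketch; the class name MatchesVsFind and its two helper methods are made up.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern P = Pattern.compile("test([A-Za-z0-9]*)test");

    // True only if the pattern covers the whole input.
    static boolean wholeInput(String s) {
        return P.matcher(s).matches();
    }

    // True if the pattern occurs anywhere in the input.
    static boolean somewhere(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String text = "This is a testWithSomeDataInBetweentest.";
        System.out.println("find:    " + somewhere(text));
        System.out.println("matches: " + wholeInput(text));
        System.out.println("matches: " + wholeInput("testXXXtest"));
    }
}
```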
Also, your question did not specify what are admissible characters and length of the string between two "test" strings. I assumed any length is OK including zero and that we seek a substring composed of small and capital letters as well as digits.
You can use substring for this, you don't need a regex.
yourString.substring(4,7);
I'm sure you could use a regex too, but why if you don't need it. Of course you should protect this code against null and strings that are too short.
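A guarded version of the substring call might look like the sketch below. The class name SafeSlice and the choice to return null for unusable input are assumptions made for illustration; throwing an exception or returning an empty string would be equally reasonable.

```java
class SafeSlice {
    // Returns the 5th to 7th characters (indices 4..6),
    // or null when s is null or shorter than 7 characters.
    static String slice(String s) {
        if (s == null || s.length() < 7) {
            return null;
        }
        // substring's end index is exclusive, so (4, 7) yields exactly 3 characters.
        return s.substring(4, 7);
    }

    public static void main(String[] args) {
        System.out.println(slice("testXXXtest"));
        System.out.println(slice("tiny"));
    }
}
```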
Use the String.replaceAll() method
If performance is not critical, you can use the String.replaceAll() instance method for a more compact option:
String sDataLine = "testXXXtest";
String sWhatYouNeed = sDataLine.replaceAll( ".{4}(.{3}).*", "$1" );
References
https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html
http://www.vogella.com/tutorials/JavaRegularExpressions/article.html#using-regular-expressions-with-string-methods
I'm trying to find all the occurrences of "Arrows" in text, so in
"<----=====><==->>"
the arrows are:
"<----", "=====>", "<==", "->", ">"
This works:
String[] patterns = {"<=*", "<-*", "=*>", "-*>"};
for (String p : patterns) {
Matcher A = Pattern.compile(p).matcher(s);
while (A.find()) {
System.out.println(A.group());
}
}
but this doesn't:
String p = "<=*|<-*|=*>|-*>";
Matcher A = Pattern.compile(p).matcher(s);
while (A.find()) {
System.out.println(A.group());
}
No idea why. It often reports "<" instead of "<====" or similar.
What is wrong?
Solution
The following program is one possible solution to the question:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class A {
public static void main( String args[] ) {
String p = "<=+|<-+|=+>|-+>|<|>";
Matcher m = Pattern.compile(p).matcher(args[0]);
while (m.find()) {
System.out.println(m.group());
}
}
}
Run #1:
$ java A "<----=====><<---<==->>==>"
<----
=====>
<
<---
<==
->
>
==>
Run #2:
$ java A "<----=====><=><---<==->>==>"
<----
=====>
<=
>
<---
<==
->
>
==>
Explanation
An asterisk will match zero or more of the preceding characters. A plus (+) will match one or more of the preceding characters. Thus <-* matches < whereas <-+ matches <- and any extended version (such as <--------).
When you match "<=*|<-*|=*>|-*>" against the string "<---", the first alternative, "<=*", already succeeds by matching just "<", because * allows zero repetitions. Quantifiers in Java are greedy, but alternation is ordered: the engine takes the first alternative that matches at the current position and does not look for a longer match among the later alternatives.
Your first solution will match everything that you are looking for because you send each pattern into matcher one at a time and they are then given the opportunity to work on the target string individually.
Your second attempt will not work in the same manner because you are putting in a single pattern with multiple expressions OR'ed together, and the alternatives are tried from left to right, so the leftmost one is attempted first. If it matches, no matter how minimally, find() will report that match and continue on from there.
See Thangalin's response for a solution that will make the second work like the first.
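The effect of ordered alternation can be reproduced by running both combined patterns over the question's sample input and collecting the matches. A small sketch, with the class name ArrowFinder and the method findAll made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArrowFinder {
    // Collects every non-overlapping match of regex in input, left to right.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "<----=====><==->>";
        // With *, the first alternative "<=*" wins at position 0 by matching a bare "<".
        System.out.println(findAll("<=*|<-*|=*>|-*>", s));
        // With +, "<=+" fails on "<-", so "<-+" gets its turn and takes the whole arrow.
        System.out.println(findAll("<=+|<-+|=+>|-+>|<|>", s));
    }
}
```

Placing the bare "<" and ">" alternatives last ensures they are only used when no longer arrow starts at that position.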
For <======= you need <=+ as the regex. <=* matches zero or more ='s, which means it also matches the zero case, i.e. a bare <. The same goes for the other cases you have. You should read up a bit on regexes. This book is FANTASTIC:
Mastering Regular Expressions
Your provided regex pattern string nearly works for your example "<----=====><==->>": it finds "=====>", "<==", "->" and ">", but reports "<" instead of "<----".
String p = "<=*|<-*|=*>|-*>";
Matcher A = Pattern.compile(p).matcher(s);
while (A.find()) {
System.out.println(A.group());
}
However, it breaks on inputs such as those pointed out in the answers: the input string "<-" yields "<", whereas "<=" yields "<=" as it should, because the <=* alternative is tried first and succeeds on a bare "<" before <-* gets a chance.