Which regex to use?


I have expressions like:
-3-5
-3--5
3-5
3-+5
-3-+5
I need to extract the numbers by splitting on the "-" sign between them, i.e. for the cases above I need:
-3 and 5, -3 and -5, 3 and 5, 3 and +5, -3 and +5.
I have tried this:
String s[] = str.split("[+-]?\\d+\\-[+-]?\\d+");
int len = s.length;
for (int i = 0; i < len; i++) System.out.println(s[i]);
but it does not work.

Try splitting with this regular expression:
str.split("\\b-")
The word boundary \b matches only directly before or after a word character (here, a digit), so \b- matches only a - that immediately follows a digit. That is the range separator and never a leading sign:
-3-5, -3--5, 3-5, 3-+5, -3-+5
  ^     ^     ^    ^      ^
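As a minimal sketch of the word-boundary split, assuming every input is a single range like the ones in the question, the following runs it over all five examples. The class name RangeSplitDemo and the method splitRange are made up for illustration.

```java
import java.util.Arrays;

class RangeSplitDemo {
    // Splits on a "-" that directly follows a word character (here: a digit).
    // A leading sign has no word character in front of it, so it is left alone.
    static String[] splitRange(String str) {
        return str.split("\\b-");
    }

    public static void main(String[] args) {
        String[] inputs = {"-3-5", "-3--5", "3-5", "3-+5", "-3-+5"};
        for (String input : inputs) {
            System.out.println(input + " -> " + Arrays.toString(splitRange(input)));
        }
    }
}
```

Integer.parseInt accepts a leading + or - sign (since Java 7), so the two pieces can be parsed directly if numeric values are needed.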

Crossposted to forums.sun.com.
This is not a job for REs by themselves. You need a scanner to return operators and numbers, and an expression parser. Consider -3-------5.

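Your expression is fine, but split is the wrong tool. split treats the pattern as the delimiter, so a pattern that matches the entire input leaves nothing to return. What you want is to match the string against the expression and read the two numbers from capturing groups.
Here is some code that uses your expression, with a group around each number, to obtain what you want:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String a = "-90--80";
Pattern x = Pattern.compile("([+-]?\\d+)\\-([+-]?\\d+)");
Matcher m = x.matcher(a);
if (m.find()) {
    System.out.println(m.group(1)); // -90
    System.out.println(m.group(2)); // -80
}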

Related

What regex should I use to check a string only has numbers and 2 special characters ( - and , ) in Java?

Scenario: I want to check whether a string contains only numbers and two predefined special characters, a dash and a comma.
My string contains digits (0 to 9) and two special characters: a dash (-) defines a range and a comma (,) defines a sequence.
Attempt:
I tried the regex [0-9+-,]+, but it does not work as expected.
Possible inputs:
1-5
1,5
1-5,6
1,3,5-10
1-5,6-10
1,3,5-7,8,10
The regex should not accept these types of strings:
-----
1--4
,1,5
5,6,
5,4,-
5,6-
-5,6
Can anyone help me create a regex for the above scenario?

You may use
^\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*$
Regex details:
^ - start of string
\d+ - 1 or more digits
(?:-\d+)? - an optional sequence of - and 1+ digits
(?:,\d+(?:-\d+)?)* - zero or more sequences of:
, - a comma
\d+(?:-\d+)? - same pattern as described above
$ - end of string.
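The pattern can be checked against the sample inputs from the question with a small sketch like the following. The class name RangeListValidator and the method isValid are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class RangeListValidator {
    // A number or a number-number range, repeated with commas in between.
    private static final Pattern RANGE_LIST =
            Pattern.compile("^\\d+(?:-\\d+)?(?:,\\d+(?:-\\d+)?)*$");

    static boolean isValid(String s) {
        return RANGE_LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1-5", "1,5", "1-5,6", "1,3,5-10", "1-5,6-10", "1,3,5-7,8,10",
            "-----", "1--4", ",1,5", "5,6,", "5,4,-", "5,6-", "-5,6"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because matches() already requires the whole input to match, the ^ and $ anchors are redundant here; they are kept so the pattern also behaves the same with find().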

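Your character class [0-9+-,] does not contain a dash: inside brackets, +-, is a range from + to ,. Change your regex [0-9+-,]+ to [0-9,-]+, with the dash last so that it is taken literally. Note that this only restricts which characters may appear. It still accepts strings such as ----- or 1--4, so it is enough only if the order of the characters does not matter.
final String patternStr = "[0-9,-]+";
final Pattern p = Pattern.compile(patternStr);
String data = "1,3,5-7,8,10";
final Matcher m = p.matcher(data);
if (m.matches()) {
    System.out.println("SUCCESS");
} else {
    System.out.println("ERROR");
}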

How to create a regex to delete all `"0"` at the beginning of a string?

How do I delete all "0" characters at the beginning of a string?
00011 -> 11
00123 -> 123
000101 -> 101
101 -> 101
000002500 -> 2500
I tried:
Pattern pattern = Pattern.compile("([1-9]{1}[0-9]?+)");
Matcher matcher = pattern.matcher("00049");
matcher.matches();
whatYouNeed = matcher.group();
I get the error: No match found

I'd try
System.out.println("Status: " + "00012010003".replaceAll("^0+", ""));
or, applied to your own string:
yourString.replaceAll("^0+", "");
Where
^ - matches only at start of string
0 - matches literal zeroes
+ - matches consecutive zeroes (at least one)
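A small sketch of this replacement applied to the inputs from the question. The class and method names are hypothetical. The second method is a variant that assumes an all-zero input such as 000 should become 0 rather than an empty string, which the question does not specify.

```java
class LeadingZeros {
    // Removes every zero at the start of the string; "000" becomes "".
    static String strip(String s) {
        return s.replaceFirst("^0+", "");
    }

    // Variant (an assumption about the desired behaviour): the lookahead (?!$)
    // stops the match before the last character, so "000" becomes "0".
    static String stripKeepingLast(String s) {
        return s.replaceFirst("^0+(?!$)", "");
    }

    public static void main(String[] args) {
        String[] inputs = {"00011", "00123", "000101", "101", "000002500"};
        for (String input : inputs) {
            System.out.println(input + " -> " + strip(input));
        }
    }
}
```

Since the pattern is anchored with ^, it can match at most once, so replaceFirst and replaceAll give the same result here.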

If your string contains only digits, as stated in your question, you can use String.valueOf(Integer.parseInt("00011")). Note that Integer.parseInt throws a NumberFormatException when the value does not fit in an int.

Use replaceAll with ^0* and an empty replacement string rather than trying to find a match.

You can also use the regex (?<=^)0+ with replaceFirst() for this.
Convert your value to a String first if it is in another form.
String val = "000011100";
String newVal = val.replaceFirst("(?<=^)0+", "");
System.out.println(newVal);
Output:
11100
Where
(?<=^) is a lookbehind. The pattern matches only zeroes that have ^, i.e. the start of the string, directly behind them, so it behaves like ^0+.

Java Regex to match repeated keywords

I need to filter a document if the caption has the same surname on both sides (e.g. Smith Vs Smith or John Vs John).
I am converting the entire document into a string and validating that string against a regular expression.
Could anyone help me write a regular expression for this case?

Backreferences.
Example: (\w+) Vs \1
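A minimal sketch of the backreference in Java, assuming the two names are single words separated by exactly " Vs ". The \b anchors are an addition for this example so that Smith Vs Smithson does not count as a repeat; the class and method names are made up.

```java
import java.util.regex.Pattern;

class SameNameCaption {
    // Group 1 captures the first name; \\1 requires the same text after " Vs ".
    // The \\b anchors keep both sides on whole words.
    private static final Pattern SAME_NAME = Pattern.compile("\\b(\\w+) Vs \\1\\b");

    static boolean hasSameNames(String text) {
        // find() searches anywhere in the text, so the whole document can be passed in.
        return SAME_NAME.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSameNames("Smith Vs Smith"));
        System.out.println(hasSameNames("Smith Vs Jones"));
        System.out.println(hasSameNames("Smith Vs Smithson"));
    }
}
```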

If I have understood your question correctly, you have a string like "X Vs Y" (where X and Y are two names) and you want to know whether X equals Y.
In that case, a simple (\w+) regex is enough to pull out the words, which you can then compare in Java:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String input = "Smith Vs Smith";
// Build the regex
Pattern p = Pattern.compile("(\\w+)");
Matcher m = p.matcher(input);
// Store every word except "Vs" in a list
List<String> str = new ArrayList<String>();
while (m.find()) {
    if (!m.group().equals("Vs")) {
        str.add(m.group());
    }
}
// Compare the first two words
if (str.size() > 1 && str.get(0).equals(str.get(1)))
    System.out.println(" The Same ");
else
    System.out.println(" Not the Same ");

(\w+).*\1
This means: a word of one or more characters, captured as group 1, followed by anything, followed by whatever group 1 matched.
In more detail: parentheses group part of the regex, and \1 refers back to the text captured by the first group.
Example:
String s = "Stewie is a good guy. Stewie does no bad things";
Matcher m = Pattern.compile("(\\w+).*\\1").matcher(s);
m.find(); // true, and m.group(1) is the duplicated word (note the extra Java escaping of the backslashes)

How to match just 1 or 2 chars with regex

I want a regex to match any word of 1 or 2 characters, for example: is, an, or, if, a.
I tried this:
int scount = 0;
String txt = "hello everyone this is just test aa ";
Pattern p2 = Pattern.compile("\\w{1,2}");
Matcher m2 = p2.matcher(txt);
while (m2.find()) {
scount++;
}
but got wrong matches.

You probably want to use word boundary anchors:
Pattern p2 = Pattern.compile("\\b\\w{1,2}\\b");
These anchors match at the start/end of alphanumeric "words", that is, in positions before a \w character if there is no \w character before that, or after a \w character if there is no \w character after that.
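As a quick check, here is a sketch that counts the matches with the anchored pattern, using the text from the question. The class name ShortWordCounter and the method count are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShortWordCounter {
    // One or two word characters with a word boundary on each side.
    private static final Pattern SHORT_WORD = Pattern.compile("\\b\\w{1,2}\\b");

    static int count(String text) {
        Matcher m = SHORT_WORD.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Only "is" and "aa" are short enough, so this prints 2.
        System.out.println(count("hello everyone this is just test aa "));
    }
}
```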

You should be a bit more descriptive. Your current code leaves scount at 15, so it does match something.
If you want to count only words of 1 or 2 letters, excluding digits and underscores, you would be better off with negative lookarounds:
Pattern.compile("(?i)(?<![a-z])[a-z]{1,2}(?![a-z])");
With the input hello everyone this is just 1 test aa, scount is 2 (is and aa) and not 3 (is, 1, aa), which is what you would get when looking for 1 or 2 consecutive \w between word boundaries.
Also, with hello everyone this is just test aa_, you get a count of 1 with \w (is), but 2 (is, aa) with the lookarounds.
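The two counts quoted for the lookaround pattern can be reproduced with a sketch along these lines. The class name ShortLetterWordCounter and the method count are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShortLetterWordCounter {
    // One or two letters that are neither preceded nor followed by another letter.
    // Digits and underscores therefore act as separators rather than as word characters.
    private static final Pattern SHORT_LETTERS =
            Pattern.compile("(?i)(?<![a-z])[a-z]{1,2}(?![a-z])");

    static int count(String text) {
        Matcher m = SHORT_LETTERS.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("hello everyone this is just 1 test aa"));
        System.out.println(count("hello everyone this is just test aa_"));
    }
}
```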

Pattern: how subtract matched character in character class?

Is it possible to subtract a matched character in a character class?
The Java docs have examples of character classes with subtraction:
[a-z&&[^bc]] - a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]] - a through z, and not m through p: [a-lq-z] (subtraction)
I want to write a pattern that matches two pairs of word characters when the pairs are not the same:
1) "aaaa123" - should NOT match
2) "aabb123" - should match "aabb" part
3) "aa--123" - should NOT match
I am close with the following pattern:
([\w])\1([\w])\2
but of course it does not handle case 1, so I need to subtract the match of the first group. But when I try this:
Pattern p = Pattern.compile("([\\w])\\1([\\w&&[^\\1]])\\2");
I get an exception:
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 17
([\w])\1([\w&&[^\1]])\2
                 ^
at java.util.regex.Pattern.error(Pattern.java:1713)
So it seems that this does not work with groups, only with explicitly listed characters. The following pattern compiles with no problems:
Pattern p = Pattern.compile("([\\w])\\1([\\w&&[^a]])\\2");
Is there another way to write such a pattern?

Use
Pattern p = Pattern.compile("((\\w)\\2(?!\\2))((\\w)\\4)");
Your characters will be in groups 1 and 3.
This works by using a negative lookahead to make sure that the character following the first pair is different from the character in that pair.
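To see which groups hold the pairs, here is a minimal sketch that applies this pattern to the three cases from the question. The class name DistinctPairs and the method findPairs are hypothetical, and returning null for "no match" is just a shortcut for the example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DistinctPairs {
    // Group 1: first pair, not followed by the same character again.
    // Group 3: second pair. Groups 2 and 4 hold the single repeated characters.
    private static final Pattern PAIRS =
            Pattern.compile("((\\w)\\2(?!\\2))((\\w)\\4)");

    // Returns the two pairs, or null when the input contains no such match.
    static String[] findPairs(String s) {
        Matcher m = PAIRS.matcher(s);
        if (!m.find()) {
            return null;
        }
        return new String[] {m.group(1), m.group(3)};
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aaaa123", "aabb123", "aa--123"}) {
            System.out.println(s + " -> " + Arrays.toString(findPairs(s)));
        }
    }
}
```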

You are using the wrong tool for the job. By all means use a regex to detect pairs of character pairs, but you can just use != to test whether the characters within the pairs are the same. Seriously, there is no reason to do everything in a regular expression - it makes for unreadable, non-portable code and brings you no benefit other than "looking cool".

Try this:
String regex = "(\\w)\\1(?!\\1)(\\w)\\2";
Pattern pattern = Pattern.compile(regex);
(?!\\1) is a negative lookahead; it ensures that the content of group 1 does not follow.
My test code:
String s1 = "aaaa123";
String s2 = "aabb123";
String s3 = "aa--123";
String s4 = "123ccdd";
String[] s = { s1, s2, s3, s4 };
String regex = "(\\w)\\1(?!\\1)(\\w)\\2";
// Compile once, outside the loop
Pattern pattern = Pattern.compile(regex);
for (String a : s) {
    Matcher matcher = pattern.matcher(a);
    if (matcher.find())
        System.out.println(a + " ==> Success");
    else
        System.out.println(a + " ==> Failure");
}
The output:
aaaa123 ==> Failure
aabb123 ==> Success
aa--123 ==> Failure
123ccdd ==> Success
