How to properly use pattern matching in java - java

Update:
I've found the solution thanks to @dasblinkenlight and all the other good Samaritans.
Here is the working code for anyone with a similar question:
Pattern pattern = Pattern.compile("(\\d+)(\\s)([-+*%/^])(\\s)(\\d+)");
Matcher matchOp1 = pattern.matcher(text);
matchOp1.find();
System.out.println(matchOp1.group(1));
This will only print the first group.
Original Question:
First and foremost, I cannot use any if statements, therefore I must catch and handle exceptions only.
Assume I have a string that contains "10 + 20".
I have the following regex: "(\d\+)(\s)([\+\-\*\%\^])(\s)([\d\+)".
This regex is intended to match (integer of any length)(space)(an operator)(space)(integer of any length)
Pattern pattern = Pattern.compile("(\\d\\+)(\\s)([\\+\\-\\*\\%\\^])(\\s)([\\d\\+)");
Matcher matchOp1 = pattern.matcher("1 + 1");
System.out.println(matchOp1.group(1));
I want this to print "10" only if there's a match, but this throws PatternSyntaxException. Can anyone give me some insight please?
Thank you!

You have an extra [ in your pattern, and you escaped pluses where you shouldn't have:
Pattern pattern = Pattern.compile("(\\d\\+)(\\s)([\\+\\-\\*\\%\\^])(\\s)([\\d\\+)");
// Problems: the \\ before the + after each \\d (two places) and the extra [ in the last group.
Removing these will fix the problem.
Note that escaping meta-characters inside a character class [...] is not necessary: just be careful to move - to one of the ends, and place ^ in any position other than first:
"(\\d+)(\\s)([-+*%/^])(\\s)(\\d+)"
Note also that among all those unnecessary backslashes you forgot the division sign; the class above includes /.
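To illustrate the point about character classes, here is a minimal sketch that checks the unescaped class against each operator character. The class name CharClassDemo and the helper isOperator are made up for this example.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // '-' comes first and '^' does not, so neither needs a backslash.
    static final Pattern OPERATOR = Pattern.compile("[-+*%/^]");

    // True when s is exactly one of the six operator characters.
    static boolean isOperator(String s) {
        return OPERATOR.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"-", "+", "*", "%", "/", "^", "7", "\\"}) {
            System.out.println(s + " -> " + isOperator(s));
        }
    }
}
```

The six operators should print true, while the digit and the backslash print false.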

You have some issues in your regex:
Pattern pattern = Pattern.compile("(\\d\\+)(\\s)([\\+\\-\\*\\%\\^])(\\s)([\\d\\+)");
You haven't closed the square brackets in the last group ([\\d\\+ has no closing ]).
You should not escape the + after \\d: it is there as a quantifier meaning one or more digits, NOT a literal +.
Calling group() before a successful find() throws IllegalStateException, so you have to place if(matchOp1.find()) before reading a capturing group.
Instead, it should be like:
(\d+)(\s)([\+\-\*\%\^])(\s)(\d+)
and while using in code:
Pattern pattern = Pattern.compile("(\\d+)(\\s)([\\+\\-\\*\\%\\^])(\\s)(\\d+)");
Matcher matchOp1 = pattern.matcher("1 + 1");
if (matchOp1.find()) {
    System.out.println(matchOp1.group(1));
}
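Since the question rules out if statements, one option is to call find(), ignore its result, and let group() throw when nothing matched. This is only a sketch under that constraint: the names OperandDemo, firstOperand and compiles are invented here, and returning the string "no match" is an arbitrary choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class OperandDemo {
    // Corrected pattern: integer, whitespace, operator, whitespace, integer.
    static final Pattern EXPR = Pattern.compile("(\\d+)(\\s)([-+*%/^])(\\s)(\\d+)");

    // Returns the first operand; relies on the exception instead of an if.
    static String firstOperand(String text) {
        Matcher m = EXPR.matcher(text);
        m.find(); // result deliberately ignored
        try {
            return m.group(1);
        } catch (IllegalStateException e) {
            // group() throws when the last find() did not succeed
            return "no match";
        }
    }

    // Reports whether a regex compiles, again without an if statement.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstOperand("10 + 20"));
        System.out.println(firstOperand("ten plus twenty"));
        // The pattern from the question has an unclosed [ and does not compile.
        System.out.println(compiles("(\\d\\+)(\\s)([\\+\\-\\*\\%\\^])(\\s)([\\d\\+)"));
    }
}
```

Run as is, this should print 10, then no match, then false.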

Related

Regex in java to extract specific pattern

I want to match the pattern (including the square brackets, equals, quotes)
[fixedtext="sometext"]
What would be a correct regex expression?
Anything can occur inside quotes. 'fixedtext' is fixed.
Your basic solution (although I'd be skeptical of this, per the comments) is essentially:
"\\[fixedtext=\\\"(.*)\\\"\\]"
which resolves to:
"\[fixedtext=\"(.*)\"\]"
Simple escaping of [] and quotes. The (.*) says capture everything in quotes as a capture group (matcher.group(1)).
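But (.*) is greedy: it runs to the last "] in the input. For a string such as '[fixedtext="abc\"]def"]' that gives abc\"]def, while a reluctant (.*?) would stop at the first "] and give abc\ instead. For the same reason, a line containing two such tags is captured as one long match.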
If you know the ending bracket ends the line, then use:
"\\[fixedtext=\\\"(.*)\\\"\\]$"
(add the $ at the end to mark end of line) and that should be fairly reliable.
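A quick way to see how the capture group behaves is to compare the greedy (.*) with a reluctant (.*?) on a few inputs. The sketch below is illustrative only: FixedTextDemo and capture are made-up names, and the quote is left unescaped in the regex because " is not a regex metacharacter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedTextDemo {
    static final String GREEDY = "\\[fixedtext=\"(.*)\"\\]";
    static final String RELUCTANT = "\\[fixedtext=\"(.*?)\"\\]";

    // Returns group 1 of the first match, or null when nothing matches.
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String tricky = "[fixedtext=\"abc\\\"]def\"]"; // [fixedtext="abc\"]def"]
        System.out.println(capture(GREEDY, tricky));    // extends to the last "]
        System.out.println(capture(RELUCTANT, tricky)); // stops at the first "]
        System.out.println(capture(GREEDY, "[fixedtext=\"a\"] [fixedtext=\"b\"]"));
    }
}
```

Which variant is right depends on whether "] can appear inside the quoted text or several tags can share a line.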
My suggestion is to use named capturing groups.
You can find more details here:
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
Here's an example for your input:
String input = "[fixedtext=\"sometext\"]";
Pattern pattern = Pattern.compile("\\[(?<field>.*)=\"(?<value>.*)\"]");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
    System.out.println(matcher.group("field"));
    System.out.println(matcher.group("value"));
} else {
    System.err.println(input + " doesn't match " + pattern);
}

Java Regex Look-Behind Doesn't Work

So I am working on regex comparing phone numbers and this is the result:
(?:(?:0{2}|\+)?([1-9][0-9]))? ?([1-9][0-9])? ?([1-9][0-9]{5})
As you can see there are spaces between the numbers. I want them to appear only when there is some other number before the space so:
"0022 45 432345" - should match
"45 345678" or "560032" - should match
" 324400" - shouldn't match because of the space in the beginning
I've been reading different tutorials about regexes and found out about look-behinds, but even a simple construction like this (just for testing):
Pattern p2 = Pattern.compile("(?<=abc)aa");
Matcher m2 = p2.matcher("abcaa");
doesn't work.
Can you tell me what's wrong?
Another problem: I want a character to occur only when it is THE FIRST character in the string; otherwise it shouldn't occur. So:
0043 022 234567 should not match, but 022 123450 should match.
I'm stuck right now and would appreciate any help a lot.
This should work just fine. The spaces are moved into the optional groups and are themselves optional. This way, they only match if the group before them is present, but even then they are still optional. No look-behind required.
(?:(?:(?:00|\+)?([1-9][0-9]) ?)?([1-9][0-9]) ?)?([1-9][0-9]{5})
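As a rough check, the pattern can be run against the four sample inputs from the question. This is a sketch: PhoneDemo and isPhone are hypothetical names, and matches() is used on the assumption that the whole string must be a phone number.

```java
import java.util.regex.Pattern;

class PhoneDemo {
    static final Pattern PHONE = Pattern.compile(
        "(?:(?:(?:00|\\+)?([1-9][0-9]) ?)?([1-9][0-9]) ?)?([1-9][0-9]{5})");

    // matches() requires the whole string to fit the pattern.
    static boolean isPhone(String s) {
        return PHONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0022 45 432345", "45 345678", "560032", " 324400"}) {
            System.out.println("\"" + s + "\" -> " + isPhone(s));
        }
    }
}
```

The first three inputs should print true and the one with the leading space false. The pattern does not address the question's second requirement: 022 123450 is rejected because every digit group must start with 1-9.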
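A look-behind is a zero-length assertion: it checks the text before the current position without consuming it.
According to its Javadoc, Matcher.matches determines whether the whole String is a match, so "abcaa" cannot match a pattern that consumes only "aa".
What you're looking for is something like the Matcher.find and Matcher.group methods. Something like: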
final Pattern pattern = Pattern.compile("(?<=abc)aa");
final Matcher matcher = pattern.matcher("abcaa");
final String subMatch;
if (matcher.find()) {
    subMatch = matcher.group();
} else {
    subMatch = "";
}
System.out.println(subMatch);
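The difference between matches() and find() for this look-behind can be checked directly. A small sketch, with LookBehindDemo and its two helpers being names invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookBehindDemo {
    static final Pattern P = Pattern.compile("(?<=abc)aa");

    // matches() only succeeds when the pattern consumes the entire input.
    static boolean wholeInputMatches(String s) {
        return P.matcher(s).matches();
    }

    // Returns the start index of the first match, or -1 when there is none.
    static int firstMatchStart(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("abcaa")); // false: "abc" is not consumed
        System.out.println(firstMatchStart("abcaa"));   // 3: "aa" preceded by "abc"
    }
}
```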

Setting up regex in Pattern object for java string

I am trying to get all matches in a Java string. The matches must be bases and powers in a math equation. As you know, bases and powers can be negative and can be decimals as well. I have the pattern, regex and matcher set up. It looks something like this, but it is not giving me what I expect. I was guided by this Stack Overflow post: "Regex to find integer or decimal from a string in java in a single group?"
I am really just interested in capturing powers that have negative exponents, both integers and non-integers.
Well here is my code:
String ss = "2.5(4x+3)-2.548^-3.654=-14^-2.545";
String regex = "(\\d^+(?:\\d+)?)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(ss);
while (m.find()) {
    System.out.println("Wiwi the cat: " + m.group(1));
}
The output of this code is nothing. Any ideas or suggestions would be great.
thanks
Your basic problem is this character: ^. It means "start of input", so your regex can't match anything.
You must escape it as \^ (written "\\^" in a Java string) so it becomes a literal.
I also fixed the rest of your regex:
String regex = "-?\\d+(\\.\\d*)?(\\^-?\\d+(\\.\\d*)?)?";
And use
m.group(); // the whole match
EDIT: replaced [0123456789] with (\\d)
EDIT EDIT: added context for OP
Answer
Here's a pattern that should match what you need:
-?(\\d)+(\\.(\\d)+)?\\^-?(\\d)+(\\.(\\d)+)?
Explanation
-? - zero or one minus symbols
(\\d)+ - one or more digits
(\\.(\\d)+)? - (optional) a decimal point, followed by one or more digits
\\^ - one caret symbol
Using this on your input with String.replaceAll(pattern, "[FOUND]") (replaceAll takes a regex; String.replace does not) produced:
"2.5(4x+3)[FOUND]=[FOUND]"
In the context of your question, simply replace your regex with the one I posted, and use m.group() instead of m.group(1).
String ss = "2.5(4x+3)-2.548^-3.654=-14^-2.545";
String regex = "-?(\\d)+(\\.(\\d)+)?\\^-?(\\d)+(\\.(\\d)+)?";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(ss);
while (m.find()) {
    System.out.println("Wiwi the cat: " + m.group());
}
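The replacement check described in this answer can be reproduced with String.replaceAll, which treats its first argument as a regex. The following is a sketch rather than part of the original answer; PowerDemo, markPowers and powers are names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PowerDemo {
    // base ^ exponent, each an optionally negative integer or decimal
    static final String POWER = "-?(\\d)+(\\.(\\d)+)?\\^-?(\\d)+(\\.(\\d)+)?";

    // Replaces every base^exponent occurrence with a marker.
    static String markPowers(String expr) {
        return expr.replaceAll(POWER, "[FOUND]");
    }

    // Collects every base^exponent occurrence as a whole match.
    static List<String> powers(String expr) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(POWER).matcher(expr);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String ss = "2.5(4x+3)-2.548^-3.654=-14^-2.545";
        System.out.println(markPowers(ss)); // 2.5(4x+3)[FOUND]=[FOUND]
        System.out.println(powers(ss));     // [-2.548^-3.654, -14^-2.545]
    }
}
```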
Best of luck!

Java Regex to check "=number", ex "=5455"?

I want to check a string that matches the format "=number", ex "=5455".
As long as the first char is "=" and the rest consists of digits in [0-9] (a dot is not allowed), it should pop up a "correct" message.
if(str.matches("^[=][0-9]+")){
Window.alert("correct");
}
So, is this ^[=][0-9]+ the correct one?
If it is not correct, can you provide a correct solution?
If it is correct, can you suggest a better one?
I'm no big regex expert and more knowledgeable people than me might correct this answer, but:
I don't think there's a point in using [=] rather than simply = - the [...] block is used to declare multiple choices, why declare a multiple choice of one character?
I don't think you need to use ^ (if your input string contains any character before =, it won't match anyway). I'm unsure as to whether its presence makes your regex faster, slower or has no effect.
In conclusion, I'd use =[0-9]+
That should be correct: it looks for an = sign anchored at the beginning, followed by one or more digits between 0 and 9.
Your regex will work, even though it can be simplified:
.matches() does not search for a match inside the input; it tries to match all the input against the regex; therefore the beginning of input anchor is not needed;
you don't need the character class around the =.
Therefore:
if (str.matches("=[0-9]+")) { ... }
If you want to match a string which merely begins with a match for that regex, you have to use a Pattern, a Matcher and .find():
final Pattern p = Pattern.compile("^=[0-9]+");
final Matcher m = p.matcher(str);
if (m.find()) { ... }
And finally, Matcher also has .lookingAt() which anchors the regex only at the beginning of the input.
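The three Matcher methods mentioned here differ only in how they anchor the same pattern. A minimal comparison, assuming an unanchored =[0-9]+ and with EqualsNumberDemo, whole, prefix and anywhere being names made up for this sketch:

```java
import java.util.regex.Pattern;

class EqualsNumberDemo {
    static final Pattern P = Pattern.compile("=[0-9]+");

    // Entire input must match.
    static boolean whole(String s) {
        return P.matcher(s).matches();
    }

    // Match must start at the beginning, but may end early.
    static boolean prefix(String s) {
        return P.matcher(s).lookingAt();
    }

    // Match may start and end anywhere.
    static boolean anywhere(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"=5455", "=5455abc", "x=5455"}) {
            System.out.println(s + ": matches=" + whole(s)
                + " lookingAt=" + prefix(s) + " find=" + anywhere(s));
        }
    }
}
```

For the "=number" check in the question, matches() is the variant that rejects trailing or leading extras.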

Java Regex Escape

I've got this bit of code to grab a url within a textarea. It has been working great until I tried a url with a '+' in it.
Pattern pattern = Pattern.compile("(.*)(https?[://.0-9-?a-z=_#!A-Z]*)(.*)");
Matcher matcher = pattern.matcher(text);
So I tried putting \\+ and \\\\+ in my code, but it did not work. So I did some googling, and Stack Overflow questions kept mentioning this guy:
Pattern.quote("+");
However, I am not sure how to fit that statement into what I currently have, or whether that is even the way I want to go. But I'm assuming I need to do something like this...
String quote = Pattern.quote("+");
Pattern pattern = Pattern.compile("(.*)(https?[://.0-9-?a-z=_#!A-Z]*)(.*)");
Matcher matcher = pattern.matcher(text);
And then add the variable quote somewhere in the pattern? Please help! I just learned this stuff today and I'm brand new to it! Thank you!
Just escape the quote with \, for example:
Pattern pattern = Pattern.compile("(.*)(https?[://.0-9-?a-z=_#!A-Z\"]*)(.*)");
(https?[://.0-9-?a-z=_#!A-Z]*)
Bear in mind that [ and ] denote a class of characters, and that this means that any character within it will be included. [aegl]+ will match "age", "a", "e", "g", "eagle", and "gaggle". It also means that a character listed twice (like /) is completely redundant.
Pattern.quote is useful, but it does not simply put a backslash in front of special characters: it wraps the whole string in \Q...\E so that it is treated as a literal. Pattern.quote("+") will return \Q+\E.
Because + has no special meaning between square brackets, you should be able to put a + unescaped within the square brackets. You can also add a \\ in front of it if it makes you feel better; the two patterns below are equivalent.
Pattern pattern = Pattern.compile("(.*)(https?[:/.0-9-?a-z=_#!A-Z+]*)(.*)");
Pattern pattern = Pattern.compile("(.*)(https?[:/.0-9-?a-z=_#!A-Z\\+]*)(.*)");
See it here: http://fiddle.re/0780
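To see the effect of adding + to the class, here is a small sketch that extracts the URL group with and without it. The sample text and the example.com URL are hypothetical, as are the names UrlPlusDemo and url.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlPlusDemo {
    static final Pattern WITHOUT_PLUS =
        Pattern.compile("(.*)(https?[:/.0-9-?a-z=_#!A-Z]*)(.*)");
    static final Pattern WITH_PLUS =
        Pattern.compile("(.*)(https?[:/.0-9-?a-z=_#!A-Z+]*)(.*)");

    // Returns the URL group (group 2), or null when no URL is found.
    static String url(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String text = "see http://example.com/a+b?x=1 now"; // made-up sample input
        System.out.println(url(WITHOUT_PLUS, text)); // URL is cut off at the +
        System.out.println(url(WITH_PLUS, text));    // URL includes the +
        System.out.println(Pattern.quote("+"));      // \Q+\E
    }
}
```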
