Java - regex not working - java

I have the string "abc123(" and want to check whether it contains one or more characters that are not letters or digits.
"abc123(".matches("[^a-zA-Z0-9]+"); should return true in this case, shouldn't it? But it does not! What's wrong?
My test script:
public class NewClass {
public static void main(String[] args) {
if ("abc123(".matches("[^a-zA-Z0-9]+")) {
System.out.println("true");
}
}
}

In Java, the expression passed to matches() has to match the entire string, not just part of it.
myString.matches("regex") returns true or false depending whether the
string can be matched entirely by the regular expression. It is
important to remember that String.matches() only returns true if the
entire string can be matched. In other words: "regex" is applied as if
you had written "^regex$" with start and end of string anchors. Source
Your expression is looking for part of the string, not the whole thing. You can change your expression to .*YOUR_EXPRESSION.* so that it can match the entire string.
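As a minimal sketch of the difference, the class below compares String.matches() with Matcher.find() on the string from the question. The class name MatchesVsFind and the helper containsNonAlnum are made up for illustration.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // One or more characters that are not ASCII letters or digits.
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    // Hypothetical helper: true if any part of s is outside a-z, A-Z, 0-9.
    static boolean containsNonAlnum(String s) {
        return NON_ALNUM.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "abc123(";
        // matches() must cover the whole string, so this prints false.
        System.out.println(input.matches("[^a-zA-Z0-9]+"));
        // Wrapping the pattern in .* lets it cover the whole string: true.
        System.out.println(input.matches(".*[^a-zA-Z0-9]+.*"));
        // find() looks for a match anywhere in the string: true.
        System.out.println(containsNonAlnum(input));
    }
}
```

Using find() avoids having to pad the pattern with .* on both sides.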

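Rather than checking whether it contains only letters and numbers, why not check whether it contains anything other than that? You can use the non-word character class (\W); if it matches, then you know the string contains something other than the characters you are looking for. Note that the backslash must be doubled in a Java string literal, and that matches() has to cover the whole string, so the class needs .* on both sides:
"abc123(".matches(".*\\W.*");
If this returns true, then there is something other than word characters in the string. Be aware that \W treats the underscore as a word character, so "abc_123" would not be flagged.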
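A small sketch of the \W approach, assuming Matcher.find() is acceptable so that one non-word character anywhere in the string is enough. NonWordCheck and hasNonWordChar are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

class NonWordCheck {
    // \W matches any character other than a-z, A-Z, 0-9 and underscore (default mode).
    private static final Pattern NON_WORD = Pattern.compile("\\W");

    // Hypothetical helper: true if s contains at least one non-word character.
    static boolean hasNonWordChar(String s) {
        return NON_WORD.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasNonWordChar("abc123("));  // true: '(' is not a word character
        System.out.println(hasNonWordChar("abc123"));   // false
        System.out.println(hasNonWordChar("abc_123"));  // false: '_' counts as a word character
    }
}
```

If the underscore should also be rejected, the explicit class [^a-zA-Z0-9] is the safer choice.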

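The expression [^A-Za-z0-9]+ means 'one or more characters that are not letters or digits'. You probably want to replace it with ^[A-Za-z0-9]+$, which means 'only letters or digits'; if the string does not match that, it contains some other character.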

Related

Facing problems understanding the following algorithm solution

I'm new to HackerRank and I'm currently solving problems in the Java stack. I tried to solve this problem:
A string containing only parentheses is balanced if the following is true: 1. if it is an empty string 2. if A and B are correct, AB is correct, 3. if A is correct, (A) and {A} and [A] are also correct.
Examples of some correctly balanced strings are: "{}()", "[{()}]", "({()})"
Examples of some unbalanced strings are: "{}(", "({)}", "[[", "}{" etc.
Given a string, determine if it is balanced or not.
And I found the following one-liner solution, which I couldn't understand. Can someone explain it, please?
import java.util.Scanner;

class Solution {
    public static void main(String[] argh) {
        Scanner sc = new Scanner(System.in);
        while (sc.hasNext()) {
            String input = sc.next();
            // Keep removing adjacent pairs until the length stops changing.
            while (input.length() != (input = input.replaceAll("\\(\\)|\\[\\]|\\{\\}", "")).length());
            System.out.println(input.isEmpty());
        }
    }
}
The string "\\(\\)|\\[\\]|\\{\\}" given to replaceAll is a regular expression. Half of the backslashes are needed because all of ()[]{} have special meaning in a regex; the other half are needed to escape those backslashes because \ also has special meaning in a string.
Ignoring the backslashes, the pattern is ()|[]|{}, which will match any of the substrings (), [] and {}. The replaceAll call then removes all matches of these by replacing them with the empty string "". This is then repeated until no more matches can be replaced.
On balanced strings, and only on balanced strings, this eventually produces an empty string by removing empty pairs from the inside out. Let's look at an example:
[{()}]()[{}]
  ^^  ^^ ^^   <- these matches are removed
[{}][]
 ^^ ^^        <- then these are removed
[]
^^            <- and finally this one
There is some more obfuscation going on with the way the while loop is written:
while(input.length() != (input = input.replaceAll(...)).length());
To understand this, you need to know that = performs an assignment, but also evaluates to the assigned value. And you need to know that Java always evaluates subexpressions from left to right.
So first, input.length() is evaluated, producing the length of the original string. Then (input = input.replaceAll(...)).length() is evaluated, which does two things: it assigns the next string to input, and it returns the length of that next string.
Finally, the two lengths are compared. If equal, the loop terminates because nothing more can be replaced. If not equal, it means that some matching pair has been removed, and we will do another iteration, now with the new value of input.
Finally, we just check whether the resulting string is empty:
System.out.println(input.isEmpty());
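For comparison, here is a sketch of the same algorithm with the loop unrolled into separate steps. It is meant to behave like the one-liner; the class name BalancedBrackets and the method isBalanced are invented for this example.

```java
class BalancedBrackets {
    // Repeatedly strips adjacent empty pairs until a pass removes nothing.
    static boolean isBalanced(String input) {
        String current = input;
        while (true) {
            String next = current.replaceAll("\\(\\)|\\[\\]|\\{\\}", "");
            if (next.length() == current.length()) {
                break; // no pair was removed in this pass
            }
            current = next;
        }
        // Only a balanced string shrinks all the way down to "".
        return current.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("[{()}]()[{}]")); // true
        System.out.println(isBalanced("({)}"));         // false
    }
}
```

Each pass scans the whole string, so deeply nested input needs many passes. A stack-based check would do the work in a single pass, but that is a different technique from the one explained here.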
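The replaceAll method takes two parameters (regex, replacement).
You need to understand the regex:
In a Java string literal, \\ produces a single backslash, so \\( reaches the regex engine as \(, which matches a literal '('.
\\(\\) therefore matches '(' immediately followed by ')', with nothing between them, and the same goes for the other two alternatives in the regex.
The solution replaces every match of the regex with the empty string.
So if your input is ()3( it will be 3( after replacing, while (1,2)3( is left unchanged because its parentheses are not adjacent.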

In java regular expression how to split by dot but excluding if it contains backslash

In Java, I have a string containing dots and I want to split it on the dots, except where a dot is preceded by a slash (/).
import java.util.Arrays;

public class test {
    public static void main(String[] args) {
        String s1 = "test.env.PM1/.0";
        // Splits on every dot, including the one after the slash.
        System.out.println(Arrays.asList(s1.split("[.]"))); // [test, env, PM1/, 0]
    }
}
output expected:
[test, env, PM1/.0]
So, how can I avoid splitting when the dot is preceded by a slash?
You'll want to use a negative look-behind assertion to ensure it does not have a preceding forward slash.
String s1 ="test.env.PM1/.0";
System.out.println(Arrays.asList(s1.split("(?<!/)\\.")));
// [test, env, PM1/.0]
For more explanation see regular-expressions.info (emphasis mine)
Lookahead and lookbehind, collectively called "lookaround", are zero-length assertions just like the start and end of line, and start and end of word anchors explained earlier in this tutorial. The difference is that lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called "assertions". They do not consume characters in the string, but only assert whether a match is possible or not. Lookaround allows you to create regular expressions that are impossible to create without them, or that would get very longwinded without them.
... Negative lookahead is indispensable if you want to match something not followed by something else.

String matches() method not working properly

So I have this method:
void verifySecretKey(String userEnters, Scanner input){
while(true) {
System.out.print("Enter the secret key: ");
userEnters = input.nextLine();
System.out.println("\nVerifying Secret Key...");
if (secretKey.matches(userEnters)) {
System.out.println("Secret key verified!");
break; }
else {
System.out.println("The secret key does not follow the proper format!"); }
}
}
and for some reason, it is not working properly. A string secretKey is automatically generated for the user and they must enter the exact string to be verified. However, even if the correct string was entered, it still says that it's incorrect.
Sometimes it works, and mostly it doesn't. I am wondering what I am doing wrong here?
String#matches accepts a string defining a regular expression. If you want to check for equality, use equals, not matches.
"oH-?bt-4#" contains a ?, which is a special character in regular expressions, not a literal ?. So the string doesn't match the regular expression.
String#matches takes a regular expression as its argument. In the screenshot, you entered oH-?bt-4#, which contains a ?. This character has a special meaning in a regex. If you want to use the String#matches method, you have to escape all the special characters, e.g. using Pattern.quote:
if (secretKey.matches(Pattern.quote(userEnters))) //...
Since your goal seems to be to check whether the two strings are the same, you could just use the String#equals method:
if (secretKey.equals(userEnters)) //...
When you don't have a reason to choose the regex-method matches, you should stick with equals, since it's more efficient.
According to the Javadoc,
public boolean matches(String regex)
Tells whether or not this string matches the given regular expression.
Now, "Java".matches("Java") is true, because the regex Java is a match for Java.
However, there are lots of regexes that don't match themselves, and you're quite likely to find one if you generate strings randomly.
For example, "a+bc".matches("a+bc") returns false, because nothing in the regex matches the literal character + (a+ matches one or more a's).
It's also very likely that a random string will result in something that can't be compiled as a regex, in which case your code will throw a PatternSyntaxException. For example, a[bc will do this because of the unclosed bracket.
To test whether two strings are exactly the same, use .equals().
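The three behaviours described in these answers can be seen side by side in the sketch below. It reuses the key from the question; the class LiteralCompare and the helper isInvalidRegex are names invented for this illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LiteralCompare {
    // Hypothetical helper: true if s cannot be compiled as a regular expression.
    static boolean isInvalidRegex(String s) {
        try {
            Pattern.compile(s);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String secretKey = "oH-?bt-4#";
        String userEnters = "oH-?bt-4#";
        // The '?' makes the preceding '-' optional, so the raw key does not match itself.
        System.out.println(secretKey.matches(userEnters));                // false
        // Pattern.quote turns the input into a literal pattern.
        System.out.println(secretKey.matches(Pattern.quote(userEnters))); // true
        // equals compares the characters directly, with no regex involved.
        System.out.println(secretKey.equals(userEnters));                 // true
        // An unclosed character class is rejected at compile time.
        System.out.println(isInvalidRegex("a[bc"));                       // true
    }
}
```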

How to get Pattern.matches(regex, str) to return false when str is dot (".")?

I would like to check if a string is a mathematical operator (+,-,*,/). I'm using the matches method to check the character against a regex but it always returns true when checking a string that contains only a dot ("."). Here's the code:
String dot = ".";
if(dot.matches("[*+-/]"))
System.out.println("BAD");
else
System.out.println("GOOD");
This prints "BAD". I get that it probably has to do with the fact that "." in regex matches everything but I don't see why that would make a difference. Is there any way to get this to return false? Thanks.
No, the String you invoke matches on is not considered a regular expression. It is taken literally.
Your case prints BAD because [*+-/] is a character class in which +-/ is read as the range from + to /, and that range contains the characters , - . and /. Move the - to the end so that it doesn't create a range: [*+/-].
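A quick way to see the effect of the hyphen's position is sketched below. OperatorCheck and isOperator are illustrative names only.

```java
class OperatorCheck {
    // The hyphen is placed last, so it is a literal '-' and not a range operator.
    static boolean isOperator(String s) {
        return s.matches("[*+/-]");
    }

    public static void main(String[] args) {
        // With the hyphen in the middle, "+-/" is a range that also covers ',' and '.'.
        System.out.println(".".matches("[*+-/]")); // true
        System.out.println(",".matches("[*+-/]")); // true
        // With the hyphen at the end, only the four operators match.
        System.out.println(isOperator("."));       // false
        System.out.println(isOperator("-"));       // true
    }
}
```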
I'm going to suggest a tool for going about this.
Try regexr, it's colorful, it's got help on the sidebar, and you will be able to write regexes better with all the cases you want and do not want to match.
To get you started, check out the really rudimentary regex written here: http://regexr.com/3af78.
\d [*+/-] \d
As I do not know how strict or loose you want your check to be, I've added additional strings that you may or may not want to consider.

Regular Expression to match more than one occurrence of a character

I need help coming up with a regular expression that matches if a string has more than one occurrence of a character. I have already validated the lengths of the two strings, and they will always be equal. Here's what I mean: take the strings "aab" and "abb". Both should match the regular expression because they have repeating characters, the "aa" in the first string and the "bb" in the second.
Since you say "aba"-style repetition doesn't count, back-references should make this simple:
(.)\1+
This finds runs of the same character repeated consecutively. Try it out:
java.util.regex.Pattern.compile("(.)\\1+").matcher("b").find(); // false
java.util.regex.Pattern.compile("(.)\\1+").matcher("bbb").find(); // true
If you're checking anagrams maybe a different algorithm would be better.
If you sort your strings (both the original and the candidate), checking for anagrams can be done with a string comparison.
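The sorting idea could look roughly like this. It is a sketch that compares char values as-is, with no case folding or Unicode normalization, and the names AnagramCheck and isAnagram are made up for the example.

```java
import java.util.Arrays;

class AnagramCheck {
    // Two strings are anagrams if their sorted characters are identical.
    static boolean isAnagram(String a, String b) {
        char[] x = a.toCharArray();
        char[] y = b.toCharArray();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("aab", "aba")); // true: same letters, different order
        System.out.println(isAnagram("aab", "abb")); // false: the letter counts differ
    }
}
```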
static final String REGEX_MORE_THAN_ONE_OCCURANCE_OF_B = "([b])\\1{1,}";
static final String REGEX_MORE_THAN_ONE_OCCURANCE_OF_B_AS_PREFIX_TO_A = "(b)\\1+([a])";
