I'm new to HackerRank and I'm currently solving problems in the Java track. I tried to solve this problem:
A string containing only parentheses is balanced if the following is true: 1. if it is an empty string 2. if A and B are correct, AB is correct, 3. if A is correct, (A) and {A} and [A] are also correct.
Examples of some correctly balanced strings are: "{}()", "[{()}]", "({()})"
Examples of some unbalanced strings are: "{}(", "({)}", "[[", "}{" etc.
Given a string, determine if it is balanced or not.
And I found the following one-liner solution, which I couldn't understand. Can someone explain it, please?
import java.util.Scanner;

class Solution{
public static void main(String []argh)
{
Scanner sc = new Scanner(System.in);
while (sc.hasNext()) {
String input=sc.next();
while(input.length() != (input = input.replaceAll("\\(\\)|\\[\\]|\\{\\}", "")).length());
System.out.println(input.isEmpty());
}
}
}
The string "\\(\\)|\\[\\]|\\{\\}" given to replaceAll is a regular expression. Half of the backslashes are needed because all of ()[]{} have special meaning in a regex; the other half are needed to escape those backslashes because \ also has special meaning in a string.
Ignoring the backslashes, the pattern is ()|[]|{}, which will match any of the substrings (), [] and {}. The replaceAll call then removes all matches of these by replacing them with the empty string "". This is then repeated until no more matches can be replaced.
On balanced strings, and only on balanced strings, this eventually produces an empty string by removing empty pairs from the inside out. Let's look at an example:
[{()}]()[{}]
^^ ^^ ^^ <- these matches are removed
[{}][]
^^ ^^ <- then these are removed
[]
^^ <- and finally this one
There is some more obfuscation going on with the way the while loop is written:
while(input.length() != (input = input.replaceAll(...)).length());
To understand this, you need to know that = performs an assignment, but also evaluates to the assigned value. And you need to know that Java always evaluates subexpressions from left to right.
So first, input.length() is evaluated, producing the length of the original string. Then (input = input.replaceAll(...)).length() is evaluated, which does two things: it assigns the next string to input, and it returns the length of that next string.
Finally, the two lengths are compared. If equal, the loop terminates because nothing more can be replaced. If not equal, it means that some matching pair has been removed, and we will do another iteration, now with the new value of input.
Finally, we just check whether the resulting string is empty:
System.out.println(input.isEmpty());
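For comparison, here is a sketch of the same idea with the loop written out, so the assignment no longer hides inside the condition. The class name BalancedBrackets and the method name isBalanced are made up for this example; the regex is the one from the question.

```java
class BalancedBrackets {
    // Hypothetical helper: same regex as the one-liner, but with an explicit loop.
    static boolean isBalanced(String input) {
        String previous;
        do {
            previous = input;
            // Remove every adjacent empty pair: (), [] or {}.
            input = input.replaceAll("\\(\\)|\\[\\]|\\{\\}", "");
        } while (input.length() != previous.length()); // stop once nothing was removed
        // Only a balanced string shrinks all the way down to "".
        return input.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("[{()}]()[{}]")); // true
        System.out.println(isBalanced("({)}"));         // false
    }
}
```

The do/while form makes it explicit that the old length is remembered before the replacement happens, which is what the left-to-right evaluation order guarantees in the compact version.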
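The replaceAll method takes two parameters: (regex, replacement).
To follow the solution you need to understand the regex:
Each \\ in the Java string literal becomes a single \ in the regex, and that \ escapes the character after it so that it is matched literally.
So \\(\\) matches an opening '(' immediately followed by a closing ')', with nothing between them, and the other two alternatives do the same for [] and {}.
The solution replaces every such match with the empty String.
So if your input is (()3( it becomes (3( after the replacement, while (1,2)3( is left unchanged because no bracket pair in it is directly adjacent.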
For a test I created the following regex by mistake:
|(\\w+)|
I was puzzled that this regex really works and I can't explain the result:
public static void main(String[] args) {
String toReplace="Hey I'm a lovely String an I'm giving my |value| worth!";
// String replacement1="2 cent"; // I planned to replace |value| with 2 cent
String replacement1="#"; // to produce a better Output
String regex="|(\\w+)|"; // I forgot to escape the |
String result=toReplace.replaceAll(regex,replacement1);
System.out.println(result);
}
the result is:
#H#e#y# #I#'#m# #a# #l#o#v#e#l#y# #S#t#r#i#n#g# #a#n# #I#'#m# #g#i#v#i#n#g# #m#y# #|#v#a#l#u#e#|# #w#o#r#t#h#!#
My idea so far is that Java tries to replace the "nothing" between the characters, but why not the characters themselves?
\\w+ should match the 'H'
I would expect that every char is replaced by 3 # signs or only by one but that the characters are not replaced puzzles me.
You're right, this regex matches the empty string between each character.
Since the first alternative (the empty string left of |) matches, the rest of the pattern isn't even tried, so the \w+ isn't even reached by the matching engine. You could have written any (valid) pattern to the right of that first |, it wouldn't ever be reached.
The engine works the following way: It has a current position cursor in the subject string. It tries to match starting at that current position. Since your regex is a match, it will perform the replacement at this point, and then move the current position cursor after the found match.
But since the match is zero-width, it simply advances to the next character, because not doing so would result in an infinite loop.
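To see the effect on a smaller input, here is a minimal sketch that compares the unescaped pattern with an escaped one. The class and method names are invented for the example, and the escaped pattern is my assumption about what the question intended.

```java
class EmptyAlternative {
    // Unescaped |: the first alternative is empty and wins at every position.
    static String replaceUnescaped(String s) {
        return s.replaceAll("|(\\w+)|", "#");
    }

    // Escaped \\|: a literal | is required on both sides of the word.
    static String replaceEscaped(String s) {
        return s.replaceAll("\\|(\\w+)\\|", "#");
    }

    public static void main(String[] args) {
        System.out.println(replaceUnescaped("abc"));              // #a#b#c#
        System.out.println(replaceEscaped("my |value| worth!"));  // my # worth!
    }
}
```

The first call inserts a # at each of the four positions in "abc" (before a, b, c and at the end) and never consumes a character, which matches the output in the question.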
I am weak in writing regular expressions, so I'm going to need some help on this one. I need a regular expression that can validate that a string is a set of letters (the letters must be unique) delimited by commas.
Only one character and after that a comma
Examples:
A,E,R
R,A
E,R
Thanks
You can use a repeated group to validate it's a comma separated string.
^[AER](?:,[AER])*$
To also require that the characters are unique, you would do something like:
^([AER])(?:,(?!\1)([AER])(?!.*\2))*$
If I understand it correctly, a valid string will be a series (possibly zero long) of two-character patterns, where each pattern is a letter followed by a comma; finally followed at the end by one letter.
Thus:
"^([A-Za-z],)*[A-Za-z]$"
EDIT: Since you've clarified that the letters have to be A, E, or R:
"^([AER],)*[AER]$"
Something like this "^([AER],)*[AER]$"
#Edit: regarding the uniqueness, if you can drop the "last character cannot be a comma" requirement (which can be checked before the regex anyway in constant time) then this should work:
"^(?:([AER],?)(?!.*\\1))*$"
This will match A,E,R, hence you need that check before performing the regex. I do not take responsibility for the performance but since it's only 3 letters anyway...
The above is a java regex obviously, if you want a "pure one" ^(?:([AER],?)(?!.*\1))*$
#Edit2: sorry, missed one thing: this actually requires that check and then you need to add a comma at the end since otherwise it will also match A,E,E. Kind of limited I know.
My own ugly but extensible solution, which will disallow leading and trailing commas, and checks that the characters are unique.
It uses a forward-declared backreference: note how the second capturing group appears after the reference made to it, (?!.*\2). On the first repetition, since the second capturing group hasn't captured anything yet, Java treats any attempt to reference the text matched by the second capturing group as a failure, so the negative lookahead succeeds.
^([AER])(?!.*\1)(?:,(?!.*\2)([AER]))*+$
Demo on regex101 (PCRE flavor has the same behavior for this case)
Demo on RegexPlanet
Test cases:
A,E,R
A,R,E
E,R,A
A
R,E
R
E
A,
A,R,
A,A,R
E,A,E
A,E,E
X,R,E
R,A,E,
,A
AA,R,E
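If you'd rather run the test cases locally, a small harness along these lines should do. It is only a sketch: the class name UniqueLetters and the method isValid are made up, and the pattern is the one above with the backslashes doubled for a Java string literal.

```java
import java.util.regex.Pattern;

class UniqueLetters {
    // Pattern from the answer; \1 and \2 become \\1 and \\2 inside a Java string.
    static final Pattern UNIQUE =
            Pattern.compile("^([AER])(?!.*\\1)(?:,(?!.*\\2)([AER]))*+$");

    static boolean isValid(String s) {
        return UNIQUE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] cases = {
            "A,E,R", "A,R,E", "E,R,A", "A", "R,E", "R", "E",
            "A,", "A,R,", "A,A,R", "E,A,E", "A,E,E", "X,R,E", "R,A,E,", ",A", "AA,R,E"
        };
        for (String c : cases) {
            System.out.println(c + " -> " + isValid(c));
        }
    }
}
```

By my reading of the pattern, the first seven cases should be accepted and the rest rejected.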
Note: I'm going to answer the original question. That is, I don't care if the elements repeat.
We've had several suggestions for this regex:
^([AER],)*[AER]$
Which does indeed work. However, to match a String, it first has to back up one character because it will find that there is no , at the end. So we switch it for this to increase performance:
^[AER](,[AER])*$
Notice that this will match a correct String the very first time it attempts to. But also note that we don't need to worry about the ( )* backing up at all; it will either match the first time, or it won't match the String at all. So we can further improve performance by using a possessive quantifier:
^[AER](,[AER])*+$
This will take the whole String and attempt to match it. If it fails, then it stops, saving time by not doing useless backing up.
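A quick way to check the possessive version is a sketch like the following. The class and method names are placeholders of mine; the pattern is the last one shown above. Keep in mind that it deliberately allows repeated letters.

```java
import java.util.regex.Pattern;

class CommaList {
    // Possessive *+ : once the group has matched, the engine never backs up into it.
    static final Pattern LIST = Pattern.compile("^[AER](,[AER])*+$");

    static boolean isList(String s) {
        return LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isList("A,E,R")); // true
        System.out.println(isList("A,A"));   // true, repeats are not checked here
        System.out.println(isList("A,"));    // false, trailing comma
    }
}
```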
If I were trying to ensure the String had no repeated elements, I would not use regex; it just complicates things. You end up with less-readable code (sadly, most people don't understand regex) and, oftentimes, slower code. So I would build my own validator:
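// Needs: import java.util.HashSet;
public static boolean isCommaDelimitedSet(String toValidate, HashSet<Character> toMatch) {
// Elements and commas alternate, so a valid string has odd length (this rejects "" and "A,").
if (toValidate.length() % 2 == 0) return false;
HashSet<Character> seen = new HashSet<>(); // elements found so far
for (int index = 0; index < toValidate.length(); index++) {
if (index % 2 == 0) {
// Even index: must be an allowed element that has not appeared before.
char element = toValidate.charAt(index);
if (!toMatch.contains(element) || !seen.add(element)) return false;
} else {
if (toValidate.charAt(index) != ',') return false;
}
}
return true;
}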
This assumes that you want to be able to pass in a set of characters that are allowed. If you don't want that and have explicit chars you want to match, change the contents of the if (index % 2 == 0) block to:
char c = toValidate.charAt(index);
if (c != 'A' && c != 'E' && c != 'R' /* && and so on */) return false;
I want to split my string before every occurrence of an alphabetic character.
for example:
"s1l1e13" to an array of: ["s1","l1","e13"]
when trying to use this simple split by regex i get some weird results:
testStr = "s1l1e13"
Arrays.toString(testStr.split("(?=[a-z])"))
gives me the array of:
["","s1","l1","e13"]
how can i create the split without the empty array element?
I tried a couple more things:
testStr = "s1"
Arrays.toString(testStr.split("(?=[a-z])"))
does return the correct array: ["s1"]
but when trying to use substring
testStr = "s1l1e13"
Arrays.toString(testStr.substring(1).split("(?=[a-z])"))
i get in return ["1","l1","e13"]
what am i missing?
Your Lookahead marks each position before any character of a to z; marking the following positions:
s1 l1 e13
^ ^ ^
So by splitting using just the lookahead, it returns ["", "s1", "l1", "e13"]
You can use a Negative Lookbehind here. This looks behind to see if there is not the beginning of the string.
String s = "s1l1e13";
String[] parts = s.split("(?<!\\A)(?=[a-z])");
System.out.println(Arrays.toString(parts)); //=> [s1, l1, e13]
Your problem is that (?=[a-z]) means "place before [a-z]" and in your text
s1l1e13
you have 3 such places. I will mark them with |
|s1|l1|e13
so split (unfortunately correctly) produces "" "s1" "l1" "e13" and doesn't automatically remove the leading empty element for you.
To solve this problem you have at least two options:
make sure that there is something before your place you need to split on (it is not at start of your string). You can use for instance (?<=\\d)(?=[a-z]) if you want to split after digit but before character
(PREFERRED SOLUTION) start using Java 8, where split no longer puts an empty string at the start of the result array when the match at the start of the string is zero-length (look-arounds are zero-length).
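Both options can be tried with a sketch like this one. The class and method names are invented for the example, and the second method only gives the clean result on Java 8 or later, as described above.

```java
import java.util.Arrays;

class SplitBeforeLetters {
    // Option 1: require a digit before the split point, so position 0 never matches.
    static String splitAfterDigit(String s) {
        return Arrays.toString(s.split("(?<=\\d)(?=[a-z])"));
    }

    // Option 2: plain lookahead; assumes Java 8+, which drops the leading empty string.
    static String splitLookaheadOnly(String s) {
        return Arrays.toString(s.split("(?=[a-z])"));
    }

    public static void main(String[] args) {
        System.out.println(splitAfterDigit("s1l1e13"));    // [s1, l1, e13]
        System.out.println(splitLookaheadOnly("s1l1e13")); // [s1, l1, e13] on Java 8+
    }
}
```

The first option behaves the same on every Java version, which makes it the safer choice if the code also has to run on Java 7.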
The first split point is at the very start of the string, because (?=[a-z]) is a zero-width lookahead: it only checks that the next character is a letter and doesn't consume anything. The "s" at the beginning is a letter, so the position before it matches, and splitting there leaves "" in front.
If you want the pattern to require at least one character before the split point, use "(?<=.)(?=[a-z])"
The problem is that the initial "s" counts as an alphabetic character, so the regex tries to split right before it.
Since there is nothing before the s, split reports that by putting an empty string (not null) at the front of the array. This can't happen at the end with this pattern: the lookahead needs a letter after the split point, and split drops trailing empty strings anyway.
If this is the only string you're splitting, or if every string you split starts with a letter, just drop the first element of the array. Otherwise, you'll probably need to loop through each array as you make it so that you can drop empty elements.
So it seems each of your tokens has the pattern x###, where x is a letter and each # is a digit.
I'd match the tokens with the following regex instead of splitting:
([a-z][0-9]+)
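Used with a Matcher, that could look roughly like the sketch below. The class name TokenMatcher and the method tokens are my own; only the regex comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenMatcher {
    // One letter followed by one or more digits, captured as group 1.
    static final Pattern TOKEN = Pattern.compile("([a-z][0-9]+)");

    static List<String> tokens(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        // find() walks the input and stops at each successive token.
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("s1l1e13")); // [s1, l1, e13]
    }
}
```

Because this collects matches instead of splitting, there is no empty first element to deal with.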
I have the string "abc123(" and want to check whether it contains one or more characters that are not a digit or a letter.
Shouldn't "abc123(".matches("[^a-zA-Z0-9]+") return true in this case? But it does not! What's wrong?
My test script:
public class NewClass {
public static void main(String[] args) {
if ("abc123(".matches("[^a-zA-Z0-9]+")) {
System.out.println("true");
}
}
}
In Java, String.matches requires the expression to match the entire string, not just part of it.
myString.matches("regex") returns true or false depending whether the
string can be matched entirely by the regular expression. It is
important to remember that String.matches() only returns true if the
entire string can be matched. In other words: "regex" is applied as if
you had written "^regex$" with start and end of string anchors. Source
Your expression is looking for part of the string, not the whole thing. You can change your expression to .*YOUR_EXPRESSION.* and it will expand to match the entire string.
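Here is a small sketch of the difference. The class and method names are made up; the find() variant is an alternative I'm adding for comparison and is not part of the suggestion above.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // matches(): the whole string must consist of non-alphanumeric characters.
    static boolean wholeString(String s) {
        return s.matches("[^a-zA-Z0-9]+");
    }

    // Padding with .* lets the expression cover the entire string.
    static boolean padded(String s) {
        return s.matches(".*[^a-zA-Z0-9]+.*");
    }

    // find(): succeeds if the expression matches anywhere in the string.
    static boolean anywhere(String s) {
        return Pattern.compile("[^a-zA-Z0-9]").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeString("abc123(")); // false
        System.out.println(padded("abc123("));      // true
        System.out.println(anywhere("abc123("));    // true
    }
}
```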
Rather than checking whether it contains only letters and numbers, why not check whether it contains anything other than that? You can use the non-word class (\W) with find(), which looks for a match anywhere in the string instead of requiring the whole string to match:
Pattern.compile("\\W").matcher("abc123(").find();
If this returns true, then there is something other than word characters in the string. Note that \W treats the underscore as a word character, along with letters and digits.
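The expression [^A-Za-z0-9]+ means 'one or more characters that are not letters or digits'. You probably want to replace it with ^[A-Za-z0-9]+$, which means 'only letters or digits', and negate the result of matches: it returns false as soon as the string contains anything else.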
I have this homework problem where I need to use regex to remove every other character in a string.
In one part, I have to delete characters at index 1,3,5,... I have done this as follows:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "$1"));
This prints 12345 which is what I want. Essentially I match two characters at a time, and replacing with the first character. I used group capturing to do this.
The problem is, I'm having trouble with the second part of the homework, where I need to delete characters at index 0,2,4,...
I have done the following:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll(".(.)", "$1"));
This prints abcd5, but the correct answer must be abcd. My regex is only incorrect if the input string length is odd. If it's even, then my regex works fine.
I think I'm really close to the answer, but I'm not sure how to fix it.
You are indeed very close to the answer: just make matching the second char optional.
String s = "1a2b3c4d5";
System.out.println(s.replaceAll(".(.)?", "$1"));
// prints "abcd"
This works because:
Regex is greedy by default, it will take the second character if it's there
When the input is of odd length, the second char won't be there at the last replacement, but you'd still match one char (i.e. last char in input)
You can still use backreferences in substitution even if the group fails to match
It will substitute in the empty string, not "null"
This is different from Matcher.group(int), which returns null for failed groups
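The last two points can be checked with a short sketch. The class and method names are invented for the example; the regex is the one from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroup {
    // In the replacement, $1 for a group that did not take part becomes "".
    static String dropEvenIndexes(String s) {
        return s.replaceAll(".(.)?", "$1");
    }

    // Returns group 1 of the last match; null if the group did not take part.
    static String lastGroupOne(String s) {
        Matcher m = Pattern.compile(".(.)?").matcher(s);
        String last = "no match";
        while (m.find()) {
            last = m.group(1);
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println(dropEvenIndexes("1a2b3c4d5")); // abcd
        System.out.println(lastGroupOne("1a2b3c4d5"));    // null
    }
}
```

For the odd-length input the final match is just "5", so group 1 is null when read through the Matcher, yet the replacement string still works.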
References
regular-expressions.info/Optional
A closer look at the first part
Let's take a closer look at the first part of the homework:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "$1"));
// prints "12345"
Here you didn't have to use ? for the second char, but it "works" because even though you didn't match the last char, you didn't have to! The last char can remain unmatched, unreplaced, due to the problem specification.
Now suppose that we want to delete chars at index 1,3,5..., and put the chars at index 0,2,4... in brackets.
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "($1)"));
// prints "(1)(2)(3)(4)5"
A-ha!! Now you're experiencing the exact same problem with odd-length input! You couldn't match the last char with your regex, because your regex needs two chars, but there's only one char at the end for odd-length input!
The solution, again, is to make matching the second char optional:
String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).?", "($1)"));
// prints "(1)(2)(3)(4)(5)"
my regex is only incorrect if the input string length is odd. if it's even, then my regex works fine.
Change your expression to .(.)? - the question mark makes the second character optional, which means it doesn't matter whether the input length is odd or even.
Your regex needs 2 chars to match, so fails on the final char.
This regex:
".(.{0,1})"
will make the second char optional, so it will match your final '5' as well.