Compare String to a Regex - java

I was asked in a job interview to write a function that gets a string r representing a regex, and an input string s, and to say whether the two match.
The regex may contain the following symbols:
a-z
. match any single character
* look at the previous character in the regex, and match it zero or more times. (the regexes are legal so the sequence .* can't appear in the regex)
The input string may contain only a-z
Using automata wasn't practical since I had to write the code in 30 minutes, and building an automaton out of the regex would be out of scope (the interviewer said so too).
Also, because of the way * is defined, using Java's built-in methods to check whether the regex matches the string isn't an option either (I assume).
I solved it in the following manner:
public static boolean isMatch(String regex, String s){
if(regex.length()==0 && s.length()==0)
return true;
int j=0,i=0;
boolean match=false;
for(;i<regex.length() && j<s.length() && !match;i++)
{
if(regex.charAt(i)!='.' && regex.charAt(i)!='*')
{
if(regex.charAt(i)!=s.charAt(j))
return false;
else
j++;
}
else if(regex.charAt(i)=='.')
{
j++;
}
else
{
char n=regex.charAt(i-1);
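// countChar is not shown in the question; presumably it counts consecutive occurrences of n in s starting at index j
int mone= countChar(s, n, j);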
for(int k=0;k<=mone && !match;k++)
{
match=isMatch(regex.substring(i+1),s.substring(j+k));
if(match)
return true;
}
}
}
return match || (j==s.length() && i==regex.length());
}
It works, but the complexity is O(n!), where n is the length of the regex.
I would like to know whether there is another, more efficient way to solve this.
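One possible direction, offered as a sketch rather than as what the interviewer expected, is dynamic programming over prefixes of the regex and the input. The class name SimpleRegexMatcher is hypothetical. The table entry dp[i][j] records whether the first i regex symbols match the first j input characters, which gives O(m·n) time for a regex of length m and an input of length n. The sketch also tolerates ".*" even though the question rules it out, and it assumes a leading '*' simply matches nothing.

```java
final class SimpleRegexMatcher {
    // dp[i][j] is true when regex[0..i) matches s[0..j).
    static boolean isMatch(String regex, String s) {
        int m = regex.length();
        int n = s.length();
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true; // empty regex matches empty input

        for (int i = 1; i <= m; i++) {
            char c = regex.charAt(i - 1);
            if (c == '*') {
                if (i < 2) {
                    continue; // assumption: a leading '*' is illegal and matches nothing
                }
                char prev = regex.charAt(i - 2);
                for (int j = 0; j <= n; j++) {
                    // Zero occurrences: skip the "x*" pair entirely.
                    boolean zero = dp[i - 2][j];
                    // One more occurrence: "x*" also absorbs s[j-1].
                    boolean more = j > 0 && dp[i][j - 1]
                            && (prev == '.' || prev == s.charAt(j - 1));
                    dp[i][j] = zero || more;
                }
            } else {
                // A literal or '.' consumes exactly one input character.
                for (int j = 1; j <= n; j++) {
                    dp[i][j] = dp[i - 1][j - 1]
                            && (c == '.' || c == s.charAt(j - 1));
                }
            }
        }
        return dp[m][n];
    }
}
```

For example, SimpleRegexMatcher.isMatch("a*b", "aaab") and SimpleRegexMatcher.isMatch("a*a", "aaa") both return true, while SimpleRegexMatcher.isMatch("a.c", "ac") returns false.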

Related

Java regex require at least one letter and one digit. Also allow any special characters

I tried searching for a regex which can validate if a string contains at least one letter and one digit and there can be any special character as well with a minimum 8 length.
I tried the regex below, but it only checks for:
One digit
One letter
# and - special symbol
(?=(?:.*[a-zA-Z]){1,})(?=(?:.*[#-]){0,})(?=(?:.*[0-9]){1,})^[a-zA-Z0-9#-]*$
But I want it to allow any special characters. (Special characters are optional, but at least one letter and one digit must be in the string.)
Don't use regex. It's easier just to iterate the string character-by-character:
boolean foundDigit = false;
boolean foundLetter = false;
for (int i = 0; i < str.length(); ++i) {
    char c = str.charAt(i); // the character being inspected
    if (Character.isDigit(c)) { foundDigit = true; }
    else if (Character.isLetter(c)) { foundLetter = true; }
}
return str.length() >= 8 && foundDigit && foundLetter;
The requirement of "optional special character" seems to be unnecessary to check, since you don't specify that the string can only contain certain characters, and it doesn't have to be there.
The same logic as @Andy Turner's answer, but using streams:
public static boolean validate(String str) {
    // at least 8 characters, at least one letter and at least one digit
    return str.length() >= 8
        && str.chars().anyMatch(Character::isLetter)
        && str.chars().anyMatch(Character::isDigit);
}
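If a single regex is still wanted, a sketch with two lookaheads might look like the following. The class name PasswordCheck is hypothetical, and the pattern assumes ASCII letters and digits only (unlike Character.isLetter, which accepts any Unicode letter). Because '.' does not match line terminators by default, strings containing newlines are rejected.

```java
import java.util.regex.Pattern;

final class PasswordCheck {
    // (?=.*[A-Za-z]) : at least one ASCII letter somewhere
    // (?=.*[0-9])    : at least one ASCII digit somewhere
    // .{8,}          : at least 8 characters of any kind (special characters allowed)
    private static final Pattern VALID =
            Pattern.compile("^(?=.*[A-Za-z])(?=.*[0-9]).{8,}$");

    static boolean isValid(String s) {
        return s != null && VALID.matcher(s).matches();
    }
}
```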

How to make multiple inputs of a single character register as one character?

I'm unsure of the code for this, but if one were to input "oooooooooo" after a prompt (like in an if-statement or something where the program registers "o" as "one" or something), how could you make "oooooooooo" translate into "o"?
Would one have to write down manually various iterations of "o" (like, "oo" and "ooo" and "oooo"...etc.). Would it be similar to something like the ignore case method where O and o become the same? So "ooo..." and "o" end up as the same string.
Although probably overkill for this one use case, it would be helpful to learn how to use regexes for the future. Java provides regex support in the java.util.regex package (the Pattern and Matcher classes). For example, the regex /o+ne/ would match any string "o...ne" with at least one "o".
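As a rough illustration of the o+ne idea, the sketch below compiles the pattern once and tests whole inputs against it. The class and method names are hypothetical, and the CASE_INSENSITIVE flag is an added assumption so that "O" and "o" are treated alike.

```java
import java.util.regex.Pattern;

final class RepeatedLetterDemo {
    // "o+" means one or more 'o' characters, followed by the literal "ne".
    private static final Pattern ONE =
            Pattern.compile("o+ne", Pattern.CASE_INSENSITIVE);

    static boolean looksLikeOne(String input) {
        // matches() requires the whole input to fit the pattern.
        return ONE.matcher(input).matches();
    }
}
```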
using regex:
public static String getSingleCharacter(String input){
if(input == null || input.length() == 0) return null;
if(input.length() == 1) return input;
if(!input.toLowerCase().matches("^\\w*?(\\w)(?!\\1|$)\\w*$")){
return Character.toString(input.toLowerCase().charAt(0));
}
return null;
}
If the method returns null, the characters are not all the same; otherwise it returns that single character as a string.
Use the regular expression /(.)\1+/ and String#replaceAll() to match runs of two or more of the same character and then replace the match with the value of the first match group identified with $1 as follows:
public static String squeeze(String input) {
return input.replaceAll("(.)\\1+", "$1");
}
String result = squeeze("aaaaa bbbbbbb cc d");
assert(result.equals("a b c d"));
public String condense(String input) {
    if (input.isEmpty()) {
        return input;
    }
    // compare every character with the first one (chars are compared with !=, not Strings)
    for (int i = 1; i < input.length(); i++) {
        if (input.charAt(i) != input.charAt(0)) {
            return input;
        }
    }
    return input.substring(0, 1);
}
This loops through the entire string and compares each character with the first one. If every character in the string is the same, it returns a condensed, one-character version of the string; otherwise it returns the input unchanged.

How can I look for two specific characters in a string?

String abc = "||:::|:|::";
It should return true if there's two | and three : appearances.
I'm not sure how to use "regex" or if it's the right method to use. There's no specific pattern in the abc String.
Using a regex would be a bad idea, especially if there's no specific order to them. Make a function that counts the number of times a character appears in a string, and use that:
public int count(String base, char toFind)
{
int count = 0;
char[] haystack = base.toCharArray();
for (int i = 0; i < haystack.length; i++)
if (haystack[i] == toFind)
count++;
return count;
}
String abc = "||:::|:|::";
if (count(abc, '|') >= 2 && count(abc, ':') >= 3)
{
//Do some code here
}
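My favorite method for counting the occurrences of a character in a string is int num = s.length() - s.replace("|","").length(); (use replace rather than replaceAll, because replaceAll would treat | as a regex alternation and remove nothing). You can do that for both characters and test the resulting ints.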
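The length-difference idea can be sketched as below. The class and method names are hypothetical. The sketch uses String.replace, which treats its argument literally, whereas replaceAll would interpret "|" as a regex.

```java
final class CharCounter {
    // Counts occurrences of target by measuring how much shorter the
    // string becomes once every occurrence is removed.
    static int countByLengthDiff(String s, char target) {
        return s.length() - s.replace(String.valueOf(target), "").length();
    }

    // The check from the question: at least two '|' and at least three ':'.
    static boolean hasEnough(String s) {
        return countByLengthDiff(s, '|') >= 2 && countByLengthDiff(s, ':') >= 3;
    }
}
```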
If you want to test all conditions in one regex you can use look-ahead (?=condition).
Your regex can look like
String regex =
"(?=(.*[|]){2})"//contains two |
+ "(?=(.*:){3})"//contains three :
+ "[|:]+";//is built only from : and | characters
Now you can use it with matches like
String abc = "||:::|:|::";
System.out.println(abc.matches(regex));//true
abc = "|::::::";
System.out.println(abc.matches(regex));//false
Anyway, you can avoid regex and write your own method that counts the | and : characters in your string and checks that the counts are greater than or equal to 2 and 3. You can use StringUtils.countMatches from Apache Commons Lang, so your test code could look like
public static boolean testString(String s){
int pipes = StringUtils.countMatches(s, "|");
int colons = StringUtils.countMatches(s, ":");
return pipes>=2 && colons>=3;
}
or
public static boolean testString(String s){
return StringUtils.countMatches(s, "|")>=2
&& StringUtils.countMatches(s, ":")>=3;
}
This assumes you are looking for two '|' one after the other, followed later by three consecutive ':'. You can do that with the following single regular expression (| must be escaped, since an unescaped | means alternation):
".*\\|\\|.*:::.*"
If you just want to check that the characters are present, irrespective of their order, use String.matches with the two regular expressions below, combined with a logical AND:
".*\\|.*\\|.*"
".*:.*:.*:.*"
Here is a cheat sheet for regular expressions. It's fairly simple to learn. Look at groups and quantifiers in the document to understand the above expressions.
Haven't tested it, but this should work
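Pattern.compile("^(?=(?:.*[|]){2,})(?=(?:.*:){3,}).*$");
Each lookahead (?=...) scans the string from the start: the first checks whether the allowed character | occurs at least twice (not necessarily consecutively), and the second does the same for :, which has to occur at least three times. The trailing .* then consumes the string itself, so the pattern also succeeds with matches().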

Checking if a string only contains certain characters

I have a string representing a 32 character long barcode made up of "|" and ":".
I want to check the validity of any given string to make sure it is a barcode. One of the tests is to check that the only symbols it contains are the two mentioned above. How can I check that?
At first I was using a delimiter, but I don't think that is the right way to go about this.
public boolean isValidBarCode (String barCode)
{
barCode.useDelimiter ("[|:]");
if (barCode.length() == 32)
{
return true;
}
else
{
return false;
}
}
I know there are other things I need to check in order to validate it as a barcode, but I'm asking only for the purposes of checking the symbols within the given string.
I'm a beginner programmer, so the help is greatly appreciated!
You can use a regex:
boolean correct = string.matches("[\\:\\|]+");
Explanation of the regex: it checks that the string consists of one or more characters (that's what the + suffix does), each being either : or |. The backslashes are not strictly needed here: inside a character class neither : nor | is special, so [:|]+ works too. Where a regex escape is used, the backslash itself must be escaped in a Java string literal, hence the double backslash.
Or you can simply code a five-line algorithm using a loop (correct must start as true, otherwise the loop never runs):
boolean correct = true;
for (int i = 0; i < string.length() && correct; i++) {
char c = string.charAt(i);
if (c != ':' && c != '|') {
correct = false;
}
}
Since you require the barcode to be exactly 32 characters long and consist only of the : and | characters, you should use a combination of length and regex checking:
boolean isCorrect = barCode.matches( "[\\|\\:]*" );
if(isCorrect && barCode.length() == 32) {
//true case
} else {
//false case
}
boolean isBarCode = barCode.matches( "[\\|\\:]*" );
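The length check and the character check can also be folded into one pattern with a counted quantifier. The following is a minimal sketch, assuming the 32-character length from the question; the class name BarcodeCheck is hypothetical.

```java
import java.util.regex.Pattern;

final class BarcodeCheck {
    // Exactly 32 characters, each of which is '|' or ':'.
    // Neither character needs escaping inside a character class.
    private static final Pattern BARCODE = Pattern.compile("[|:]{32}");

    static boolean isValidBarCode(String barCode) {
        return barCode != null && BARCODE.matcher(barCode).matches();
    }
}
```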

Regex to check if a single quote is preceded by another single quote

I would like to write a regex to validate that a single quote is preceded by another single quote.
Valid strings:
azerty''uiop
aze''rty''uiop
''azertyuiop
azerty''uiop''
azerty ''uiop''
azerty''''uiop
azerty''''uiop''''
Invalid strings:
azerty'uiop
aze'rty'uiop
'azertyuiop
azerty'uiop'
azerty 'uiop'
azerty'''uiop
It can be done in one line:
inputString.matches("(?:[^']|'')*+");
The regex simply means, the string can contain 0 or more of
Non-quote character [^']
OR
A pair of consecutive quotes ''
I used possessive version (*+) of 0 or more quantifier (*). Since it would be lengthy to explain what possessive quantifier means, I will refer you to here to learn about it. Simply put, it is an optimization.
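To see the pattern in action, here is a small sketch that wraps it in a helper (the class and method names are hypothetical). Because the two alternatives can never match at the same position in different ways, the possessive quantifier does not change which strings are accepted; it only prevents pointless backtracking.

```java
final class QuotePairCheck {
    // Zero or more of: a non-quote character, or a pair of consecutive quotes.
    // String.matches requires the whole input to match.
    static boolean hasOnlyPairedQuotes(String input) {
        return input.matches("(?:[^']|'')*+");
    }
}
```

With this helper, "azerty''uiop" and "azerty''''uiop''''" are accepted, while "'azertyuiop" and "azerty'''uiop" are rejected. Strings with no quotes at all are accepted as well.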
No need for a regex, just use .replace() to replace all sequences of two single quotes by nothing, then test whether you still find a single quote; if yes, the string is invalid:
if (input.replace("''", "").indexOf('\'') != -1)
// Not valid!
If you also want to consider that strings with no single quotes are valid, you'll have to create a temporary variable:
public boolean isValid(final String input)
{
final String s = input.replace("''", "");
return s.equals(input) ? true : s.indexOf('\'') == -1;
}
Do you want a very fast solution? Try the next:
public static boolean isValid(String str) {
char[] chars = str.toCharArray();
int found = 0;
for (int i = 0; i < chars.length; i++) {
char c = chars[i];
if (c == '\'') {
found++;
} else {
if (found > 0 && found % 2 != 0) {
return false;
}
found = 0;
}
}
if (found > 0 && found % 2 != 0) {
return false;
}
return true;
}
You can use the code below too:
str.matches("([^\']*(\'){2}[^\']*)+");
I think "([^\']*(\'){2}[^\']*)+" is easy for beginners to grasp, but it is not the best way to do this: it runs into catastrophic backtracking on long input.
