Checking if a string only contains certain characters


I have a string representing a 32 character long barcode made up of "|" and ":".
I want to check the validity of any given string to make sure it is a barcode. One of the tests is to check that the only symbols it contains are the two mentioned above. How can I check that?
At first I was using a delimiter, but I don't think that is the right way to go about this.
public boolean isValidBarCode(String barCode)
{
    barCode.useDelimiter("[|:]");
    if (barCode.length() == 32)
    {
        return true;
    }
    else
    {
        return false;
    }
}
I know there are other things I need to check in order to validate it as a barcode, but I'm asking only for the purposes of checking the symbols within the given string.
I'm a beginner programmer, so the help is greatly appreciated!

You can use a regex:
boolean correct = string.matches("[\\:\\|]+");
Explanation for the regex: it checks that the string consists of one or more characters (that's what the + suffix does), each being either : or |. Inside a character class neither : nor | has a special meaning, so the escapes are optional and [:|]+ works just as well; escaping them is harmless. Because a backslash must itself be escaped in a Java string literal, each escape is written with a double backslash.
Or you can simply write a short loop:
boolean correct = true;
for (int i = 0; i < string.length() && correct; i++) {
    char c = string.charAt(i);
    if (c != ':' && c != '|') {
        correct = false; // found a character that is neither ':' nor '|'
    }
}

Since you require the barcode to be exactly 32 characters long and consist only of the : and | characters, you should use a combination of length and regex checking:
boolean isCorrect = barCode.matches("[\\|\\:]*");
if (isCorrect && barCode.length() == 32) {
    // true case
} else {
    // false case
}
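The length check and the character check above can also be folded into a single pattern by using a bounded quantifier. A minimal sketch, assuming a null input should count as invalid; the class name BarcodeCheck is made up for illustration:

```java
final class BarcodeCheck {
    // Valid only if the string is exactly 32 characters long
    // and every character is either '|' or ':'.
    // Treating null as invalid is an assumption, not part of the question.
    static boolean isValidBarCode(String barCode) {
        return barCode != null && barCode.matches("[|:]{32}");
    }
}
```

Since String.matches always tests the whole string, no ^ or $ anchors are needed.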

boolean isBarCode = barCode.matches( "[\\|\\:]*" );

Related

Regex to consolidate multiple rules

I'm looking at optimising my string manipulation code and consolidating all of my replaceAll's to just one pattern if possible
Rules -
strip all special chars except -
replace space with -
condense consecutive - 's to just one -
Remove leading and trailing -'s
My code -
public static String slugifyTitle(String value) {
String slugifiedVal = null;
if (StringUtils.isNotEmpty(value))
slugifiedVal = value
.replaceAll("[ ](?=[ ])|[^-A-Za-z0-9 ]+", "") // strips all special chars except -
.replaceAll("\\s+", "-") // converts spaces to -
.replaceAll("--+", "-"); // replaces consecutive -'s with just one -
slugifiedVal = StringUtils.stripStart(slugifiedVal, "-"); // strips leading -
slugifiedVal = StringUtils.stripEnd(slugifiedVal, "-"); // strips trailing -
return slugifiedVal;
}
Does the job but obviously looks shoddy.
My test assertions -
Heading with symbols *~!##$%^&()_+-=[]{};',.<>?/ ==> heading-with-symbols
Heading with an asterisk* ==> heading-with-an-asterisk
Custom-id-&-stuff ==> custom-id-stuff
--Custom-id-&-stuff-- ==> custom-id-stuff

Disclaimer: I don't think a regex approach to this problem is wrong, or that this is an objectively better approach. I am merely presenting an alternative approach as food for thought.
I tend to avoid regex for problems where you have to ask how to solve them with regex, because that suggests you will struggle to maintain the solution later. Regexes are opaque: "just do this" is only obvious once you already know to do this.
Some problems typically solved with regex, like this one, can be solved using imperative code. It tends to be more verbose, but it uses simple, apparent, code constructs; it's easier to debug; and can be faster because it doesn't involve the full "machinery" of the regex engine.
static String slugifyTitle(String value) {
    boolean appendHyphen = false;
    StringBuilder sb = new StringBuilder(value.length());
    // Go through value one character at a time...
    for (int i = 0; i < value.length(); i++) {
        char c = value.charAt(i);
        if (isAppendable(c)) {
            // We have found a character we want to include in the string.
            if (appendHyphen) {
                // We previously found character(s) that we want to append a single
                // hyphen for.
                sb.append('-');
                appendHyphen = false;
            }
            sb.append(c);
        } else if (requiresHyphen(c)) {
            // We want to replace hyphens or spaces with a single hyphen.
            // Only append a hyphen if it's not going to be the first thing in the output.
            // Doesn't matter if this is set for trailing hyphen/whitespace,
            // since we then never hit the "isAppendable" condition.
            appendHyphen = sb.length() > 0;
        } else {
            // Other characters are simply ignored.
        }
    }
    // You can lowercase when appending the character, but `Character.toLowerCase()`
    // recommends using `String.toLowerCase` instead.
    return sb.toString().toLowerCase(Locale.ROOT);
}
// Some predicate on characters you want to include in the output.
static boolean isAppendable(char c) {
    return (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9');
}
// Some predicate on characters you want to replace with a single '-'.
static boolean requiresHyphen(char c) {
    return c == '-' || Character.isWhitespace(c);
}
(This code is wildly over-commented, for the purpose of explaining it in this answer. Strip out the comments and unnecessary things like the else, it's actually not super complicated).

Consider the following regex parts:
Any special chars other than -: [\p{S}\p{P}&&[^-]]+ (character class subtraction)
One or more whitespace characters or hyphens: [\s-]+ (each such run will be replaced with a single -)
You will still need to remove leading/trailing hyphens, it will be a separate post-processing step. If you wish, you can use a ^-+|-+$ regex.
So, you can only reduce this to three .replaceAll invocations keeping the code precise and readable:
public static String slugifyTitle(String value) {
    String slugifiedVal = null;
    if (value != null && !value.trim().isEmpty())
        slugifiedVal = value.toLowerCase()
                .replaceAll("[\\p{S}\\p{P}&&[^-]]+", "") // strips all special chars except -
                .replaceAll("[\\s-]+", "-") // converts spaces/hyphens to -
                .replaceAll("^-+|-+$", ""); // removes leading/trailing hyphens
    return slugifiedVal;
}
See the Java demo:
List<String> strs = Arrays.asList("Heading with symbols *~!##$%^&()_+-=[]{};',.<>?/",
        "Heading with an asterisk*",
        "Custom-id-&-stuff",
        "--Custom-id-&-stuff--");
for (String str : strs)
    System.out.println("\"" + str + "\" => " + slugifyTitle(str));
Output:
"Heading with symbols *~!##$%^&()_+-=[]{};',.<>?/" => heading-with-symbols
"Heading with an asterisk*" => heading-with-an-asterisk
"Custom-id-&-stuff" => custom-id-stuff
"--Custom-id-&-stuff--" => custom-id-stuff
NOTE: if your strings can contain any Unicode whitespace, replace "[\\s-]+" with "(?U)[\\s-]+".
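Since the question is about optimising, one thing worth knowing is that String.replaceAll compiles its pattern on every call. If slugifyTitle is called often, the three patterns can be compiled once and reused. A minimal sketch using the same three patterns as above; the class name Slugs and the constant names are made up for illustration, and lowercasing with Locale.ROOT is an assumption:

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class Slugs {
    // Compiled once, reused for every call.
    private static final Pattern SPECIAL_CHARS = Pattern.compile("[\\p{S}\\p{P}&&[^-]]+");
    private static final Pattern SEPARATORS = Pattern.compile("[\\s-]+");
    private static final Pattern EDGE_HYPHENS = Pattern.compile("^-+|-+$");

    static String slugifyTitle(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null; // same result as the version above for empty input
        }
        String s = value.toLowerCase(Locale.ROOT);
        s = SPECIAL_CHARS.matcher(s).replaceAll("");  // strip symbols and punctuation except '-'
        s = SEPARATORS.matcher(s).replaceAll("-");    // collapse whitespace/hyphen runs to one '-'
        return EDGE_HYPHENS.matcher(s).replaceAll(""); // drop leading/trailing hyphens
    }
}
```

Whether this is measurably faster depends on how often the method runs; for occasional calls the plain replaceAll chain is probably fine.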

Java regex require at least one letter and one digit. Also allow any special characters

I am looking for a regex that validates that a string contains at least one letter and one digit, may contain any special characters, and has a minimum length of 8.
I tried the regex below, but it only checks for:
One digit
One letter
The # and - special symbols
(?=(?:.*[a-zA-Z]){1,})(?=(?:.*[#-]){0,})(?=(?:.*[0-9]){1,})^[a-zA-Z0-9#-]*$
But I want it to allow any special characters. (Special characters are optional, but at least one letter and one digit must be in the string.)

Don't use regex. It's easier just to iterate the string character-by-character:
boolean foundDigit = false;
boolean foundLetter = false;
for (int i = 0; i < str.length(); ++i) {
    char c = str.charAt(i);
    if (Character.isDigit(c)) { foundDigit = true; }
    else if (Character.isLetter(c)) { foundLetter = true; }
}
return str.length() >= 8 && foundDigit && foundLetter;
The requirement of "optional special character" seems to be unnecessary to check, since you don't specify that the string can only contain certain characters, and it doesn't have to be there.

The same logic as @Andy Turner's answer, but using streams:
public static boolean validate(String str) {
    return str.length() > 7 &&
            str.chars().filter(c -> Character.isLetter(c)).count() > 0 &&
            str.chars().filter(c -> Character.isDigit(c)).count() > 0;
}
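If a regex is still wanted, the same rule can probably be written with two lookaheads and a length quantifier. A minimal sketch, assuming "letter" means ASCII a-z/A-Z and "digit" means 0-9 (narrower than Character.isLetter and Character.isDigit), and that null counts as invalid; the class name PasswordRule is made up:

```java
final class PasswordRule {
    // (?=.*[A-Za-z])  at least one ASCII letter somewhere in the string
    // (?=.*[0-9])     at least one ASCII digit somewhere in the string
    // .{8,}           at least 8 characters of any kind (except line terminators)
    static boolean isValid(String str) {
        return str != null && str.matches("(?=.*[A-Za-z])(?=.*[0-9]).{8,}");
    }
}
```

Because . does not match line terminators by default, a string containing a newline would be rejected by this sketch.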

How to make multiple inputs of a single character register as one character?

I'm unsure of the code for this, but if one were to input "oooooooooo" after a prompt (like in an if-statement or something where the program registers "o" as "one" or something), how could you make "oooooooooo" translate into "o"?
Would one have to manually write out the various iterations of "o" (like "oo", "ooo", "oooo", etc.)? Or would it be similar to something like the ignore-case method, where O and o become the same, so that "ooo..." and "o" end up as the same string?

Although probably overkill for this one use-case, it would be helpful to learn how to use regexes in the future. Java provides regex support in the java.util.regex package, through the Pattern and Matcher classes. For example, the regex /o+ne/ would match any string "o...ne" with at least one "o".
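To make the /o+ne/ idea concrete, here is a minimal sketch with java.util.regex. The class and method names are made up, and ignoring case and surrounding whitespace is an assumption about what the prompt should accept:

```java
import java.util.regex.Pattern;

final class RepeatedLetterDemo {
    // One or more 'o' followed by "ne", ignoring case.
    private static final Pattern ONE = Pattern.compile("o+ne", Pattern.CASE_INSENSITIVE);

    // True if the whole (trimmed) input is "one" with any number of repeated o's.
    static boolean meansOne(String input) {
        return ONE.matcher(input.trim()).matches();
    }
}
```

With this, "one", "ooooone" and "OOOne" are all accepted, while "ne" and "onee" are not.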

Using regex:
public static String getSingleCharacter(String input) {
    if (input == null || input.length() == 0) return null;
    if (input.length() == 1) return input;
    String lower = input.toLowerCase();
    // (.)\1* matches only when every character equals the first one.
    if (lower.matches("(.)\\1*")) {
        return lower.substring(0, 1);
    }
    return null;
}
If the method returns null, the characters are not all the same; otherwise it returns that single character as a string.

Use the regular expression /(.)\1+/ and String#replaceAll() to match runs of two or more of the same character and then replace the match with the value of the first match group identified with $1 as follows:
public static String squeeze(String input) {
    return input.replaceAll("(.)\\1+", "$1");
}
String result = squeeze("aaaaa bbbbbbb cc d");
assert(result.equals("a b c d"));

public String condense(String input) {
    // Return the input unchanged as soon as two neighbouring characters differ.
    for (int i = 0; i < input.length() - 1; i++) {
        if (input.charAt(i) != input.charAt(i + 1)) {
            return input;
        }
    }
    // Every character is the same (or the string is empty or one character long).
    return input.isEmpty() ? input : input.substring(0, 1);
}
This loops through the string comparing each character with the next one. If every character in the string is the same, it returns a condensed, one-character version of the string; otherwise it returns the input unchanged.

Compare String to a Regex

I was asked in a job interview to write a function that gets a string r representing a regex, and an input string s, and to say whether the two match.
The regex may contain the following symbols:
a-z
. match any single character
* look at the previous character in the regex, and match it zero or more times. (The regexes are legal, so the sequence .* can't appear in the regex.)
The input string may contain only a-z
Using automata wasn't practical, since I had to write the code in 30 minutes, and building an automaton out of the regex would have been out of scope (the interviewer said so too).
Also, because of the way * was defined, using Java's built-in methods to check whether the regex matches the string isn't an option either (I assume).
I solved it in the following manner:
public static boolean isMatch(String regex, String s){
if(regex.length()==0 && s.length()==0)
return true;
int j=0,i=0;
boolean match=false;
for(;i<regex.length() && j<s.length() && !match;i++)
{
if(regex.charAt(i)!='.' && regex.charAt(i)!='*')
{
if(regex.charAt(i)!=s.charAt(j))
return false;
else
j++;
}
else if(regex.charAt(i)=='.')
{
j++;
}
else
{
char n=regex.charAt(i-1);
int mone= countChar(s, n, j);
for(int k=0;k<=mone && !match;k++)
{
match=isMatch(regex.substring(i+1),s.substring(j+k));
if(match)
return true;
}
}
}
return match || (j==s.length() && i==regex.length());
}
It works, but the complexity is O(n!), where n is the length of the regex.
I would like to know whether there is another, more efficient way to solve this.
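One common way to avoid that kind of blow-up is to memoize on the pair (position in the regex, position in the string), so each pair is solved at most once and the work is bounded by the product of the two lengths. A minimal sketch, assuming the usual reading in which x* matches zero or more copies of x (so "a*b" matches "b"); if the interviewer's definition differs, the star branch would need adjusting. The class name SimpleRegexMatcher and the helper match are made up for illustration:

```java
final class SimpleRegexMatcher {

    static boolean isMatch(String regex, String s) {
        // memo[i][j]: 0 = not computed yet, 1 = match, -1 = no match,
        // for regex.substring(i) against s.substring(j).
        byte[][] memo = new byte[regex.length() + 1][s.length() + 1];
        return match(regex, s, 0, 0, memo);
    }

    private static boolean match(String regex, String s, int i, int j, byte[][] memo) {
        if (memo[i][j] != 0) {
            return memo[i][j] == 1;
        }
        boolean result;
        if (i == regex.length()) {
            // Regex exhausted: it is a match only if the string is exhausted too.
            result = j == s.length();
        } else {
            boolean firstMatches = j < s.length()
                    && (regex.charAt(i) == '.' || regex.charAt(i) == s.charAt(j));
            boolean starFollows = i + 1 < regex.length() && regex.charAt(i + 1) == '*';
            if (starFollows) {
                // Either skip "x*" entirely, or consume one matching
                // character and stay on the same "x*".
                result = match(regex, s, i + 2, j, memo)
                        || (firstMatches && match(regex, s, i, j + 1, memo));
            } else {
                result = firstMatches && match(regex, s, i + 1, j + 1, memo);
            }
        }
        memo[i][j] = (byte) (result ? 1 : -1);
        return result;
    }
}
```

The recursion is as deep as the two lengths combined, so for very long inputs an iterative table would be the safer variant.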

Regex to check if a single quote is preceded by another single quote

I would like to write a regex to validate if a single quote is preceded by another single quote.
Valid strings:
azerty''uiop
aze''rty''uiop
''azertyuiop
azerty''uiop''
azerty ''uiop''
azerty''''uiop
azerty''''uiop''''
Invalid strings:
azerty'uiop
aze'rty'uiop
'azertyuiop
azerty'uiop'
azerty 'uiop'
azerty'''uiop

It can be done in one line:
inputString.matches("(?:[^']|'')*+");
The regex simply means, the string can contain 0 or more of
Non-quote character [^']
OR
A pair of consecutive quotes ''
I used the possessive version (*+) of the zero-or-more quantifier (*). A possessive quantifier never gives back what it has already matched, so the engine does not backtrack into it. Simply put, it is an optimization.
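A quick way to convince yourself is to run the pattern against a few of the strings from the question. A minimal sketch; the class name QuotePairs and the wrapper method are made up for illustration:

```java
final class QuotePairs {
    // Valid when the whole string is made of non-quote characters
    // and pairs of consecutive single quotes.
    static boolean isValid(String input) {
        return input.matches("(?:[^']|'')*+");
    }
}
```

With this, "azerty''uiop" and "azerty''''uiop''''" come out valid, while "azerty'uiop" and "azerty'''uiop" come out invalid. A string with no quotes at all is also reported as valid.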

No need for a regex: use .replace() to remove every pair of consecutive single quotes, then test whether a single quote is still left; if so, the string is invalid:
if (input.replace("''", "").indexOf('\'') != -1)
    // Not valid!
Strings with no single quotes at all pass this test, since there is nothing left to find. Wrapped in a method:
public boolean isValid(final String input)
{
    final String s = input.replace("''", "");
    return s.indexOf('\'') == -1;
}

Do you want a very fast solution? Try the following:
public static boolean isValid(String str) {
    char[] chars = str.toCharArray();
    int found = 0;
    for (int i = 0; i < chars.length; i++) {
        char c = chars[i];
        if (c == '\'') {
            found++;
        } else {
            if (found > 0 && found % 2 != 0) {
                return false;
            }
            found = 0;
        }
    }
    if (found > 0 && found % 2 != 0) {
        return false;
    }
    return true;
}

You can use the code below too:
str.matches("([^\']*(\'){2}[^\']*)+");
I think "([^\']*(\'){2}[^\']*)+" is easy for beginners to grasp, but it is not the best way to do this: on long input it runs into catastrophic backtracking.
