How to check if only certain characters appear in a String? [duplicate] - java

This question already has answers here:
Verify if String is hexadecimal
(8 answers)
Closed 8 years ago.
I need user input for a hexadecimal number so as long as their input contains the characters A-F or 0-9 it won't re-prompt them.
This is what I have. It runs as long as the input string contains A-F and/or 0-9, but it still accepts the input if you add other characters, which I don't want.
do {
System.out.print("Enter a # in hex: ");
inputHexNum = keyboard.next();
} while(!(inputHexNum.toUpperCase().matches(".*[A-F0-9].*")));

Could you not change your regex to be [A-F0-9]+?
So your code would look like the following:
do {
System.out.print("Enter a # in hex: ");
inputHexNum = keyboard.next();
} while(!(inputHexNum.toUpperCase().matches("[A-F0-9]+")));
As I understand the question, the problem with your current regex is that it allows any character to occur zero or more times, followed by a hex character, followed by any character zero or more times again. The regex [A-F0-9]+ instead requires the entire input to consist of one or more characters, each of which is a letter A-F (uppercase) or a digit 0-9.
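The difference between the two patterns can be checked directly. The following is a minimal sketch for illustration; the class name HexCheck and the helper names isHex and containsHexDigit are hypothetical and do not come from the question.

```java
public class HexCheck {
    // True only if the whole string is one or more hex digits.
    // String.matches requires the entire input to match the pattern.
    static boolean isHex(String s) {
        return s.toUpperCase().matches("[A-F0-9]+");
    }

    // The pattern from the question: true if the string merely
    // contains at least one hex digit somewhere.
    static boolean containsHexDigit(String s) {
        return s.toUpperCase().matches(".*[A-F0-9].*");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1a2F", "--0--", "JFK", ""}) {
            System.out.println("\"" + s + "\": isHex=" + isHex(s)
                    + ", containsHexDigit=" + containsHexDigit(s));
        }
    }
}
```

In the do/while loop, the condition would then be the negation of isHex(inputHexNum).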

Your regular expression probably doesn't do what you want. .* matches anything at all (empty string up to any number of arbitrary characters). Then you expect a single hex character followed again by anything.
So these would be valid inputs:
--0--
a
JFK
You should either say "I want a string which contains only valid hex digits". Then your condition would be:
while(!(inputHexNum.toUpperCase().matches("[A-F0-9]+")));
or you can check for any illegal characters with the pattern [^A-F0-9]. In this case, you'd need to create a Matcher yourself:
Pattern illegalCharacters = Pattern.compile("[^A-F0-9]");
Matcher matcher;
do {
...
matcher = illegalCharacters.matcher(inputHexNum.toUpperCase());
} while( matcher.find() );
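A self-contained sketch of the illegal-character approach, for illustration only. It obtains the Matcher through Pattern.matcher and then calls find(). The class name IllegalCharCheck and the helper needsReprompt are hypothetical, and the extra empty-string check is an assumption added here, because a pattern that only looks for illegal characters finds nothing in an empty input.

```java
import java.util.regex.Pattern;

public class IllegalCharCheck {
    // Compiled once and reused; matches any single character
    // that is not an uppercase hex digit.
    private static final Pattern ILLEGAL = Pattern.compile("[^A-F0-9]");

    // True if the input is empty or contains at least one non-hex character.
    static boolean needsReprompt(String input) {
        String upper = input.toUpperCase();
        // find() looks for a match anywhere in the input,
        // whereas matches() would require the whole input to match.
        return upper.isEmpty() || ILLEGAL.matcher(upper).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"beef", "12G", "0x1F", ""}) {
            System.out.println("\"" + s + "\" needs re-prompt: " + needsReprompt(s));
        }
    }
}
```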

The regular expression that you are using matches every string that contains at least one hex digit. Judging from the first paragraph of the question this is exactly what you want. This is because "." matches any character (but possibly not linebreaks), so ".*" matches any (possibly empty) sequence of characters. Thus the regex ".*[A-F0-9].*" means "first, some arbitrary characters, then a hex digit, then some more characters". But from the second paragraph of the question it looks like you want to use the regex "[A-F0-9]+" which means "some hex digits (but at least one, and nothing else)". I assume you are confused about what needs to be done, but actually want the second.


Having difficulty understanding Java regex interpretation [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
Can someone help me with the following Java regex expression? I've done some research but I'm having a hard time putting everything together.
The regex:
"^-?\\d+$"
My understanding of what each symbol does:
^ = matches the beginning of the line
- = indicates a range
? = does not occur or occurs once
\\d = matches the digits
+ = matches one or more of the previous thing.
$ = matches end of the line
Is the regex saying it only wants matches that start or end with digits? But where do - and ? come in?
- only indicates a range if it's within a character class (i.e. square brackets []). Otherwise, it's a normal character like any other. With that in mind, this regex matches the following examples:
"-2"
"3"
"-700"
"436"
That is, a positive or negative integer: at least one digit, optionally preceded by a minus sign.
A regex is read piece by piece. The correct way to read yours is:
^ start of input
-? optional minus character
\\d+ one or more digits
$ end of input
This regex matches any positive or negative integer, like 0, -15, 558, -19663, ...
For details, see this post: Reference - What does this regex mean?
"^-?\\d+$" is not a regex, it's a Java string literal.
Once the compiler has parsed the string literal, the string value is ^-?\d+$, which is a regex matching like this:
^ Matches beginning of input
- Matches a minus sign
? Makes previous match (minus sign) optional
\d Matches a digit (0-9)
+ Makes previous match (digit) match repeatedly (1 or more times)
$ Matches end of input
All-in-all, the regex matches a positive or negative integer number of unlimited length.
Note: A - only denotes a range when inside a [] character class, e.g. [4-7] is the range of characters between '4' and '7', while [3-] and [-3] are not ranges since the start/end value is missing, so they both just match a 3 or - character.
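The breakdown above can be tried out with a short program. This is an illustrative sketch; the class name IntegerRegexDemo and the helper isInteger are hypothetical.

```java
import java.util.regex.Pattern;

public class IntegerRegexDemo {
    // The Java string literal "^-?\\d+$" denotes the regex ^-?\d+$
    private static final Pattern INTEGER = Pattern.compile("^-?\\d+$");

    // True if s is an optional minus sign followed by one or more digits.
    static boolean isInteger(String s) {
        return INTEGER.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"-2", "3", "-700", "436", "--5", "4-7", "12a", "-"}) {
            System.out.println(s + " -> " + isInteger(s));
        }
        // Inside a character class, '-' forms a range only between two endpoints.
        System.out.println("5 in [4-7]: " + "5".matches("[4-7]"));
        // With a missing endpoint, '-' is an ordinary character.
        System.out.println("- in [3-]: " + "-".matches("[3-]"));
        System.out.println("- in [-3]: " + "-".matches("[-3]"));
    }
}
```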

Mask all SSN with only partial Mask from a file with multiple SSNs

I'll start by admitting that I am horrible with regular expressions. I want to find every instance of a Social Security number in a string and mask everything except the dashes (-) and the last 4 digits of the SSN.
Example
String someStrWithSSN = "This is an SSN,123-31-4321, and here is another 987-65-8765";
Pattern formattedPattern = Pattern.compile("^\\d{9}|^\\d{3}-\\d{2}-\\d{4}$");
Matcher formattedMatcher = formattedPattern.matcher(someStrWithSSN);
while (formattedMatcher.find()) {
// Here is my first issue. not finding the pattern
}
// My next issue is that my String should end up looking like this:
// "This is an SSN,XXX-XX-4321, and here is another XXX-XX-8765"
The expected result is to find each SSN and replace it. The code above should produce the string "This is an SSN,XXX-XX-4321, and here is another XXX-XX-8765"
You can simplify this, by doing something like the following:
String initial = "This is an SSN,123-31-4321, and here is another 987-65-8765";
String processed = initial.replaceAll("\\d{3}\\-\\d{2}(?=\\-\\d{4})","XXX-XX");
System.out.println(initial);
System.out.println(processed);
Output:
This is an SSN,123-31-4321, and here is another 987-65-8765
This is an SSN,XXX-XX-4321, and here is another XXX-XX-8765
The regex \d{3}\-\d{2}(?=\-\d{4}) matches three digits, a dash, and two digits, but only where a lookahead confirms that a dash and 4 more digits follow. The lookahead consumes nothing, so the final dash and the last four digits are left untouched. Using replaceAll with this regex then creates the desired masking effect.
Edit:
If you also want 9 consecutive digits to be targeted by this replacement, you can do the following:
String initial = "This is an SSN,123-31-4321, and here is another 987658765";
String processed = initial.replaceAll("\\d{3}\\-\\d{2}(?=\\-\\d{4})","XXX-XX")
.replaceAll("\\d{5}(?=\\d{4})","XXXXX");
System.out.println(initial);
System.out.println(processed);
Output:
This is an SSN,123-31-4321, and here is another 987658765
This is an SSN,XXX-XX-4321, and here is another XXXXX8765
The regex \d{5}(?=\d{4}) matches five digits, but only where a lookahead confirms that 4 more digits follow. A second call to replaceAll targets these sequences with the appropriate replacement.
Edit:
Here's a more robust version of the previous regex, and a longer demonstration of how the new regex works:
// A Java string literal cannot span lines, so the text is concatenated.
String initial = "123-45-6789 is a SSN that starts at the beginning of the string, "
+ "and still matches. This is an SSN, 123-31-4321, and here is another 987658765. These "
+ "have 10+ digits, so they don't match: 123-31-43214, and 98765876545. "
+ "This (123-31-4321-blah) has 9 digits, but is followed by a dash, so it doesn't match. "
+ "-123-31-4321 is preceded by a dash, so it doesn't match as well. :123-31-4321 is "
+ "preceded by a non-colon/digit, so it does match. Here's a 4-2-4 non-SSN that would've "
+ "tricked the initial regex: 1234-56-7890. Here's two SSNs in parentheses: (777777777) "
+ "(777-77-7777), and here's four invalid SSNs in parentheses: (7777777778) (777-77-77778) "
+ "(777-778-7777) (7778-77-7777). At the end of the string is a matching SSN: "
+ "998-76-4321";
String processed = initial.replaceAll("(?<=^|[^-\\d])\\d{3}\\-\\d{2}(?=\\-\\d{4}([^-\\d]|$))","XXX-XX")
.replaceAll("(?<=^|[^-\\d])\\d{5}(?=\\d{4}($|\\D))","XXXXX");
System.out.println(initial);
System.out.println(processed);
Output:
123-45-6789 is a SSN that starts at the beginning of the string, and still matches. This is an SSN, 123-31-4321, and here is another 987658765. These have 10+ digits, so they don't match: 123-31-43214, and 98765876545. This (123-31-4321-blah) has 9 digits, but is followed by a dash, so it doesn't match. -123-31-4321 is preceded by a dash, so it doesn't match as well. :123-31-4321 is preceded by a non-colon/digit, so it does match. Here's a 4-2-4 non-SSN that would've tricked the initial regex: 1234-56-7890. Here's two SSNs in parentheses: (777777777) (777-77-7777), and here's four invalid SSNs in parentheses: (7777777778)(777-77-77778) (777-778-7777) (7778-77-7777). At the end of the string is a matching SSN: 998-76-4321
XXX-XX-6789 is a SSN that starts at the beginning of the string, and still matches. This is an SSN, XXX-XX-4321, and here is another XXXXX8765. These have 10+ digits, so they don't match: 123-31-43214, and 98765876545. This (123-31-4321-blah) has 9 digits, but is followed by a dash, so it doesn't match. -123-31-4321 is preceded by a dash, so it doesn't match as well. :XXX-XX-4321 is preceded by a non-colon/digit, so it does match. Here's a 4-2-4 non-SSN that would've tricked the initial regex: 1234-56-7890. Here's two SSNs in parentheses: (XXXXX7777) (XXX-XX-7777), and here's four invalid SSNs in parentheses: (7777777778)(777-77-77778) (777-778-7777) (7778-77-7777). At the end of the string is a matching SSN: XXX-XX-4321
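As an alternative sketch, not taken from the answer above, both formats can be handled in one pass by capturing the optional dash and requiring it to repeat through a backreference. The class name SsnMasker and the helper mask are hypothetical. This variant is slightly stricter than the two chained replaceAll calls: it also rejects a 9-digit run that is followed by a dash.

```java
import java.util.regex.Pattern;

public class SsnMasker {
    // (?<![-\d])        not preceded by a dash or a digit
    // \d{3}(-?)\d{2}\1  first five digits; group 1 is "-" or empty and must repeat
    // (\d{4})           last four digits, kept in group 2
    // (?![-\d])         not followed by a dash or a digit
    private static final Pattern SSN =
            Pattern.compile("(?<![-\\d])\\d{3}(-?)\\d{2}\\1(\\d{4})(?![-\\d])");

    // Masks the first five digits of each SSN, keeping dashes and the last four digits.
    static String mask(String text) {
        return SSN.matcher(text).replaceAll("XXX$1XX$1$2");
    }

    public static void main(String[] args) {
        String initial = "This is an SSN,123-31-4321, and here is another 987658765";
        System.out.println(initial);
        System.out.println(mask(initial));
    }
}
```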

Java regular expression to check if there is at least one letter

I've found a lot of variations on this subject on both SO and the web, but most (if not all) ask for at least one letter and one digit. I need to have at least one letter.
I've tried, but I haven't gotten it right. What I need is for the String to contain only letters, or letters and numbers (in any order); dashes and spaces are allowed, but not at the beginning or the end of the string. Here is how it looks right now:
protected static final String PATTERN = "[\u00C0-\u017Fa-zA-Z0-9']+([- ][\u00C0-\u017Fa-zA-Z0-9']+)*";
public static void main(String[] args) {
String name;
//name = "Street"; // allowed
//name = "Some-Street"; // allowed
//name = "Street "; // not allowed
//name = " Street"; // not allowed
//name = "Street-"; // not allowed
//name = "-Street"; // not allowed
//name = "Street"; // allowed
//name = "1 st street"; // allowed
//name = "street 5"; // allowed
name = "111"; // NOT allowed
if (!Pattern.matches(PATTERN, name)) {
System.out.println("ERROR!");
} else System.out.println("OK!");
}
}
How do I add a check that there is at least one letter?
It doesn't matter whether it is at the beginning or the end, or whether there is a space or dash between it and the numbers. There just has to be at least one letter.
You can use this regex for your problem:
^(?=.*\pL)[\pL\pN]+(?:[ -]+[\pL\pN]+)*$
For Java use:
final String regex = "^(?=.*\\pL)[\\pL\\pN]+(?:[ -]+[\\pL\\pN]+)*$";
RegEx Breakup:
^: Start
(?=.*\pL): Using a lookahead make sure we have at least one unicode letter somewhere
[\pL\pN]+: Match one or more unicode letter or unicode digit
(?:: Non-capturing group start
[ -]+: Match one or more space or hyphen
[\pL\pN]+: Match one or more unicode letter or unicode digit
)*: Non-capturing group end. * means zero or more of this group.
$: End
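The breakup above can be exercised against the sample names from the question. This is a minimal sketch; the class name StreetNameCheck and the helper isValid are hypothetical. Note that, unlike the pattern in the question, this regex does not accept an apostrophe.

```java
import java.util.regex.Pattern;

public class StreetNameCheck {
    // Lookahead requires at least one Unicode letter; the rest allows
    // letter/digit runs separated by one or more spaces or hyphens.
    private static final Pattern NAME =
            Pattern.compile("^(?=.*\\pL)[\\pL\\pN]+(?:[ -]+[\\pL\\pN]+)*$");

    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Street", "Some-Street", "Street ", " Street",
                "Street-", "-Street", "1 st street", "street 5", "111"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + (isValid(s) ? "OK!" : "ERROR!"));
        }
    }
}
```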
If I understand correctly, and according to what you've presented, you have the following conditions:
At least 1 letter
Can contain digits (but only if the previous condition is met)
Dashes and spaces are allowed only if they are not at the beginning or end of the string
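Based on these conditions, the following regex will work (it uses the POSIX bracket class [[:alnum:]], which java.util.regex does not support; in Java, write \p{Alnum} instead):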
^(?![ -]|\d+$)[[:alnum:] -]+(?<![ -])$
This regex works as follows:
Ensure the string doesn't begin with hyphen - or space
Ensure the string isn't composed of only digits
Ensure the string consists of one or more alphanumeric characters, spaces, or hyphens
Ensure the string doesn't end with hyphen - or space
This will give you the following matches
Street
Some-Street
Street
1 st street
street 5
The regex will fail to match the following strings (as per your examples)
Street (with a trailing space)
Street (with a leading space)
Street-
-Street
111
Edit
Negative lookbehinds are unsupported or restricted in some regex flavors (Java does accept a fixed-length lookbehind such as the one above).
Below is an adapted version of my previous regex that uses a negative lookahead instead of a negative lookbehind to ensure that the string doesn't end with a hyphen - or space.
^(?![ -]|\d+$)(?:(?![ -]$)[\pL\pN -])+$
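Both variants can be compared in Java with a short sketch. The class and method names below are hypothetical, and the lookbehind variant is written with \p{Alnum} because Java does not support the POSIX bracket class [[:alnum:]]. One caveat worth noting: neither variant actually requires a letter when digits are separated by a space or hyphen, so an input such as "1 2" is accepted.

```java
import java.util.regex.Pattern;

public class LookaroundVariants {
    // Lookbehind variant, with \p{Alnum} substituted for [[:alnum:]].
    private static final Pattern WITH_LOOKBEHIND =
            Pattern.compile("^(?![ -]|\\d+$)[\\p{Alnum} -]+(?<![ -])$");

    // Lookahead-only variant.
    private static final Pattern LOOKAHEAD_ONLY =
            Pattern.compile("^(?![ -]|\\d+$)(?:(?![ -]$)[\\pL\\pN -])+$");

    static boolean lookbehind(String s) {
        return WITH_LOOKBEHIND.matcher(s).matches();
    }

    static boolean lookaheadOnly(String s) {
        return LOOKAHEAD_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Street", "Some-Street", "Street ", " Street",
                "Street-", "-Street", "1 st street", "111", "1 2"};
        for (String s : samples) {
            System.out.println("\"" + s + "\": lookbehind=" + lookbehind(s)
                    + ", lookaheadOnly=" + lookaheadOnly(s));
        }
    }
}
```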
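The following regex does the job (it is written with POSIX bracket classes; in Java, \p{Alpha} and \p{Alnum} are the equivalents):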
(?=.*[[:alpha:]])[[:alnum:]]{1}[[:alnum:] -]*[[:alnum:]]{1}
(?=.*[[:alpha:]]) guarantees that at least one alphabetic character [A-Za-z] exists in the string.
[[:alnum:]]{1} guarantees that the string starts with an alphanumeric character [A-Za-z0-9].
[[:alnum:] -]* allows alphanumeric characters, spaces and dashes in the middle.
[[:alnum:]]{1} guarantees that the string ends with an alphanumeric character [A-Za-z0-9].
To see it live https://regex101.com/r/V0lesF/1

Regex in java to validate a phone number does not work

I would like to write a regex that validates a phone number, which can be written as follows: 237 698888888 or +237 658888888 or 67883888 ... In fact, the phone number must satisfy the following condition: (+237|237)'Space'(6|2)(5|8|2|3|9|7|6) [0-9] {7}
If the user chooses to enter a number with a prefix, the prefix must be 237 or +237. If the user enters a number without a prefix, it must have 9 digits: the first digit must be 6 or 2, the second digit must be one of 2, 3, 5, 6, 7, 8 or 9, and the remaining 7 digits can be anything, i.e. [0-9] {7}. Here is my Java code:
String regex = "(\\+237|237)\" \"(6|2)(2|3|[5-9])[0-9]{7}";
String sPhoneNumber = "237 278889999";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(sPhoneNumber);
if (matcher.matches()) {
Log.e("|==FILTER_NUM==>>","Phone Number Valid");
}
else
{
Log.e("|==FILTER_NUM==>>","Phone Number must be in the form XXX XXXXXXXX");
}
Returns this
E/|==FILTER_NUM==>>: Phone Number must be in the form XXX XXXXXXXX
Please check my code and tell me what's wrong
excuse me for my English :)
Your regex
(\\+237|237)\" \"(6|2)(2|3|[5-9])[0-9]{7}
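It contains literal quote characters around the space (\" \"), so it only matches input that contains those quotes. Use \\s instead to match a single whitespace character.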
Also you can simplify this
\\+237|237
To
\\+?(237)
The final regex will look like
(\\+?(237))\\s(6|2)(2|3|[5-9])[0-9]{7}
Your regular expression is searching for literal quote marks ("), which is causing the match to fail. Also, since the prefix is optional, you need to indicate this by following the prefix part of the expression with ?.
The following regular expression should match phone numbers in the format you describe, with or without the prefix:
String regex = "(?:\\+?237 )?[26][235-9]\\d{7}";
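A short sketch for checking this, and for seeing why the pattern from the question fails. The class name PhoneCheck and both helper names are hypothetical; the two patterns are copied from the question and from the answer above.

```java
import java.util.regex.Pattern;

public class PhoneCheck {
    // Optional "+237 " or "237 " prefix, then 6 or 2, then one of 2,3,5-9, then 7 digits.
    private static final Pattern PHONE =
            Pattern.compile("(?:\\+?237 )?[26][235-9]\\d{7}");

    // The pattern from the question, which expects literal quote characters.
    private static final Pattern QUESTION_PATTERN =
            Pattern.compile("(\\+237|237)\" \"(6|2)(2|3|[5-9])[0-9]{7}");

    static boolean isValid(String number) {
        return PHONE.matcher(number).matches();
    }

    static boolean matchesQuestionPattern(String number) {
        return QUESTION_PATTERN.matcher(number).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"237 278889999", "+237 658888888", "698888888", "237 618888888"}) {
            System.out.println(s + " -> " + isValid(s));
        }
        // The question's pattern only accepts input that contains the quotes themselves.
        System.out.println(matchesQuestionPattern("237 278889999"));
        System.out.println(matchesQuestionPattern("237\" \"278889999"));
    }
}
```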
(\+237|237) [62][^014][0-9]{7} is the expression. Note that [^014] matches any character other than 0, 1 or 4, including non-digits, so [235-9] is stricter.
For the space, just use a literal space in the pattern; it matches a space.

Replace last word in a string if it is 2 characters long using regex

I am trying to replace the last word of a string if it is 2 characters long using regex. I used [a-zA-Z]{2}$, but it matches the last 2 characters of the string regardless of the word's length. I don't want to replace the last word unless it is exactly 2 characters long. How can I do it?
You need to match a word boundary (\b) before the two letters:
\b[a-zA-Z]{2}$
This will match any two Latin letters that appear at the end of a string, as long as they are not preceded by a 'word' character (which is a Latin letter, digit, or underscore).
In case you want to replace the word even if it is preceded by a digit or underscore, you might want to use a lookbehind assertion, like this:
(?<![a-zA-Z])[a-zA-Z]{2}$
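The two patterns differ only when the last two letters are preceded by a digit or underscore. A minimal sketch, with hypothetical class and method names; Matcher.quoteReplacement is used so that a replacement containing $ or \ is taken literally.

```java
import java.util.regex.Matcher;

public class LastWordReplace {
    // Replaces the last two letters only if a word boundary precedes them.
    static String withWordBoundary(String s, String replacement) {
        return s.replaceAll("\\b[a-zA-Z]{2}$", Matcher.quoteReplacement(replacement));
    }

    // Replaces the last two letters only if they are not preceded by another letter.
    static String withLookbehind(String s, String replacement) {
        return s.replaceAll("(?<![a-zA-Z])[a-zA-Z]{2}$", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"go to", "hello too", "ab", "item_ab"}) {
            System.out.println(s + " | " + withWordBoundary(s, "XX")
                    + " | " + withLookbehind(s, "XX"));
        }
    }
}
```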
\\b\\w\\w\\b$ (written as a Java string literal) should work as well.
Edit: in fact \\b\\w\\w$ should be enough (or \b\w\w$ outside a Java string literal).
You could also use:
[^\p{Alpha}]\p{Alpha}{2}$
Use Alnum instead if digits count as words. This does, however, fail if the entire string is only two characters long.
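The sketch below illustrates that limitation and one possible adjustment. The adjustment is an assumption on my part, not part of the answer above: it captures either the start of the string or the preceding non-letter in a group and puts it back in the replacement. The class and method names are hypothetical. Note that the pattern as given also consumes the character before the word, so that character is replaced along with the two letters.

```java
import java.util.regex.Matcher;

public class LastWordAlpha {
    // Pattern as given: needs a non-letter before the word and consumes it.
    static String original(String s, String replacement) {
        return s.replaceAll("[^\\p{Alpha}]\\p{Alpha}{2}$",
                Matcher.quoteReplacement(replacement));
    }

    // Adjusted sketch: group 1 is the start of input or the preceding
    // non-letter, and is re-inserted ahead of the replacement.
    static String adjusted(String s, String replacement) {
        return s.replaceAll("(^|[^\\p{Alpha}])\\p{Alpha}{2}$",
                "$1" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"go to", "to", "hello too"}) {
            System.out.println(s + " | " + original(s, "XX") + " | " + adjusted(s, "XX"));
        }
    }
}
```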
