Utopian Identification Number - REGEX pattern matching - java

I'm trying to validate a Utopian ID number using the Java regex classes, i.e. Pattern and Matcher.
The following conditions must be satisfied:
The string must begin with between 0 and 3 (inclusive) lowercase letters.
Immediately following the letters, there must be a sequence of digits (0-9). The length of this segment must be between 2 and 8, both inclusive.
Immediately following the digits, there must be at least 3 uppercase letters.
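Following is the code I've written:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Solution {
public static void main(String[] args) {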
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
int ntc;
String[] str;
try {
ntc = Integer.parseInt(br.readLine());
str = new String[ntc];
for (int i = 0; i < ntc; i++)
str[i] = br.readLine();
for (int i = 0; i < ntc; i++)
if (validate(str[i]))
System.out.println("VALID");
else
System.out.println("INVALID");
} catch (Exception e) {
e.printStackTrace();
}
}
private static boolean validate(String str) {
Pattern pr = Pattern.compile("[a-z]{0,3}[0-9]{2,8}[A-Z]{3,}");
Matcher mr = pr.matcher(str);
return mr.find();
}
}
The following is the input and the corresponding output:
I/P:
3
n761512618TUKEFQROSWNFWFWEQEXKPWYYCRK
rRf99
198VLHJIYVEBODQCQEGYGECOGRMQPE
O/P:
VALID
INVALID
VALID
The first test case should be INVALID, since it has nine digits instead of the maximum of eight. However, the program reports VALID.
Is there anything wrong with the regex pattern I've written?

Use start and end anchors in your regex in order to match the entire string.
Pattern pr = Pattern.compile("^[a-z]{0,3}[0-9]{2,8}[A-Z]{3,}$");
Without anchors, find() would also accept a match that starts in the middle of the string.

Use Matcher.matches() rather than Matcher.find(), in order to match the regexp against the entire string:
private static boolean validate(String str) {
Pattern pr = Pattern.compile("[a-z]{0,3}[0-9]{2,8}[A-Z]{3,}");
Matcher mr = pr.matcher(str);
return mr.matches();
}
Also, since the pattern never changes, I would move it into a constant so that it won't be recompiled every time the method is called:
static final Pattern UTOPIAN_ID_PATTERN =
Pattern.compile("[a-z]{0,3}[0-9]{2,8}[A-Z]{3,}");
private static boolean validate(final String str) {
Matcher mr = UTOPIAN_ID_PATTERN.matcher(str);
return mr.matches();
}
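To make the difference concrete, here is a small sketch that runs the question's test cases through both calls. The class and method names (FindVersusMatches, foundAnywhere, matchesWhole) are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Pattern;

final class FindVersusMatches {
    // Same pattern as in the question, compiled once.
    static final Pattern UTOPIAN_ID =
            Pattern.compile("[a-z]{0,3}[0-9]{2,8}[A-Z]{3,}");

    // find() succeeds if any substring fits the pattern.
    static boolean foundAnywhere(String s) {
        return UTOPIAN_ID.matcher(s).find();
    }

    // matches() succeeds only if the whole string fits the pattern.
    static boolean matchesWhole(String s) {
        return UTOPIAN_ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String nineDigits = "n761512618TUKEFQROSWNFWFWEQEXKPWYYCRK";
        System.out.println(foundAnywhere(nineDigits)); // true
        System.out.println(matchesWhole(nineDigits));  // false
    }
}
```

find() reports a hit here because the substring that starts at the second digit (61512618TUK...) has eight digits followed by uppercase letters.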

Related

Is there a regex where if first expression is valid then check for next [duplicate]

I have several strings in the rough form:
[some text] [some number] [some more text]
I want to extract the text in [some number] using the Java Regex classes.
I know roughly what regular expression I want to use (though all suggestions are welcome). What I'm really interested in are the Java calls to take the regex string and use it on the source data to produce the value of [some number].
EDIT: I should add that I'm only interested in a single [some number] (basically, the first instance). The source strings are short and I'm not going to be looking for multiple occurrences of [some number].
Full example:
private static final Pattern p = Pattern.compile("^([a-zA-Z]+)([0-9]+)(.*)");
public static void main(String[] args) {
// create matcher for pattern p and given string
Matcher m = p.matcher("Testing123Testing");
// if an occurrence of the pattern was found in the given string...
if (m.find()) {
// ...then you can use group() methods.
System.out.println(m.group(0)); // whole matched expression
System.out.println(m.group(1)); // first expression from round brackets (Testing)
System.out.println(m.group(2)); // second one (123)
System.out.println(m.group(3)); // third one (Testing)
}
}
Since you're looking for the first number, you can use such regexp:
^\D+(\d+).*
and m.group(1) will return you the first number. Note that signed numbers can contain a minus sign:
^\D+(-?\d+).*
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Regex1 {
public static void main(String[]args) {
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("hello1234goodboy789very2345");
while(m.find()) {
System.out.println(m.group());
}
}
}
Output:
1234
789
2345
Allain basically has the java code, so you can use that. However, his expression only matches if your numbers are only preceded by a stream of word characters.
"(\\d+)"
should be able to find the first string of digits. You don't need to specify what's before it, if you're sure that it's going to be the first string of digits. Likewise, there is no use to specify what's after it, unless you want that. If you just want the number, and are sure that it will be the first string of one or more digits then that's all you need.
If you expect the number to be set off by spaces, it is more precise to specify
"\\s+(\\d+)\\s+"
If you need all three parts, this will do:
"(\\D+)(\\d+)(.*)"
EDIT The Expressions given by Allain and Jack suggest that you need to specify some subset of non-digits in order to capture digits. If you tell the regex engine you're looking for \d then it's going to ignore everything before the digits. If J or A's expression fits your pattern, then the whole match equals the input string. And there's no reason to specify it. It probably slows a clean match down, if it isn't totally ignored.
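As a minimal sketch of that point, the helper below searches for the bare pattern \d+ and returns the first run of digits. The names FirstNumber and firstNumber are hypothetical, and the sketch assumes the digits fit in an int.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FirstNumber {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the input, or empty if there is none.
    // Assumes the run fits in an int; Integer.parseInt throws otherwise.
    static OptionalInt firstNumber(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group()))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("Testing123Testing456")); // OptionalInt[123]
        System.out.println(firstNumber("no digits here"));       // OptionalInt.empty
    }
}
```

Nothing before or after the digits is described in the pattern; find() simply skips ahead to the first place where \d+ can match.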
In addition to Pattern, the Java String class also has several methods that can work with regular expressions, in your case the code will be:
"ab123abc".replaceFirst("\\D*(\\d*).*", "$1")
where \\D is a non-digit character.
In Java 1.4 and up:
String input = "...";
Matcher matcher = Pattern.compile("[^0-9]+([0-9]+)[^0-9]+").matcher(input);
if (matcher.find()) {
String someNumberStr = matcher.group(1);
// if you need this to be an int:
int someNumberInt = Integer.parseInt(someNumberStr);
}
This function collects all matching sequences from a string. In this example it extracts all email addresses.
static final String EMAIL_PATTERN = "[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@"
        + "[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})";
// Returns every address found in the message, or null if there is none
public List<String> getAllEmails(String message) {
    List<String> result = null;
    Matcher matcher = Pattern.compile(EMAIL_PATTERN).matcher(message);
    if (matcher.find()) {
        result = new ArrayList<String>();
        result.add(matcher.group());
        while (matcher.find()) {
            result.add(matcher.group());
        }
    }
    return result;
}
For message = "adf@gmail.com, <another@osiem.osiem>>>> lalala@aaa.pl" it will create a List of 3 elements.
Try something like this (the leading .+? is reluctant, so it does not swallow the first digits of the number):
Pattern p = Pattern.compile("^.+?(\\d+).+");
Matcher m = p.matcher("Testing123Testing");
if (m.find()) {
    System.out.println(m.group(1)); // 123
}
Simple Solution
// Regexplanation:
// ^ beginning of line
// \\D+ 1+ non-digit characters
// (\\d+) 1+ digit characters in a capture group
// .* 0+ any character
String regexStr = "^\\D+(\\d+).*";
// Compile the regex String into a Pattern
Pattern p = Pattern.compile(regexStr);
// Create a matcher with the input String
Matcher m = p.matcher(inputStr);
// If we find a match
if (m.find()) {
// Get the String from the first capture group
String someDigits = m.group(1);
// ...do something with someDigits
}
Solution in a Util Class
public class MyUtil {
private static Pattern pattern = Pattern.compile("^\\D+(\\d+).*");
private static Matcher matcher = pattern.matcher("");
// Assumptions: inputStr is a non-null String
public static String extractFirstNumber(String inputStr){
// Reset the matcher with a new input String
matcher.reset(inputStr);
// Check if there's a match
if(matcher.find()){
// Return the number (in the first capture group)
return matcher.group(1);
}else{
// Return some default value, if there is no match
return null;
}
}
}
...
// Use the util function and print out the result
String firstNum = MyUtil.extractFirstNumber("Testing4234Things");
System.out.println(firstNum);
You can also do it using StringTokenizer:
String str = "as:"+123+"as:"+234+"as:"+345;
StringTokenizer st = new StringTokenizer(str,"as:");
while(st.hasMoreTokens())
{
String k = st.nextToken(); // you will get first numeric data i.e 123
int kk = Integer.parseInt(k);
System.out.println("k string token in integer " + kk);
String k1 = st.nextToken(); // you will get second numeric data i.e 234
int kk1 = Integer.parseInt(k1);
System.out.println("new string k1 token in integer :" + kk1);
String k2 = st.nextToken(); // you will get third numeric data i.e 345
int kk2 = Integer.parseInt(k2);
System.out.println("k2 string token is in integer : " + kk2);
}
Since we are taking these numeric data into three different variables we can use this data anywhere in the code (for further use)
How about [^\\d]*([0-9]+[\\s]*[.,]{0,1}[\\s]*[0-9]*).* ? I think it would handle numbers with a fractional part.
I allowed white space and included , as a possible decimal separator.
I'm trying to get numbers, including floats, out of a string, taking into account that the user might mistakenly type white space inside the number.
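A sketch of how such a lenient pattern could be turned into an actual double. It is not the expression above verbatim: it uses a slightly stricter variant in which the fractional part is an optional group, and the names LenientDecimal and firstDecimal are hypothetical.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LenientDecimal {
    // Integer part, then optionally a '.' or ',' (possibly surrounded by
    // white space) and a fractional part.
    private static final Pattern NUMBER =
            Pattern.compile("(\\d+)(?:\\s*[.,]\\s*(\\d+))?");

    // Returns the first number in the input, or empty if there is none.
    static OptionalDouble firstDecimal(String input) {
        Matcher m = NUMBER.matcher(input);
        if (!m.find()) {
            return OptionalDouble.empty();
        }
        String fraction = m.group(2); // null when there is no fractional part
        String normalized = fraction == null
                ? m.group(1)
                : m.group(1) + "." + fraction;
        return OptionalDouble.of(Double.parseDouble(normalized));
    }

    public static void main(String[] args) {
        System.out.println(firstDecimal("price: 12 , 5 EUR").getAsDouble()); // 12.5
        System.out.println(firstDecimal("abc 7 def").getAsDouble());         // 7.0
    }
}
```

One consequence of treating , as a decimal separator is that a list such as "1, 2" is read as 1.2, so this only suits input where a comma cannot separate two numbers.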
Sometimes you can use simple .split("REGEXP") method available in java.lang.String. For example:
String input = "first,second,third";
//To retrieve 'first'
input.split(",")[0]
//second
input.split(",")[1]
//third
input.split(",")[2]
If you are reading from a file, this may help:
try {
    InputStream inputStream = (InputStream) mnpMainBean.getUploadedBulk().getInputStream();
    BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
    String line;
    //Ref:03
    while ((line = br.readLine()) != null) {
        if (line.matches("[A-Z],\\d,(\\d*,){2}(\\s*\\d*\\|\\d*:)+")) {
            String[] splitRecord = line.split(",");
            //do something
        }
        else {
            br.close();
            //error
            return;
        }
    }
    br.close();
}
catch (IOException ioException) {
    logger.logDebug("Exception " + ioException);
}
Pattern p = Pattern.compile("(\\D+)(\\d+)(.*)");
Matcher m = p.matcher("this is your number:1234 thank you");
if (m.find()) {
String someNumberStr = m.group(2);
int someNumberInt = Integer.parseInt(someNumberStr);
}

Reject String If Contains Any Non-Alpha Numeric Character

I am writing a program and want the program to not loop and request another search pattern if the search pattern (word) contains any non alpha numeric characters.
I have setup a Boolean word to false and an if statement to change the Boolean to true if the word contains letters or numbers. Then another if statement to allow the program to execute if the Boolean is true.
My logic must be off because it is still executing through the search pattern if I simply enter "/". The search pattern cannot contain any non alpha numeric characters to include spaces. I am trying to use Regex to solve this problem.
Sample problematic output:
Please enter a search pattern: /
Line number 1
this.a 2test/austin
^
Line number 8
ra charity Charityis 4 times a day/a.a-A
^
Here is my applicable code:
while (again) {
boolean found = false;
System.out.printf("%n%s", "Please enter a search pattern: ", "%n");
String wordToSearch = input.next();
if (wordToSearch.equals("EINPUT")) {
System.out.printf("%s", "Bye!");
System.exit(0);
}
Pattern p = Pattern.compile("\\W*");
Matcher m = p.matcher(wordToSearch);
if (m.find())
found = true;
String data;
int lineCount = 1;
if (found = true) {
try (FileInputStream fis =
new FileInputStream(this.inputPath.getPath())) {
File file1 = this.inputPath;
byte[] buffer2 = new byte[fis.available()];
fis.read(buffer2);
data = new String(buffer2);
Scanner in = new Scanner(data).useDelimiter("\\\\|[^a-zA-z0-9]+");
while (in.hasNextLine()) {
String line = in.nextLine();
Pattern pattern = Pattern.compile("\\b" + wordToSearch + "\\b");
Matcher matcher = pattern.matcher(line);
if (matcher.find()) {
System.out.println("Line number " + lineCount);
String stringToFile = f.findWords(line, wordToSearch);
System.out.println();
}
lineCount++;
}
}
}
}
Stop reinventing the wheel.
Read about Apache StringUtils and focus on isAlpha, isAlphanumeric, and isAlphanumericSpace.
One of those is likely to provide the functionality you want.
Create a separate method that checks the String you are searching for:
public boolean isAlphanumeric(String str)
{
char[] charArray = str.toCharArray();
for(char c:charArray)
{
if (!Character.isLetterOrDigit(c))
return false;
}
return true;
}
Then, add the following if statement to the above code prior to the second try statement.
if (isAlphanumeric(wordToSearch) == true)
Well since no one posted REGEX one, here you go:
package com.company;
public class Main {
public static void main(String[] args) {
String x = "ABCDEF123456";
String y = "ABC$DEF123456";
isValid(x);
isValid(y);
}
public static void isValid(String s){
if (s.matches("[A-Za-z0-9]*"))
System.out.println("String doesn't contain non alphanumeric characters !");
else
System.out.println("Invalid characters in string !");
}
}
Right now, the search runs when the search pattern contains non-alphanumeric characters, because found is set to true when such characters are detected:
if(m.find())
found = true;
What it should be:
if(!m.find())
found = true;
It should be checking for the absence of non-alphanumeric characters.
Also, the boolean flag can be simplified to:
boolean found = !m.find();
You don't need the if statement. Two further problems remain in the question's code: the pattern "\\W*" can match the empty string, so find() succeeds for every input (use "\\W" or "[^A-Za-z0-9]" instead), and if (found = true) assigns rather than compares (write if (found)).
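A minimal sketch of the check with these points applied. SearchWordCheck and isSearchable are hypothetical names. The sketch uses a character class rather than "\\W*", since a pattern that can match the empty string makes find() succeed for every input.

```java
import java.util.regex.Pattern;

final class SearchWordCheck {
    // Any single character that is not an ASCII letter or digit.
    private static final Pattern NON_ALPHANUMERIC =
            Pattern.compile("[^A-Za-z0-9]");

    // True when the word is non-empty and contains only letters and digits.
    static boolean isSearchable(String word) {
        return !word.isEmpty() && !NON_ALPHANUMERIC.matcher(word).find();
    }

    public static void main(String[] args) {
        System.out.println(isSearchable("/"));      // false
        System.out.println(isSearchable("abc123")); // true
    }
}
```

In the question's loop this would replace the found flag: run the search only when isSearchable(wordToSearch) is true, and otherwise prompt again.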

How to get a substring of a certain character followed by a number?

In Java, how would I get a substring of a certain character followed by a number?
The string looks like this:
To be, or not to be. (That is the question.) (243)
I want the substring up until the (243), where the number inside the parenthesis is always changing every time I call.
Use a regular expression:
newstr = str.replaceFirst("\\(\\d+\\)", "");
This finds a substring that begins with (, continues with one or more digits, and ends with ), and replaces it with the empty string, "".
Reference: java.lang.String.replaceFirst()
You could match it with a regex, and get the index of the regex. Then use that to get the index in the string.
An example of that is Can Java String.indexOf() handle a regular expression as a parameter?
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
if(matcher.find()){
System.out.println(matcher.start());//this will give you index
}
You can use String.replaceAll():
String s = "To be, or not to be. (That is the question.) (243)";
String newString = s.replaceAll("\\(\\d+\\).*", "");
I think you can actually just do something like:
mystring.substring(0, mystring.lastIndexOf("("))
assuming that the last thing on the line will be the number in parentheses.
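If that assumption might not always hold, one option is to anchor a regex at the end of the string so that only a trailing "(digits)" group is removed. This is a sketch; TrailingCounter and stripTrailingNumber are hypothetical names.

```java
final class TrailingCounter {
    // Removes a "(digits)" group at the very end of the string, together
    // with the white space around it. Other parentheses are left alone.
    static String stripTrailingNumber(String s) {
        return s.replaceFirst("\\s*\\(\\d+\\)\\s*$", "");
    }

    public static void main(String[] args) {
        String s = "To be, or not to be. (That is the question.) (243)";
        System.out.println(stripTrailingNumber(s));
        // To be, or not to be. (That is the question.)
    }
}
```

A string without a trailing number is returned unchanged, whereas substring with lastIndexOf would throw when there is no "(" at all.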
You could use a for loop and add the characters before the number to a separate string:
public class Subsentence {
    public static void main(String[] args) {
        String sentence = "To be, or not to be. (That is the question.) (243)";
        System.out.println(getSubsentence(sentence));
    }
    // Copies characters until it reaches a "(" that is immediately followed by a digit
    public static String getSubsentence(String sentence) {
        StringBuilder subSentence = new StringBuilder();
        for (int i = 0; i < sentence.length(); i++) {
            char c = sentence.charAt(i);
            boolean digitFollows = i + 1 < sentence.length()
                    && Character.isDigit(sentence.charAt(i + 1));
            if (c == '(' && digitFollows) {
                return subSentence.toString();
            }
            subSentence.append(c);
        }
        return subSentence.toString();
    }
}
Using a regex, this can be solved as follows (the pattern assumes the text before the number contains no digits):
public class RegExParser {
public String getTextPart(String s) {
String pattern = "^(\\D+)(\\s\\(\\d+\\))$";
String part = s.replaceAll(pattern, "$1");
return part;
}
}
It is simple and should perform well for short strings.

How to determine where a regex failed to match using Java APIs

I have tests where I validate the output with a regex. When it fails it reports that output X did not match regex Y.
I would like to add some indication of where in the string the match failed. E.g. what is the farthest the matcher got in the string before backtracking. Matcher.hitEnd() is one case of what I'm looking for, but I want something more general.
Is this possible to do?
If a match fails, then Matcher.hitEnd() tells you whether a longer string could have matched. In addition, you can specify a region in the input sequence that will be searched to find a match. So if you have a string that cannot be matched, you can test its prefixes to see where the match fails:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class LastMatch {
private static int indexOfLastMatch(Pattern pattern, String input) {
Matcher matcher = pattern.matcher(input);
for (int i = input.length(); i > 0; --i) {
Matcher region = matcher.region(0, i);
if (region.matches() || region.hitEnd()) {
return i;
}
}
return 0;
}
public static void main(String[] args) {
Pattern pattern = Pattern.compile("[A-Z]+[0-9]+[a-z]+");
String[] samples = {
"*ABC",
"A1b*",
"AB12uv",
"AB12uv*",
"ABCDabc",
"ABC123X"
};
for (String sample : samples) {
int lastMatch = indexOfLastMatch(pattern, sample);
System.out.println(sample + ": last match at " + lastMatch);
}
}
}
The output of this class is:
*ABC: last match at 0
A1b*: last match at 3
AB12uv: last match at 6
AB12uv*: last match at 6
ABCDabc: last match at 4
ABC123X: last match at 6
You can take the string, and iterate over it, removing one more char from its end at every iteration, and then check for hitEnd():
int farthestPoint(Pattern pattern, String input) {
for (int i = input.length() - 1; i > 0; i--) {
Matcher matcher = pattern.matcher(input.substring(0, i));
if (!matcher.matches() && matcher.hitEnd()) {
return i;
}
}
return 0;
}
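Building on the region-based approach above, a test failure message could point at the position with a caret. This is only a sketch: FailureMessage, matchedPrefixLength and describe are hypothetical names, and the index is the heuristic from the answers above, not something the regex engine reports directly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FailureMessage {
    // Length of the longest prefix that either matches or could still be
    // extended into a match.
    static int matchedPrefixLength(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        for (int end = input.length(); end > 0; end--) {
            matcher.region(0, end);
            if (matcher.matches() || matcher.hitEnd()) {
                return end;
            }
        }
        return 0;
    }

    // Three lines: the input, a caret under the failing index, and a note.
    static String describe(Pattern pattern, String input) {
        int index = matchedPrefixLength(pattern, input);
        StringBuilder caretLine = new StringBuilder();
        for (int i = 0; i < index; i++) {
            caretLine.append(' ');
        }
        caretLine.append('^');
        return input + "\n" + caretLine + "\n" + "no match from index " + index;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("[A-Z]+[0-9]+[a-z]+");
        System.out.println(describe(pattern, "A1b*"));
    }
}
```

For "A1b*" the caret lands under the *, the first character that cannot be part of a match.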
You could use a pair of replaceAll() calls to indicate the positive and negative matches of the input string. Let's say, for example, you want to validate a hex string; the following will indicate the valid and invalid characters of the input string.
String regex = "[0-9A-F]";
String input = "J900ZZAAFZ99X";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
String mask = m.replaceAll("+").replaceAll("[^+]", "-");
System.out.println(input);
System.out.println(mask);
This would print the following, with a + under valid characters and a - under invalid characters.
J900ZZAAFZ99X
-+++--+++-++-
If you want to do it outside of the code, I use Rubular to test regular expressions before putting them in the code.

Check if a String contains a special character

How do you check if a String contains a special character like:
[,],{,},{,),*,|,:,>,
Pattern p = Pattern.compile("[^a-z0-9 ]", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("I am a string");
boolean b = m.find();
if (b)
System.out.println("There is a special character in my string");
If you want to require letters, digits, and special characters in a password of at least 8 characters, you can use this code:
public static boolean Password_Validation(String password)
{
    if (password.length() >= 8)
    {
        Pattern letter = Pattern.compile("[a-zA-Z]");
        Pattern digit = Pattern.compile("[0-9]");
        Pattern special = Pattern.compile("[!@#$%&*()_+=|<>?{}\\[\\]~-]");
        //Pattern eight = Pattern.compile (".{8}");
        Matcher hasLetter = letter.matcher(password);
        Matcher hasDigit = digit.matcher(password);
        Matcher hasSpecial = special.matcher(password);
        return hasLetter.find() && hasDigit.find() && hasSpecial.find();
    }
    else
        return false;
}
You can use the following code to count the special characters in a string.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class DetectSpecial {
    public int getSpecialCharacterCount(String s) {
        if (s == null || s.trim().isEmpty()) {
            System.out.println("Incorrect format of string");
            return 0;
        }
        Pattern p = Pattern.compile("[^A-Za-z0-9]");
        Matcher m = p.matcher(s);
        // Each successful find() is one special character
        int count = 0;
        while (m.find()) {
            count++;
        }
        if (count > 0)
            System.out.println("There is a special character in my string");
        else
            System.out.println("There is no special char.");
        return count;
    }
}
If it matches the regex [a-zA-Z0-9 ]* then there are no special characters in it.
What exactly do you call a "special character"? If you mean something like "anything that is not alphanumeric", you can use the org.apache.commons.lang.StringUtils class (methods isAlpha/isNumeric/isWhitespace/isAsciiPrintable).
If it is not so trivial, you can use a regex that defines the exact character list you accept and match the string against it.
This was tested on Android 7.0 through Android 10.0, and it works.
Use this code to check whether a string contains special characters or numbers:
name = firstname.getText().toString(); //name is the variable that holds the string value
Pattern special= Pattern.compile("[^a-z0-9 ]", Pattern.CASE_INSENSITIVE);
Pattern number = Pattern.compile("[0-9]", Pattern.CASE_INSENSITIVE);
Matcher matcher = special.matcher(name);
Matcher matcherNumber = number.matcher(name);
boolean containsSymbols = matcher.find();
boolean containsNumber = matcherNumber.find();
if(containsSymbols){
//string contains special symbol/character
}
else if(containsNumber){
//string contains numbers
}
else{
//string doesn't contain special characters or numbers
}
It all depends on exactly what you mean by "special". In a regex you can specify
\W to mean a non-word character (anything other than a letter, digit, or underscore)
\p{Punct} to mean punctuation characters
I suspect that the latter is what you mean. If not, use a [] list to specify exactly what you want.
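The two classes disagree on a few characters, which a short sketch can show. SpecialCharacterClasses, hasNonWord and hasPunctuation are hypothetical names; the patterns use Java's default (ASCII) meaning of \W and \p{Punct}.

```java
import java.util.regex.Pattern;

final class SpecialCharacterClasses {
    private static final Pattern NON_WORD = Pattern.compile("\\W");
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");

    // True if the string contains a character outside [a-zA-Z_0-9].
    static boolean hasNonWord(String s) {
        return NON_WORD.matcher(s).find();
    }

    // True if the string contains an ASCII punctuation character.
    static boolean hasPunctuation(String s) {
        return PUNCT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasNonWord("a b"));     // true: a space is not a word character
        System.out.println(hasPunctuation("a b")); // false: a space is not punctuation
        System.out.println(hasNonWord("a_b"));     // false: underscore is a word character
        System.out.println(hasPunctuation("a_b")); // true: underscore is punctuation
    }
}
```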
Have a look at the java.lang.Character class. It has some test methods and you may find one that fits your needs.
Examples: Character.isSpaceChar(c) or !Character.isJavaLetter(c) (isJavaLetter is deprecated; Character.isJavaIdentifierStart(c) replaces it)
This worked for me:
String s = "string";
if (Pattern.matches("[a-zA-Z]+", s)) {
System.out.println("clear");
} else {
System.out.println("buzz");
}
First you have to exhaustively identify the special characters that you want to check.
Then you can write a regular expression and use
public boolean matches(String regex)
//without using regular expression........
String specialCharacters=" !#$%&'()*+,-./:;<=>?@[]^_`{|}~0123456789";
String name="3_ saroj@";
String str2[]=name.split("");
for (int i=0;i<str2.length;i++)
{
if (specialCharacters.contains(str2[i]))
{
System.out.println("true");
//break;
}
else
System.out.println("false");
}
Pattern p = Pattern.compile("\\p{Punct}");
Matcher m = p.matcher("Afsff%esfsf098");
// find() looks for a punctuation character anywhere in the string;
// matches() would require the whole string to fit the pattern
boolean b = m.find();
if (b)
    System.out.println("There is a sp. character in my string");
else
    System.out.println("There is no sp. char.");
// This is an updated version of the code that I posted.
/*
The isValidName method checks that the name passed as argument does not contain:
1. a null value or a space
2. any special character
3. digits (0-9)
Explanation:
str2 is a String array that stores the split characters of the name passed as argument.
The count variable counts the number of special characters that occur.
The method returns true only if all the conditions are satisfied.
*/
public boolean isValidName(String name)
{
    // Check for null before calling any method on name
    if (name == null)
    {
        return false;
    }
    String specialCharacters=" !#$%&'()*+,-./:;<=>?@[]^_`{|}~0123456789";
    String str2[]=name.split("");
    int count=0;
    for (int i=0;i<str2.length;i++)
    {
        if (specialCharacters.contains(str2[i]))
        {
            count++;
        }
    }
    return count==0;
}
Visit each character in the string to see if that character is in a blacklist of special characters; this is O(n*m).
The pseudo-code is:
for each char in string:
if char in blacklist:
...
The complexity can be slightly improved by sorting the blacklist so that you can early-exit each check. However, the string find function is probably native code, so this optimisation - which would be in Java byte-code - could well be slower.
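The pseudo-code above could look like this in Java. It is a sketch with hypothetical names (BlacklistCheck, containsBlacklisted); the blacklist is passed as a plain String and searched with String.indexOf.

```java
final class BlacklistCheck {
    // For each character of the input, search the blacklist:
    // O(n*m) comparisons in the worst case.
    static boolean containsBlacklisted(String input, String blacklist) {
        for (int i = 0; i < input.length(); i++) {
            if (blacklist.indexOf(input.charAt(i)) >= 0) {
                return true; // stop at the first hit
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String blacklist = "[]{}()*|:>";
        System.out.println(containsBlacklisted("plain text", blacklist)); // false
        System.out.println(containsBlacklisted("a|b", blacklist));        // true
    }
}
```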
The line String str2[]=name.split(""); gives an extra element in the array on Java 7 and earlier.
For example, "Aditya".split("") returns [, A, d, i, t, y, a] there, with an empty string as the first element (Java 8 and later no longer produce it). Because contains("") is always true, that empty element is counted as a special character.
I have modified the code as shown below so that it works as expected:
public static boolean isValidName(String inputString) {
    // Check for null before calling any method on inputString
    if (inputString == null) {
        return false;
    }
    String specialCharacters = " !#$%&'()*+,-./:;<=>?@[]^_`{|}~0123456789";
    String[] strlCharactersArray = new String[inputString.length()];
    for (int i = 0; i < inputString.length(); i++) {
        strlCharactersArray[i] = Character
                .toString(inputString.charAt(i));
    }
    // for "Aditya", strlCharactersArray is now [A, d, i, t, y, a]
    int count = 0;
    for (int i = 0; i < strlCharactersArray.length; i++) {
        if (specialCharacters.contains(strlCharactersArray[i])) {
            count++;
        }
    }
    return count == 0;
}
Convert the string into char array with all the letters in lower case:
char c[] = str.toLowerCase().toCharArray();
Then you can use Character.isLetterOrDigit(c[index]) to find out which index has special characters.
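A short sketch of that loop, returning the first offending index. SpecialCharIndex and firstSpecialIndex are hypothetical names. Lowercasing is skipped here because Character.isLetterOrDigit accepts both cases (and also non-ASCII letters and digits).

```java
final class SpecialCharIndex {
    // Index of the first character that is neither a letter nor a digit,
    // or -1 if there is none.
    static int firstSpecialIndex(String str) {
        char[] c = str.toCharArray();
        for (int index = 0; index < c.length; index++) {
            if (!Character.isLetterOrDigit(c[index])) {
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstSpecialIndex("abc$1"));  // 3
        System.out.println(firstSpecialIndex("abc123")); // -1
    }
}
```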
Use the static method matches(regex, input) of the java.util.regex.Pattern class.
regex: letters in lower and upper case and the digits 0-9.
input: the String you want to check for special characters.
It returns true if the string contains only letters and digits, and false otherwise.
Example:
String isin = "12GBIU34RT12";
if (Pattern.matches("[a-zA-Z0-9]+", isin)) {
    System.out.println("Valid isin");
} else {
    System.out.println("Invalid isin");
}
