I have never done regex before, and I have seen they are very useful for working with strings. I saw a few tutorials (for example) but I still cannot understand how to make a simple Java regex check for hexadecimal characters in a string.
The user will type something like 0123456789ABCDEF into the text box, and I would like to confirm that the input is valid; for something like XTYSPG456789ABCDEF the check should return false.
Is it possible to do that with a regex or did I misunderstand how they work?
Yes, you can do that with a regular expression:
^[0-9A-F]+$
Explanation:
^ Start of line.
[0-9A-F] Character class: Any character in 0 to 9, or in A to F.
+ Quantifier: One or more of the above.
$ End of line.
To use this regular expression in Java you can for example call the matches method on a String:
boolean isHex = s.matches("[0-9A-F]+");
Note that matches succeeds only if the entire string matches, so you don't need the start and end anchors in this case. See it working online: ideone
You may also want to allow both upper and lowercase A-F, in which case you can use this regular expression:
^[0-9A-Fa-f]+$
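As a minimal sketch of the check described above: the class name HexCheck and the method isHex are hypothetical names chosen for illustration, not part of the original answer. It relies on String.matches testing the whole string, so the anchors are left out.

```java
class HexCheck {
    // Hypothetical helper: true if s is non-null and consists only of hex digits (either case).
    static boolean isHex(String s) {
        return s != null && s.matches("[0-9A-Fa-f]+");
    }

    public static void main(String[] args) {
        System.out.println(isHex("0123456789ABCDEF"));   // valid input from the question
        System.out.println(isHex("XTYSPG456789ABCDEF")); // invalid input from the question
        System.out.println(isHex(""));                   // empty: + needs at least one digit
    }
}
```

Whether an empty string or null should count as valid is a design choice; this sketch rejects both.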
Maybe you want to use the POSIX character class \p{XDigit} instead:
^\p{XDigit}+$
Additionally, if you plan to use the regular expression often, it is recommended to compile it once into a constant to avoid recompiling it each time, e.g.:
private static final Pattern REGEX_PATTERN =
Pattern.compile("^\\p{XDigit}+$");
public static void main(String[] args) {
String input = "0123456789ABCDEF";
System.out.println(
REGEX_PATTERN.matcher(input).matches()
); // prints "true"
}
Actually, the given answer is not totally correct. The problem arises because the digits 0-9 are also valid decimal digits, so a string made only of them does not tell you whether it is meant as hex. PART of what you have to do is to require the digits to come in pairs (00-FF instead of just 0-F), which at least rules out odd-length strings. Like so:
^([0-9A-Fa-f]{2})+$
That is, the digits have to come in pairs! Otherwise the string is something else! :-)
Example:
(Pick one)
var a = "1e5";
var a = "10";
var a = "314159265";
With the regular expression from the accepted answer, all three of these would return TRUE.
var re1 = new RegExp( /^[0-9A-Fa-f]+$/ );
var re2 = new RegExp( /^([0-9A-Fa-f]{2})+$/ );
if( re1.test(a) ){ alert("#1 = This is a hex value!"); }
if( re2.test(a) ){ alert("#2 = This IS a hex string!"); }
else { alert("#2 = This is NOT a hex string!"); }
Note that "10" returns TRUE in both cases. If an incoming string contains only 0-9, you can NOT easily tell whether it is a hex value or a decimal value UNLESS it has an odd length, which means the leading zero is missing (hex-encoded bytes always come in pairs, i.e. high nibble/low nibble). But values like "34" are perfectly valid as both decimal AND hexadecimal numbers. They just mean two different things.
Also note that "3.14159265" is not a hex value no matter which test you use, because of the period. But with the addition of the "{2}" you at least ensure it is a string of whole hex bytes rather than something that merely LOOKS like a hex string.
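Since the question is about Java, here is a rough Java counterpart of the JavaScript comparison above. The class and method names are hypothetical; the sketch only contrasts "any run of hex digits" with "hex digits in pairs" and does not settle whether a given string was meant as hex.

```java
import java.util.regex.Pattern;

class HexStringCheck {
    // Any non-empty run of hex digits.
    private static final Pattern HEX_DIGITS = Pattern.compile("[0-9A-Fa-f]+");
    // Hex digits in pairs, i.e. a whole number of bytes.
    private static final Pattern HEX_BYTES = Pattern.compile("(?:[0-9A-Fa-f]{2})+");

    static boolean isHexDigits(String s) {
        return HEX_DIGITS.matcher(s).matches();
    }

    static boolean isHexBytes(String s) {
        return HEX_BYTES.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String a : new String[] {"1e5", "10", "314159265", "3.14159265"}) {
            System.out.println(a + " digits=" + isHexDigits(a) + " bytes=" + isHexBytes(a));
        }
    }
}
```

The pair check is only appropriate when the input is expected to encode whole bytes; a value such as "1e5" is still a valid hexadecimal number.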
Related
I am new to regular expressions. I want to search for NUMBER(19, 4), and the method should return the values (in this case 19 and 4). But I always get 0 as the result!
int length =0;
length = patternLength(datatype,"^NUMBER\\((\\d+)\\,\\s*\\)$","NUMBER");
private static double patternLengthD(String datatype, String patternString, String startsWith) {
double length=0;
if (datatype.startsWith(startsWith)) {
Pattern patternA = Pattern.compile(patternString);
Matcher matcherA = patternA.matcher(datatype);
if (matcherA.find()) {
length = Double.parseDouble(matcherA.group(1));
}
}
return length;
}
You are missing the matching of digits after the comma.
You also don't need to escape the ,.
Use this:
"^NUMBER\\((\\d+),\\s*(\\d+)\\)$"
This will give you the first number in group(1) and the second number in group(2).
It is however fairly strict on spaces, so you can be more lenient and match on values like " NUMBER ( 19 , 4 ) " by using this:
"^\\s*NUMBER\\s*\\(\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\)\\s*$"
In that case you'll have to drop your startsWith and just use the regex directly. Also, you can remove the anchors (^$) if you change find() to matches().
Since NUMBER(19) is usually allowed too, you can make the second value optional:
"\\s*NUMBER\\s*\\(\\s*(\\d+)\\s*(?:,\\s*(\\d+)\\s*)?\\)\\s*"
group(2) will then return null if the second number is not given.
See regex101 for demo.
Note that your code doesn't compile.
Your method returns a double, but length is an int.
Although 19,4 looks like a floating point number, it is not, and representing it as such is wrong.
You should store the two values separately.
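Putting those points together, a sketch might look like the following. The class NumberTypeParser, the method parse, and the choice to report a missing second value as 0 are all assumptions made for illustration; the pattern is the lenient one with the optional second group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberTypeParser {
    private static final Pattern NUMBER_TYPE = Pattern.compile(
            "\\s*NUMBER\\s*\\(\\s*(\\d+)\\s*(?:,\\s*(\\d+)\\s*)?\\)\\s*");

    // Returns {first, second}, or null if the text is not a NUMBER(...) declaration.
    // Assumption: a missing second value is reported as 0.
    static int[] parse(String datatype) {
        Matcher m = NUMBER_TYPE.matcher(datatype);
        if (!m.matches()) {
            return null;
        }
        int first = Integer.parseInt(m.group(1));
        int second = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        int[] r = parse("NUMBER(19, 4)");
        System.out.println(r[0] + " and " + r[1]);
    }
}
```

Because matches() is used, no anchors and no separate startsWith check are needed.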
I need to validate the EditText value using Regex.
The condition:
User can enter unsigned integer
User can enter floating point value.
I have achieved this using two different patterns, but I have no idea how to check both in a single regex.
I used the following regex:
public boolean validFloatForString(String value) {
String decimalPattern = "([0-9]*)\\.([0-9]*)";
String integerPattern = "([0-9]*)";
boolean match = Pattern.matches(decimalPattern, value)
|| Pattern.matches(integerPattern, value);
System.out.println(match); // if true then decimal else not
return match;
}
Your pattern String decimalPattern = "([0-9]*)\\.([0-9]*)" will match a lone ., while integerPattern will also match the empty string. That's because * means 0 or more.
I would use something like
String pattern = "^([0-9]+(?:\\.[0-9]*)?)";
Which matches unsigned integer and floats.
Edit 1
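To also match unsigned floats beginning with .:
String pattern = "\\b([0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)\\b";
I also replaced the ^, which anchors the match at the beginning of the string, with the word boundary \\b. Note that \\b does not match before a leading . at the very start of the input, because there is no word character on either side; when validating a whole string with matches(), drop the \\b.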
If you are on Android you could use the built-in input filtering for this, e.g. by setting the InputType:
EditText text = new EditText(this);
text.setInputType(InputType.TYPE_CLASS_NUMBER | InputType.TYPE_NUMBER_FLAG_DECIMAL);
You could use something like so: ^[0-9]+(\\.[0-9]+)?$.
The above will make sure that the string is made up from one or more digits which is optionally followed by a decimal point and one or more digits.
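To check both cases with a single pattern, something along these lines could work. The class and method names are hypothetical, and accepting forms such as "12." and ".5" is an assumption carried over from the patterns in the first answer; tighten the pattern if those should be rejected.

```java
import java.util.regex.Pattern;

class UnsignedNumberCheck {
    // Digits with an optional fraction ("12", "12.5", "12."), or a bare fraction (".5").
    private static final Pattern UNSIGNED_NUMBER =
            Pattern.compile("[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+");

    static boolean isUnsignedNumber(String value) {
        // matches() tests the whole string, so no anchors are needed.
        return value != null && UNSIGNED_NUMBER.matcher(value).matches();
    }

    public static void main(String[] args) {
        for (String v : new String[] {"12", "12.5", ".5", ".", "", "-1", "1.2.3"}) {
            System.out.println("\"" + v + "\" -> " + isUnsignedNumber(v));
        }
    }
}
```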
In Java, the String class has a method called matches. How can I use this method to check whether my string contains only digits, using a regular expression? I tried the examples below, but both of them returned false.
String regex = "[0-9]";
String data = "23343453";
System.out.println(data.matches(regex));
String regex = "^[0-9]";
String data = "23343453";
System.out.println(data.matches(regex));
Try
String regex = "[0-9]+";
or
String regex = "\\d+";
As per Java regular expressions, the + means "one or more times" and \d means "a digit".
Note: the "double backslash" is an escape sequence to get a single backslash - therefore, \\d in a java String gives you the actual result: \d
References:
Java Regular Expressions
Java Character Escape Sequences
Edit: due to some confusion in other answers, I am writing a test case and will explain some more things in detail.
Firstly, if you are in doubt about the correctness of this solution (or others), please run this test case:
String regex = "\\d+";
// positive test cases, should all be "true"
System.out.println("1".matches(regex));
System.out.println("12345".matches(regex));
System.out.println("123456789".matches(regex));
// negative test cases, should all be "false"
System.out.println("".matches(regex));
System.out.println("foo".matches(regex));
System.out.println("aa123bb".matches(regex));
Question 1:
Isn't it necessary to add ^ and $ to the regex, so it won't match "aa123bb" ?
No. In java, the matches method (which was specified in the question) matches a complete string, not fragments. In other words, it is not necessary to use ^\\d+$ (even though it is also correct). Please see the last negative test case.
Please note that if you use an online "regex checker" then this may behave differently. To match fragments of a string in Java, you can use the find method instead, described in detail here:
Difference between matches() and find() in Java Regex
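The difference can be seen in a small sketch; the class and method names below are made up for illustration. Both methods use the same \d+ pattern and differ only in whether the whole string or any fragment has to match.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only if the entire string is digits.
    static boolean wholeMatch(String s) {
        return DIGITS.matcher(s).matches();
    }

    // True if any fragment of the string is digits.
    static boolean partialMatch(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("aa123bb"));   // false: letters surround the digits
        System.out.println(partialMatch("aa123bb")); // true: "123" is found inside
    }
}
```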
Question 2:
Won't this regex also match the empty string, ""?
No. A regex \\d* would match the empty string, but \\d+ does not. The star * means zero or more, whereas the plus + means one or more. Please see the first negative test case.
Question 3
Isn't it faster to compile a regex Pattern?
Yes. It is indeed faster to compile a regex Pattern once, rather than on every invocation of matches, and so if performance implications are important then a Pattern can be compiled and used like this:
Pattern pattern = Pattern.compile(regex);
System.out.println(pattern.matcher("1").matches());
System.out.println(pattern.matcher("12345").matches());
System.out.println(pattern.matcher("123456789").matches());
You can also use NumberUtils.isNumber(String str) from Apache Commons Lang.
Regular expressions are comparatively costly in terms of performance. Parsing the string as a long value is inefficient and unreliable, and may not be what you need.
What I suggest is simply checking that each character is a digit, which can be done efficiently with Java 8 streams and a lambda expression:
boolean isNumeric = someString.chars().allMatch(x -> Character.isDigit(x));
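Two details of this approach are worth seeing in code: allMatch returns true for an empty string, and Character.isDigit also accepts digits outside ASCII. The sketch below uses hypothetical names and adds an explicit empty check plus an ASCII-only variant; which behavior is wanted depends on the use case.

```java
class DigitCheck {
    // Unicode-aware: accepts any character that Character.isDigit treats as a digit.
    static boolean isAllUnicodeDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }

    // ASCII-only: accepts just '0' through '9'.
    static boolean isAllAsciiDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
    }

    public static void main(String[] args) {
        String arabicIndicThree = "\u0663"; // a non-ASCII decimal digit
        System.out.println(isAllUnicodeDigits(arabicIndicThree)); // true
        System.out.println(isAllAsciiDigits(arabicIndicThree));   // false
    }
}
```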
One more solution that hasn't been posted yet:
String regex = "\\p{Digit}+"; // uses POSIX character class
You must allow for more than one digit (the + quantifier), as in:
String regex = "[0-9]+";
String data = "23343453";
System.out.println(data.matches(regex));
Use
Long.parseLong(data)
and catch the NumberFormatException; this also handles a minus sign.
Although the number of digits is limited, this actually produces a usable value from the data, which is, I would imagine, the most common use case.
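A sketch of the parse-and-catch approach, assuming a hypothetical helper that wraps the result in an OptionalLong so callers get both the validity check and the value:

```java
import java.util.OptionalLong;

class LongParser {
    // Hypothetical helper: the parsed value, or empty if data is not a valid long.
    static OptionalLong tryParse(String data) {
        try {
            return OptionalLong.of(Long.parseLong(data));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("23343453").isPresent()); // true
        System.out.println(tryParse("-42").getAsLong());      // -42
        System.out.println(tryParse("12a").isPresent());      // false
    }
}
```

Inputs with more digits than a long can hold are reported as invalid here, which is the limit mentioned above.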
We can use either Pattern.compile("[0-9]+(\\.[0-9]+)?") or Pattern.compile("\\d+(\\.\\d+)?"). They have the same meaning.
The pattern [0-9] means a digit, the same as '\d'.
'+' means it appears one or more times.
'\.' matches a literal decimal point, and '(...)?' makes the fractional part optional, so the pattern accepts both integers and floats. An unescaped '.' would match any character.
Try the following code:
import java.util.regex.Pattern;
public class PatternSample {
public boolean containNumbersOnly(String source){
boolean result = false;
// Either spelling accepts both floats and integers; the second assignment replaces the first.
Pattern pattern = Pattern.compile("[0-9]+(\\.[0-9]+)?");
pattern = Pattern.compile("\\d+(\\.\\d+)?");
result = pattern.matcher(source).matches();
if(result){
System.out.println("\"" + source + "\"" + " is a number");
}else
System.out.println("\"" + source + "\"" + " is a String");
return result;
}
public static void main(String[] args){
PatternSample obj = new PatternSample();
obj.containNumbersOnly("123456.a");
obj.containNumbersOnly("123456 ");
obj.containNumbersOnly("123456");
obj.containNumbersOnly("0123456.0");
obj.containNumbersOnly("0123456a.0");
}
}
Output:
"123456.a" is a String
"123456 " is a String
"123456" is a number
"0123456.0" is a number
"0123456a.0" is a String
According to Oracle's Java Documentation:
private static final Pattern NUMBER_PATTERN = Pattern.compile(
"[\\x00-\\x20]*[+-]?(NaN|Infinity|((((\\p{Digit}+)(\\.)?((\\p{Digit}+)?)" +
"([eE][+-]?(\\p{Digit}+))?)|(\\.((\\p{Digit}+))([eE][+-]?(\\p{Digit}+))?)|" +
"(((0[xX](\\p{XDigit}+)(\\.)?)|(0[xX](\\p{XDigit}+)?(\\.)(\\p{XDigit}+)))" +
"[pP][+-]?(\\p{Digit}+)))[fFdD]?))[\\x00-\\x20]*");
boolean isNumber(String s){
return NUMBER_PATTERN.matcher(s).matches();
}
Refer to org.apache.commons.lang3.StringUtils
public static boolean isNumeric(CharSequence cs) {
if (cs == null || cs.length() == 0) {
return false;
} else {
int sz = cs.length();
for(int i = 0; i < sz; ++i) {
if (!Character.isDigit(cs.charAt(i))) {
return false;
}
}
return true;
}
}
In Java, the String class has a method called matches(). With this method you can check whether your string matches a regular expression.
String regex = "^[\\d]{4}$";
String value = "1234";
System.out.println(value.matches(regex));
The explanation for the above regular expression is:
^ - Matches the start of the input.
[] - A character class; inside it you list the characters that are allowed.
\\d - Only allows digits. You can use '\\d' or 0-9 inside the brackets; both are the same.
{4} - Requires exactly 4 digits. You can change the number according to your need.
$ - Matches the end of the input.
Note: You can remove the {4} and specify + which means one or more times, or * which means zero or more times, or ? which means once or not at all.
For more reference please go through this website: https://www.rexegg.com/regex-quickstart.html
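The quantifiers in the note above can be compared side by side. This is only an illustrative sketch with made-up method names; each method applies \d with a different quantifier through String.matches.

```java
class QuantifierDemo {
    static boolean exactlyFour(String s) { return s.matches("\\d{4}"); } // exactly 4 digits
    static boolean oneOrMore(String s)   { return s.matches("\\d+"); }   // at least 1 digit
    static boolean zeroOrMore(String s)  { return s.matches("\\d*"); }   // digits or nothing
    static boolean zeroOrOne(String s)   { return s.matches("\\d?"); }   // one digit or nothing

    public static void main(String[] args) {
        System.out.println(exactlyFour("1234")); // true
        System.out.println(oneOrMore(""));       // false
        System.out.println(zeroOrMore(""));      // true
        System.out.println(zeroOrOne("77"));     // false
    }
}
```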
Official regex way
I would use this regex for integers:
^(0|-?[1-9]\d*)$
This allows an optional minus sign and rejects leading zeros. It uses only basic syntax that most regex engines support, so it should also work in other programming languages.
Also works in Java
\\d+
Questions regarding ^ and $
As @vikingsteve has pointed out, in Java the matches method matches a complete string, not parts of a string. In other words, it is unnecessary to use ^\d+$ (even though it is the conventional way to write the regex).
Online regex checkers typically search for a match anywhere in the input, so they behave differently from Java's matches method.
Try this part of code:
void containsOnlyNumbers(String str)
{
try {
Integer num = Integer.valueOf(str);
System.out.println("is a number");
} catch (NumberFormatException e) {
// TODO: handle exception
System.out.println("is not a number");
}
}
I am trying to check if a range of numbers exist in a string
Any more elegant way than this?
if (ccnumeric.contains("51")
|| ccnumeric.contains("52")
|| ccnumeric.contains("53")
|| ccnumeric.contains("54")
|| ccnumeric.contains("55"))
I can't think of any method that satisfies this as I am checking for an int range in a string.
You could use a simple regex: 5[12345]{1} to look for a 5 followed by exactly one 1,2,3,4 or 5:
String s = "55";
Pattern p = Pattern.compile("5[12345]{1}");
if (p.matcher(s).find()) {
System.out.println("Found");
} else {
System.out.println("Not Found");
}
Maybe you can try regular expressions.
For instance
Pattern.matches( ".*5[1-5].*", ccnumeric );
Edit:
To find all numbers in a string:
List<Integer> allMatches = new ArrayList<Integer>();
Matcher m = Pattern.compile("(\\d+)").matcher(ccnumeric);
while (m.find()) {
allMatches.add(Integer.parseInt(m.group()));
}
Your question is a bit vague. For instance, how many numbers are in the string "541": 1 (541), 3 (5, 4, 1), or 6 (5, 4, 1, 54, 41, 541)? Depending on how you define that, your regexes will change considerably. If your answer is 1, the solution is pretty straightforward: a greedy search for all digit runs whose length is between the number of digits in your min and the number of digits in your max (you can then parse the matches and filter based on the min and max). If it's one of the others, your solution is going to be a bit more complicated: you are basically going to have to create a sliding window to evaluate all the possible combinations in the string.
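For the straightforward case, where each maximal run of digits counts as one number, a sketch could look like this. The class NumbersInRange and its find method are hypothetical, and for simplicity it matches every digit run and filters by value instead of restricting the run length in the regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumbersInRange {
    private static final Pattern DIGIT_RUN = Pattern.compile("\\d+");

    // Collects every maximal digit run whose value lies in [min, max].
    static List<Integer> find(String text, int min, int max) {
        List<Integer> result = new ArrayList<>();
        Matcher m = DIGIT_RUN.matcher(text);
        while (m.find()) {
            try {
                int value = Integer.parseInt(m.group());
                if (value >= min && value <= max) {
                    result.add(value);
                }
            } catch (NumberFormatException e) {
                // Value does not fit in an int, so it cannot be within [min, max]; skip it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(find("541 and 52, then 0053", 51, 55)); // [52, 53]
    }
}
```

Note that a run with leading zeros such as "0053" is treated as 53 in this sketch.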
I think your question is phrased ambiguously. Are you looking to check whether the input string contains a number within the range 51-55? (Your written text seems to imply this.) In this case, the string "512 is a power of two" would not be a match, since it does not contain the number 51.
Or are you looking to check that the string contains the character sequence 51, 52, 53, 54 or 55? (Your code seems to imply this) In this case, the string "512 is a power of two" would be a match, since it contains the character sequence 51.
If the first option is the case, then you'd need something like:
Pattern p = Pattern.compile("(^|[^\\d])5[1-5]($|[^\\d])");
if (p.matcher(inputString).find()) {
//The inputString contains the number you seek
}
Essentially, what this is doing is looking for
a character sequence 51-55
preceded by either the beginning of the string, or by a non-digit character
followed by either the end of the string, or by a non-digit character
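The same idea can also be written with lookarounds, which check the neighbouring characters without consuming them. This is an alternative sketch, not the pattern from the answer above; the class and method names are hypothetical, and the second method shows the plain character-sequence interpretation for contrast.

```java
import java.util.regex.Pattern;

class RangeInString {
    // 51..55 not immediately preceded or followed by another digit.
    private static final Pattern NUMBER_51_TO_55 = Pattern.compile("(?<!\\d)5[1-5](?!\\d)");
    // The character sequence 51..55 anywhere, even inside a longer number.
    private static final Pattern SEQUENCE_51_TO_55 = Pattern.compile("5[1-5]");

    static boolean containsNumber(String s) {
        return NUMBER_51_TO_55.matcher(s).find();
    }

    static boolean containsSequence(String s) {
        return SEQUENCE_51_TO_55.matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "512 is a power of two";
        System.out.println(containsNumber(s));   // false: 51 is part of 512
        System.out.println(containsSequence(s)); // true: the characters "51" occur
    }
}
```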