I would like to implement a regular expression that returns true if:
The string contains only digits
The string does not consist only of 0s (like 0000)
For example:
1230456 => true
888822200000 => true
00000000 => false
fff => false
I started to implement this
private static final String ARTICLE_VALID_FORMAT = "\\d+";
private static final Pattern ARTICLE_VALID_FORMAT_PATTERN = Pattern.compile(ARTICLE_VALID_FORMAT);
private boolean isArticleHasValidFormat(String article) {
return StringUtils.isNotBlank(article) && ARTICLE_VALID_FORMAT_PATTERN.matcher(article).matches();
}
Now it returns true if the article contains only digits, but I would also like to check that it is not all 0s.
How can I do that?
Thanks
You can use:
private static final String ARTICLE_VALID_FORMAT = "[0-9]*?[1-9][0-9]*";
which means:
Match zero or more digits; the ? means to match as few as possible before moving onto the next part
then one digit that's not a zero
then zero or more digits
Or, as Joachim Sauer suggested in comments:
private static final String ARTICLE_VALID_FORMAT = "0*[1-9][0-9]*";
which means:
Match zero or more zeros
then one digit that's not a zero
then zero or more digits
If you wanted to do it without regex, you could use (among many other ways):
string.chars().allMatch(c -> c >= '0' && c <= '9')
&& string.chars().anyMatch(c -> c != '0')
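As a rough sketch, the second pattern could be wired into the validation method from the question along these lines. The class name ArticleFormat and the method name isValid are made up for illustration, and the plain null check stands in for StringUtils.isNotBlank so that the example has no external dependency:

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the question.
final class ArticleFormat {
    // Any number of leading zeros, one non-zero digit, then any digits.
    private static final Pattern VALID = Pattern.compile("0*[1-9][0-9]*");

    static boolean isValid(String article) {
        // matches() requires the whole string to match, so no anchors are needed.
        return article != null && VALID.matcher(article).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1230456", "888822200000", "00000000", "fff"}) {
            System.out.println(s + " => " + isValid(s));
        }
    }
}
```

An empty string is rejected as well, because the pattern requires at least one non-zero digit.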
The regex pattern \d*[1-9]\d* as given by @AndyTurner is a good way to do this. Another approach would be to try to parse the string input to a long, and then check that it is greater than zero:
private boolean isArticleHasValidFormat(String article) {
try {
if (Long.parseLong(article) > 0) return true;
}
catch (NumberFormatException e) {
}
return false;
}
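This solution assumes that you are only concerned with finding positive numbers. If not, and you want to cater to negatives, then check that the parsed value is != 0 instead. Note that Long.parseLong also throws NumberFormatException for digit strings too large for a long, so very long inputs are rejected.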
Try this condition.
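Integer.parseInt("0" + article.replaceAll("[^0-9]", "0")) != 0
The "0" + prefix avoids a NumberFormatException for an empty string. Note that this only checks for a non-zero digit: non-digit characters are replaced by 0, so a string such as 12a would still pass unless you combine it with a digits-only check.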
You don't need to make a Pattern object. Just call the matches method of the String class:
article.matches("\\d*[1-9]\\d*");
It's the same regex as Andy Turner suggested.
I have a problem, and I think it comes from comparing a char with a number.
String FindCountry = "BB";
Map<String, String> Cont = new HashMap <> ();
Cont.put("BA-BE", "Angola");
Cont.put("9X-92", "Trinidad & Tobago");
for ( String key : Cont.keySet()) {
if (key.charAt(0) == FindCountry.charAt(0) && FindCountry.charAt(1) >= key.charAt(1) && FindCountry.charAt(1) <= key.charAt(4)) {
System.out.println("Country: "+ Cont.get(key));
}
}
In this case the code prints "Angola", but if
String FindCountry = "9Z";
it doesn't print anything. I think the problem is that '2' does not compare as greater than 'Z'. This example has only two Cont.put() calls, but my file has many more, and many of the keys contain digits as well as letters. Those are the ones that cause trouble.
What is the best way to compare a char with a digit? If I could set a rule such as "1" is greater than "Z", that would be fine, because I need this ordering: A-Z-9-0.
Thanks!
You can use a lookup "table", I used a String:
private static final String LOOKUP = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
Then compare the chars with indexOf(). It seems a bit messy and there is probably a simpler way, but I can't think of one at the moment:
String FindCountry = "9Z";
Map<String, String> Cont = new HashMap<>();
Cont.put("BA-BE", "Angola");
Cont.put("9X-92", "Trinidad & Tobago");
for (String key : Cont.keySet()) {
if (LOOKUP.indexOf(key.charAt(0)) == LOOKUP.indexOf(FindCountry.charAt(0)) &&
LOOKUP.indexOf(FindCountry.charAt(1)) >= LOOKUP.indexOf(key.charAt(1)) &&
LOOKUP.indexOf(FindCountry.charAt(1)) <= LOOKUP.indexOf(key.charAt(4))) {
System.out.println("Country: " + Cont.get(key));
}
}
If you only use the characters A-Z and 0-9, you could add a conversion method in between which will increase the values of the 0-9 characters so they'll be after A-Z:
int applyCharOrder(char c){
// If the character is a digit:
if(c < 58){
// Add 43 to put it after the 'Z' in terms of decimal unicode value:
return c + 43;
}
// If it's an uppercase letter instead: simply return it as is
return c;
}
Which can be used like this:
if(applyCharOrder(key.charAt(0)) == applyCharOrder(findCountry.charAt(0))
&& applyCharOrder(findCountry.charAt(1)) >= applyCharOrder(key.charAt(1))
&& applyCharOrder(findCountry.charAt(1)) <= applyCharOrder(key.charAt(4))){
System.out.println("Country: "+ cont.get(key));
}
Note: In the table of decimal Unicode values, the characters '0'-'9' have the values 48-57 and 'A'-'Z' have the values 65-90. So the < 58 check tests whether the character is a digit, and the + 43 raises 48-57 to 91-100, putting the digits above 'A'-'Z' so that your <= and >= checks work as you want them to.
Alternatively, you could create a look-up String and use its index for the order:
int applyCharOrder(char c){
return "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".indexOf(c);
}
PS: As mentioned in the first comment by @Stultuske, variable names usually use camelCase, so they don't start with an uppercase letter.
As the others stated in the comments, such mathematical comparison operations on characters are based on the actual ASCII values of each char. So I'd suggest you refactor your logic using the ASCII table as reference.
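Putting the lookup-string idea into a reusable method might look like the sketch below. The class name PrefixRange and the method name inRange are invented for illustration, and the sketch assumes every key has the five-character shape of "BA-BE" and that the desired order is A-Z followed by 0-9, as in the lookup string above:

```java
// Hypothetical helper; names and the key layout are assumptions, not from the question.
final class PrefixRange {
    // Assumed ordering: letters first, then digits.
    private static final String ORDER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // rangeKey is assumed to look like "BA-BE": a shared first character,
    // with the second character spanning an inclusive range.
    static boolean inRange(String code, String rangeKey) {
        if (code.length() != 2 || rangeKey.length() != 5) {
            return false;
        }
        int value = ORDER.indexOf(code.charAt(1));
        int low = ORDER.indexOf(rangeKey.charAt(1));
        int high = ORDER.indexOf(rangeKey.charAt(4));
        if (value < 0 || low < 0 || high < 0) {
            return false; // a character outside A-Z and 0-9
        }
        return code.charAt(0) == rangeKey.charAt(0) && low <= value && value <= high;
    }
}
```

With this ordering 'Z' sits between 'X' and '2', so "9Z" falls inside "9X-92".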
What is an efficient way to determine whether a given integer matches the result of the expression 5+4n? Say I have the number 54 and want to check whether the expression above can produce it. One way I can think of is to evaluate 5+4n for increasing n up to 54 and check whether any result matches the number of interest. But that is inefficient when I have to check big numbers.
If we assume that x is always greater than or equal to 0, x=5+4n can be rewritten as x=1+4(n+1).
This means that when x is divided by 4, the remainder is 1. (If n must also be non-negative, additionally require x >= 5, since x=1 corresponds to n=-1.) So you can simply do the following:
private boolean doesMatch(int x) {
return x % 4 == 1;
}
If you want to evaluate the expression A+Bn, with A and B configurable, you should extract A and B first:
Pattern p = Pattern.compile("(\\-?\\d+)\\s*\\+([\\-\\+]?\\d+)\\s*n");
Matcher m = p.matcher(expression);
if (m.matches()) {
int A = Integer.valueOf(m.group(1));
int B = Integer.valueOf(m.group(2));
// Evaluate expression
...
}
and then simply check if x is a solution:
// Evaluate expression
if ((x-A) % B == 0) {
// Number is a solution
} else {
// Number is not a solution for the expression
}
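The two fragments above can be combined into one small method. This is only a sketch under stated assumptions: the class name LinearExpression and the method name isSolution are hypothetical, the pattern additionally tolerates surrounding spaces, n is assumed to be a non-negative integer, and overflow of x - A is ignored:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are not from the answers above.
final class LinearExpression {
    // Accepts text of the form "A+Bn", for example "5+4n" or "5 + -4n".
    private static final Pattern FORM =
            Pattern.compile("\\s*(-?\\d+)\\s*\\+\\s*([-+]?\\d+)\\s*n\\s*");

    // Returns true if x == A + B*n for some integer n >= 0 (assumed constraint).
    static boolean isSolution(String expression, long x) {
        Matcher m = FORM.matcher(expression);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not of the form A+Bn: " + expression);
        }
        long a = Long.parseLong(m.group(1));
        long b = Long.parseLong(m.group(2));
        if (b == 0) {
            return x == a; // avoid dividing by zero
        }
        long diff = x - a;
        // n = diff / b must be a whole number and must not be negative.
        return diff % b == 0 && diff / b >= 0;
    }
}
```

For the number from the question, 54 is rejected because 54 - 5 = 49 is not a multiple of 4, while 53 is accepted with n = 12.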
I have written a regex for numbers from 0 to 31. It should not allow leading zeros.
[0-2]\\d|/3[0-2]
But it also allows leading zeros.
01 invalid
02 invalid
Can someone tell me how to fix this?
You can use the following regex:
^(?:[0-9]|[12][0-9]|3[01])$
Your regex - [0-2]\\d|/3[0-2] - contains 2 alternatives: 1) [0-2]\\d matches a digit from 0-2 range first and then any 1 digit (with \\d), and 2) /3[0-2] matches /, then 3 and then 1 digit from 0-2 range. What is important is that without anchors (^ and $) this expression will match substrings in longer strings, and will match 01 in 010.
Since there has been some discussion about shorthand classes, here is a version that uses the shorthand class \d. It also uses matches(), which requires the full input to match, so we do not need explicit anchors:
String pttrn = "(?:\\d|[12]\\d|3[01])";
System.out.println("31".matches(pttrn));
Note that the backslash must be doubled inside a Java string literal.
You can try with the following pattern:
^(?:[12]?[0-9]|3[01])$
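A quick way to convince yourself that this pattern behaves as intended is to run it against the edge cases. The sketch below uses a made-up class name, DayNumber, and drops the anchors because Matcher.matches() already requires the whole input to match:

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are for illustration only.
final class DayNumber {
    // 0-9, 10-29 (optional leading 1 or 2), or 30-31.
    private static final Pattern ZERO_TO_31 = Pattern.compile("[12]?[0-9]|3[01]");

    static boolean isValid(String s) {
        return s != null && ZERO_TO_31.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0", "9", "10", "31", "32", "01", "010"}) {
            System.out.println(s + " => " + isValid(s));
        }
    }
}
```

Inputs with a leading zero such as 01 are rejected because the optional first character can only be 1 or 2.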
Just another non-regex approach, with data validation before attempting to convert a String to int. Here we validate that the data is either exactly 1 character that is a digit, or 2 characters that are both digits where the first character is not a 0.
public static void main(String[] args) throws Exception {
List<String> data = new ArrayList<String>() {{
add("01"); // Bad
add("1A"); // Bad
add("123"); // Bad
add("31"); // Good
add("-1"); // Bad
add("32"); // Bad
add("0"); // Good
add("15"); // Good
}};
for (String d : data) {
boolean valid = true;
if (d.isEmpty()) {
valid = false;
} else {
char firstChar = d.charAt(0);
if ((d.length() == 1 && Character.isDigit(firstChar)) ||
(d.length() == 2 &&
(Character.isDigit(firstChar) && firstChar != '0' &&
Character.isDigit(d.charAt(1))))) {
int myInt = Integer.parseInt(d);
valid = (0 <= myInt && myInt <= 31);
} else {
valid = false;
}
}
System.out.println(valid ? "Valid" : "Invalid");
}
}
Results:
Invalid
Invalid
Invalid
Valid
Invalid
Invalid
Valid
Valid
Another option:
\\b(?:[12]?\\d|3[01])\\b
This regex does not use a non-capturing group:
^(\d|[12]\d|3[01])$
Explanation:
^ - start of line
\d - single digit 0-9
or [12]\d - tens and twenties
or 3[01] - thirty and thirty-one
$ - end of line
It is harder to maintain code with regex in it: see When you should not use Regular Expressions
In order to make your code more maintainable and easier for other developers to jump into and support, maybe you could consider converting your String to an Integer and then testing the value?
if((!inputString.startsWith("0") && inputString.length() == 2) || inputString.length() == 1){
Integer myInt = Integer.parseInt(inputString);
if( 0 <= myInt && myInt <= 31){
//execute logic...
}
}
You could also easily break this out into a descriptively named utility method, such as:
private boolean isBetween0And31Inclusive(String inputString){
try{
if((!inputString.startsWith("0") && inputString.length() == 2) || inputString.length() == 1){
Integer myInt = Integer.parseInt(inputString);
if(0 <= myInt && myInt <= 31){
return true;
}
}
return false;
}catch(NumberFormatException exception){
return false;
}
}
This Java code is giving me trouble:
String word = <Uses an input>
int y = 3;
char z;
do {
z = word.charAt(y);
if (z!='a' || z!='e' || z!='i' || z!='o' || z!='u')) {
for (int i = 0; i==y; i++) {
wordT = wordT + word.charAt(i);
} break;
}
} while(true);
I want to check whether the letter at index 3 of word (the fourth letter) is a non-vowel, and if it is, I want to return that non-vowel and all the characters preceding it. If it is a vowel, the code should check the next letter in the string, and keep going until it finds a non-vowel.
Example:
word = Jaemeas then wordT must = Jaem
Example 2:
word=Jaeoimus then wordT must =Jaeoim
The problem is with my if statement, I can't figure out how to make it check all the vowels in that one line.
Clean method to check for vowels:
public static boolean isVowel(char c) {
return "AEIOUaeiou".indexOf(c) != -1;
}
Your condition is flawed. Think about the simpler version
z != 'a' || z != 'e'
If z is 'a' then the second half will be true since z is not 'e' (i.e. the whole condition is true), and if z is 'e' then the first half will be true since z is not 'a' (again, whole condition true). Of course, if z is neither 'a' nor 'e' then both parts will be true. In other words, your condition will never be false!
You likely want &&s there instead:
z != 'a' && z != 'e' && ...
Or perhaps:
"aeiou".indexOf(z) < 0
How about an approach using regular expressions? If you use the proper pattern you can get the results from the Matcher object using groups. In the code sample below the call to m.group(1) should return you the string you're looking for as long as there's a pattern match.
String wordT = null;
Pattern patternOne = Pattern.compile("^([\\w]{2}[AEIOUaeiou]*[^AEIOUaeiou]{1}).*");
Matcher m = patternOne.matcher("Jaemeas");
if (m.matches()) {
wordT = m.group(1);
}
Just a little different approach that accomplishes the same goal.
Actually there are much more efficient ways to check this, but since you asked what the problem with yours is: you have to replace those OR operators with AND operators. With the ORs, your if condition is always true.
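Combining the corrected condition with a bounded loop, the whole task could be sketched as follows. The class name VowelPrefix and both method names are invented here, and returning an empty string when no non-vowel is found is an assumption, since the question does not say what should happen in that case:

```java
// Hypothetical helper; names are for illustration only.
final class VowelPrefix {
    static boolean isVowel(char c) {
        return "AEIOUaeiou".indexOf(c) >= 0;
    }

    // Returns the prefix of word that ends at the first non-vowel found at or
    // after index start, or "" if there is none (an assumed fallback).
    static String throughFirstNonVowel(String word, int start) {
        for (int y = start; y < word.length(); y++) {
            if (!isVowel(word.charAt(y))) {
                return word.substring(0, y + 1);
            }
        }
        return "";
    }
}
```

Calling throughFirstNonVowel(word, 3) mirrors the y = 3 starting point used in the question.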
In case anyone comes across this and wants an easy comparison that can be used in many scenarios:
It doesn't matter whether the letter is UPPERCASE or lowercase. It works for A-Z and a-z.
bool vowel = ((1 << letter) & 2130466) != 0;
This is the easiest way I could think of. I tested it in C++ on a 64-bit PC, so results may differ, but the idea is that a 32-bit integer only has 32 bits available, so the shift amount is effectively reduced modulo 32: bits 64 and 32 of the letter's value are dropped, leaving a value from 1 to 26 when performing "<< letter". (In C++ a shift by 32 or more is formally undefined behavior; Java defines an int shift to use only the low five bits of the shift distance.)
If you are not familiar with bit operations, I am not going to go into depth here, but
1 << N is the same thing as 2^N, that is, it creates a power of two.
So with (1 << N) & X we check whether X contains the power of two that corresponds to our letter. The constant 2130466 has exactly the bits for a, e, i, o and u set (bits 1, 5, 9, 15 and 21). If the result is not 0, the letter is a vowel.
This technique applies to anything you use bits for, and values larger than 31 work as an index as long as the reduced value falls in the range 0 to 31. Letters are 65-90 or 97-122, but removing 32 repeatedly leaves a remainder from 1 to 26. That is not literally how it works, but it gives you an idea of the process.
Keep in mind that if you have no guarantee about the incoming characters, you should first check whether the character is below 'A' or above 'u', since the result should be false for those anyway.
For example, the following returns a false positive: the exclamation point '!' has the value 33 and produces the same bit as 'A' or 'a' would.
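Translated to Java, where the shift distance of an int shift is defined to use only its low five bits, the same trick might look like the sketch below. The class name VowelMask is invented for illustration, and the explicit letter check is added to rule out false positives such as '!':

```java
// Hypothetical helper; the name is for illustration only.
final class VowelMask {
    // Bits 1, 5, 9, 15 and 21 set: the alphabet positions of a, e, i, o, u.
    private static final int VOWEL_BITS = 2130466;

    static boolean isVowel(char letter) {
        // Guard against non-letters, which would otherwise alias onto letter bits.
        boolean isAsciiLetter =
                (letter >= 'A' && letter <= 'Z') || (letter >= 'a' && letter <= 'z');
        // For an int shift, Java uses only the low five bits of the distance,
        // so 'A' (65) and 'a' (97) both shift by 1.
        return isAsciiLetter && ((1 << letter) & VOWEL_BITS) != 0;
    }
}
```

Whether this is actually faster than an indexOf lookup is not measured here; it mainly shows why the constant works.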
For starters, you are checking if the letter is "not a" OR "not e" OR "not i" etc.
Let's say that the letter is i. Then the letter is not a, so that comparison returns true, and the entire statement is true because i != a. I think what you are looking for is to AND the comparisons together, not OR them.
Once you do this, you need to look at how to increment y and check this again. If the first time you get a vowel, you want to see if the next character is a vowel too, or not. This only checks the character at location y=3.
String word="Jaemeas";
String wordT="";
int y=3;
// Scan from index 3 until a non-vowel is found or the word ends
while(y<word.length()){
    char z=word.charAt(y);
    if(z!='a'&&z!='e'&&z!='i'&&z!='o'&&z!='u'){
        // Copy everything up to and including the non-vowel
        wordT=word.substring(0,y+1);
        break;
    }
    y++;
}
Here is my answer.
I have declared a char[] constant for the VOWELS, then implemented a method that checks whether a char is a vowel or not (returning a boolean value). In my main method, I am declaring a string and converting it to an array of chars, so that I can pass the index of the char array as the parameter of my isVowel method:
public class FindVowelsInString {
static final char[] VOWELS = {'a', 'e', 'i', 'o', 'u'};
public static void main(String[] args) {
String str = "hello";
char[] array = str.toCharArray();
//Check with a consonant
boolean vowelChecker = FindVowelsInString.isVowel(array[0]);
System.out.println("Is this character a vowel? " + vowelChecker);
//Check with a vowel
boolean vowelChecker2 = FindVowelsInString.isVowel(array[1]);
System.out.println("Is this character a vowel? " + vowelChecker2);
}
private static boolean isVowel(char vowel) {
boolean isVowel = false;
for (int i = 0; i < FindVowelsInString.getVowel().length; i++) {
if (FindVowelsInString.getVowel()[i] == vowel) {
isVowel = true;
}
}
return isVowel;
}
public static char[] getVowel() {
return FindVowelsInString.VOWELS;
}
}
Is there a better, more elegant (and/or possibly faster) way than
boolean isNumber = false;
try{
Double.valueOf(myNumber);
isNumber = true;
} catch (NumberFormatException e) {
}
...?
Edit:
Since I can't pick two answers I'm going with the regex one because a) it's elegant and b) saying "Jon Skeet solved the problem" is a tautology because Jon Skeet himself is the solution to all problems.
I don't believe there's anything built into Java to do it faster and still reliably, assuming that later on you'll want to actually parse it with Double.valueOf (or similar).
I'd use Double.parseDouble instead of Double.valueOf to avoid creating a Double unnecessarily, and you can also get rid of blatantly silly numbers quicker than the exception will by checking for digits, e/E, - and . beforehand. So, something like:
public boolean isDouble(String value)
{
boolean seenDot = false;
boolean seenExp = false;
boolean justSeenExp = false;
boolean seenDigit = false;
for (int i=0; i < value.length(); i++)
{
char c = value.charAt(i);
if (c >= '0' && c <= '9')
{
seenDigit = true;
continue;
}
if ((c == '-' || c=='+') && (i == 0 || justSeenExp))
{
continue;
}
if (c == '.' && !seenDot)
{
seenDot = true;
continue;
}
justSeenExp = false;
if ((c == 'e' || c == 'E') && !seenExp)
{
seenExp = true;
justSeenExp = true;
continue;
}
return false;
}
if (!seenDigit)
{
return false;
}
try
{
Double.parseDouble(value);
return true;
}
catch (NumberFormatException e)
{
return false;
}
}
Note that despite taking a couple of tries, this still doesn't cover "NaN" or hex values. Whether you want those to pass or not depends on context.
In my experience regular expressions are slower than the hard-coded check above.
You could use a regex, i.e. something like String.matches("^[\\d\\-\\.]+$"); (if you're not testing for negative numbers or floating point numbers you could simplify a bit).
Not sure whether that would be faster than the method you outlined though.
Edit: in the light of all this controversy, I decided to make a test and get some data about how fast each of these methods were. Not so much the correctness, but just how quickly they ran.
You can read about my results on my blog. (Hint: Jon Skeet FTW).
See java.text.NumberFormat (javadoc).
NumberFormat nf = NumberFormat.getInstance(Locale.FRENCH);
Number myNumber = nf.parse(myString);
int myInt = myNumber.intValue();
double myDouble = myNumber.doubleValue();
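Note that NumberFormat.parse(String) stops at the first character it cannot interpret, so "12abc" would parse as 12 without complaint. One way to turn it into a strict yes/no check is the ParsePosition overload, sketched below with a made-up class name, LocalizedNumbers, and method name, isNumber:

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper; the names are for illustration only.
final class LocalizedNumbers {
    static boolean isNumber(String text, Locale locale) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        NumberFormat format = NumberFormat.getInstance(locale);
        ParsePosition position = new ParsePosition(0);
        // This overload returns null on failure instead of throwing.
        Number parsed = format.parse(text, position);
        // Accept only if parsing succeeded and consumed the whole input.
        return parsed != null && position.getIndex() == text.length();
    }
}
```

What counts as a number then depends on the locale: with Locale.FRENCH the decimal separator is a comma.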
The correct regex is actually given in the Double javadocs:
To avoid calling this method on an invalid string and having a NumberFormatException be thrown, the regular expression below can be used to screen the input string:
final String Digits = "(\\p{Digit}+)";
final String HexDigits = "(\\p{XDigit}+)";
// an exponent is 'e' or 'E' followed by an optionally
// signed decimal integer.
final String Exp = "[eE][+-]?"+Digits;
final String fpRegex =
("[\\x00-\\x20]*"+ // Optional leading "whitespace"
"[+-]?(" + // Optional sign character
"NaN|" + // "NaN" string
"Infinity|" + // "Infinity" string
// A decimal floating-point string representing a finite positive
// number without a leading sign has at most five basic pieces:
// Digits . Digits ExponentPart FloatTypeSuffix
//
// Since this method allows integer-only strings as input
// in addition to strings of floating-point literals, the
// two sub-patterns below are simplifications of the grammar
// productions from the Java Language Specification, 2nd
// edition, section 3.10.2.
// Digits ._opt Digits_opt ExponentPart_opt FloatTypeSuffix_opt
"((("+Digits+"(\\.)?("+Digits+"?)("+Exp+")?)|"+
// . Digits ExponentPart_opt FloatTypeSuffix_opt
"(\\.("+Digits+")("+Exp+")?)|"+
// Hexadecimal strings
"((" +
// 0[xX] HexDigits ._opt BinaryExponent FloatTypeSuffix_opt
"(0[xX]" + HexDigits + "(\\.)?)|" +
// 0[xX] HexDigits_opt . HexDigits BinaryExponent FloatTypeSuffix_opt
"(0[xX]" + HexDigits + "?(\\.)" + HexDigits + ")" +
")[pP][+-]?" + Digits + "))" +
"[fFdD]?))" +
"[\\x00-\\x20]*");// Optional trailing "whitespace"
if (Pattern.matches(fpRegex, myString))
Double.valueOf(myString); // Will not throw NumberFormatException
else {
// Perform suitable alternative action
}
This does not allow for localized representations, however:
To interpret localized string representations of a floating-point value, use subclasses of NumberFormat.
Use NumberUtils.isCreatable(String) (or NumberUtils.isNumber(String) in older versions) from Apache Commons Lang.
Leveraging off Mr. Skeet:
private boolean IsValidDoubleChar(char c)
{
return "0123456789.+-eE".indexOf(c) >= 0;
}
public boolean isDouble(String value)
{
for (int i=0; i < value.length(); i++)
{
char c = value.charAt(i);
if (IsValidDoubleChar(c))
continue;
return false;
}
try
{
Double.parseDouble(value);
return true;
}
catch (NumberFormatException e)
{
return false;
}
}
Most of these answers are somewhat acceptable solutions. All of the regex solutions have the issue of not being correct for all cases you may care about.
If you really want to ensure that the String is a valid number, then I would use your own solution. Don't forget that, most of the time, the String will presumably be a valid number and won't raise an exception. So most of the time the performance will be identical to that of Double.valueOf().
I guess this really isn't an answer, except that it validates your initial instinct.
Randy
I would use Jakarta commons-lang, as always! But I have no idea whether their implementation is fast or not. It doesn't rely on exceptions, which might be a good thing performance-wise...
Following Phill's answer can I suggest another regex?
String.matches("^-?\\d+(\\.\\d+)?$");
I prefer using a loop over the String's char[] representation and using the Character.isDigit() method. If elegance is desired, I think this is the most readable:
package tias;
public class Main {
private static final String NUMERIC = "123456789";
private static final String NOT_NUMERIC = "1L5C";
public static void main(String[] args) {
System.out.println(isStringNumeric(NUMERIC));
System.out.println(isStringNumeric(NOT_NUMERIC));
}
private static boolean isStringNumeric(String aString) {
if (aString == null || aString.length() == 0) {
return false;
}
for (char c : aString.toCharArray() ) {
if (!Character.isDigit(c)) {
return false;
}
}
return true;
}
}
If you want something that's blisteringly fast, and you have a very clear idea of what formats you want to accept, you can build a state machine DFA by hand. This is essentially how regexes work under the hood anyway, but you can avoid the regex compilation step this way, and it may well be faster than a generic regex compiler.
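To make the idea concrete, here is a minimal hand-written DFA. It is a sketch only: the class name DecimalDfa is invented, and the accepted format is assumed to be the same as the regex ^-?\d+(\.\d+)?$ suggested earlier (an optional minus sign, digits, and an optional fraction), not everything Double.parseDouble accepts:

```java
// Hypothetical example; the accepted format is an assumption, see above.
final class DecimalDfa {
    private enum State { START, SIGN, INT, DOT, FRAC }

    static boolean accepts(String s) {
        State state = State.START;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            switch (state) {
                case START:
                    if (c == '-') state = State.SIGN;
                    else if (digit) state = State.INT;
                    else return false;
                    break;
                case SIGN:
                    if (digit) state = State.INT;
                    else return false;
                    break;
                case INT:
                    if (c == '.') state = State.DOT;
                    else if (!digit) return false;
                    break;
                case DOT:
                case FRAC:
                    // After the dot only digits are allowed.
                    if (digit) state = State.FRAC;
                    else return false;
                    break;
            }
        }
        // Accepting states: at least one integer digit, and if there was a
        // dot, at least one digit after it.
        return state == State.INT || state == State.FRAC;
    }
}
```

Each character is inspected exactly once and nothing is allocated, which is where any speed advantage would come from; whether it beats a precompiled Pattern in practice would need measuring.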