I have written a regex for numbers from 0 to 31. It should not allow leading zeros.
[0-2]\\d|/3[0-2]
But it also allows leading zeros.
01 invalid
02 invalid
Can someone tell me how to fix this?
You can use the following regex:
^(?:[0-9]|[12][0-9]|3[01])$
Your regex - [0-2]\\d|/3[0-2] - contains 2 alternatives: 1) [0-2]\\d matches a digit from 0-2 range first and then any 1 digit (with \\d), and 2) /3[0-2] matches /, then 3 and then 1 digit from 0-2 range. What is important is that without anchors (^ and $) this expression will match substrings in longer strings, and will match 01 in 010.
Since there has been some discussion about shorthand classes, here is a version with the shorthand class and here is also an example with matches() that requires full input to match and thus we do not need explicit anchors:
String pttrn = "(?:\\d|[12]\\d|3[01])";
System.out.println("31".matches(pttrn));
Note that the backslash should be doubled here.
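To make the difference between substring matching and whole-input matching concrete, here is a minimal sketch. The class and method names (AnchorDemo, foundInside, isDayNumber) are made up for illustration; only Pattern, Matcher and String.matches() come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // The pattern from the question, unchanged and without anchors.
    private static final Pattern ORIGINAL = Pattern.compile("[0-2]\\d|/3[0-2]");

    // Returns the first substring the unanchored pattern finds, or null if there is none.
    static String foundInside(String input) {
        Matcher m = ORIGINAL.matcher(input);
        return m.find() ? m.group() : null;
    }

    // matches() has to consume the whole input, so no explicit ^ and $ are needed.
    static boolean isDayNumber(String input) {
        return input.matches("(?:\\d|[12]\\d|3[01])");
    }

    public static void main(String[] args) {
        System.out.println(foundInside("010"));  // 01
        System.out.println(isDayNumber("01"));   // false
        System.out.println(isDayNumber("31"));   // true
        System.out.println(isDayNumber("32"));   // false
    }
}
```

With find() the original pattern is satisfied by a fragment of the input, while matches() rejects anything that is not, as a whole, a number from 0 to 31 without a leading zero.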
You can try with the following pattern:
^(?:[12]?[0-9]|3[01])$
Here is a non-regex approach that validates the data before attempting to convert a String to int. We check that the data is either exactly 1 character that is a digit, or 2 characters that are both digits, where the first character is not a 0.
public static void main(String[] args) throws Exception {
List<String> data = new ArrayList<String>() {{
add("01"); // Bad
add("1A"); // Bad
add("123"); // Bad
add("31"); // Good
add("-1"); // Bad
add("32"); // Bad
add("0"); // Good
add("15"); // Good
}};
for (String d : data) {
boolean valid = true;
if (d.isEmpty()) {
valid = false;
} else {
char firstChar = d.charAt(0);
if ((d.length() == 1 && Character.isDigit(firstChar)) ||
(d.length() == 2 &&
(Character.isDigit(firstChar) && firstChar != '0' &&
Character.isDigit(d.charAt(1))))) {
int myInt = Integer.parseInt(d);
valid = (0 <= myInt && myInt <= 31);
} else {
valid = false;
}
}
System.out.println(valid ? "Valid" : "Invalid");
}
}
Results:
Invalid
Invalid
Invalid
Valid
Invalid
Invalid
Valid
Valid
Another option:
\\b(?:[12]?\\d|3[01])\\b
This regex does not use a non-capturing group:
^(\d|[12]\d|3[01])$
Explanation:
^ - start of line
\d - a single digit 0-9
or [12]\d - tens and twenties
or 3[01] - thirty and thirty-one
$ - end of line
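One way to gain confidence in a range pattern like this is to compare it against the plain integer check for every candidate. The following is only a sketch; RangeRegexCheck, accepts and agreesWithRange are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class RangeRegexCheck {
    private static final Pattern DAY = Pattern.compile("^(\\d|[12]\\d|3[01])$");

    static boolean accepts(String s) {
        return DAY.matcher(s).matches();
    }

    // True if, for every i in 0..99, the pattern accepts String.valueOf(i)
    // exactly when i is in the range 0..31.
    static boolean agreesWithRange() {
        for (int i = 0; i <= 99; i++) {
            if (accepts(String.valueOf(i)) != (i <= 31)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesWithRange()); // true
        System.out.println(accepts("01"));     // false
        System.out.println(accepts("031"));    // false
    }
}
```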
It is harder to maintain code with regex in it: see When you should not use Regular Expressions
In order to make your code more maintainable and easier for other developers to jump into and support, maybe you could consider converting your String to an Integer and then testing the value?
if((!inputString.startsWith("0") && inputString.length() == 2) || inputString.length() == 1){
Integer myInt = Integer.parseInt(inputString);
if( 0 <= myInt && myInt <= 31){
//execute logic...
}
}
You could also easily break this out into a descriptively named utility method, such as:
private boolean isBetween0And31Inclusive(String inputString){
try{
if((!inputString.startsWith("0") && inputString.length() == 2) || inputString.length() == 1){
Integer myInt = Integer.parseInt(inputString);
if(0 <= myInt && myInt <= 31){
return true;
}
}
return false;
}catch(NumberFormatException exception){
return false;
}
}
I would like to implement a regular expression that returns true if:
The string contains only digits
The string does not consist only of zeros (like 0000)
For example:
1230456 => true
888822200000 => true
00000000 => false
fff => false
I started to implement this
private static final String ARTICLE_VALID_FORMAT = "\\d";
private static final Pattern ARTICLE_VALID_FORMAT_PATTERN = Pattern.compile(ARTICLE_VALID_FORMAT);
private boolean isArticleHasValidFormat(String article) {
return StringUtils.isNotBlank(article) && ARTICLE_VALID_FORMAT_PATTERN.matcher(article).matches();
}
Now it returns true if the article contains only digits, but I would also like to check that it is not all zeros.
How to do that?
Thanks
You can use:
private static final String ARTICLE_VALID_FORMAT = "[0-9]*?[1-9][0-9]*";
which means:
Match zero or more digits; the ? means to match as few as possible before moving onto the next part
then one digit that's not a zero
then zero or more digits
Or, as Joachim Sauer suggested in comments:
private static final String ARTICLE_VALID_FORMAT = "0*[1-9][0-9]*";
which means:
Match zero or more zeros
then one digit that's not a zero
then zero or more digits
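Both patterns can be checked against the examples from the question. This is a small sketch, assuming the patterns are used with matches(); the class name ArticleFormat and its method names are invented for the example.

```java
import java.util.regex.Pattern;

class ArticleFormat {
    private static final Pattern LAZY = Pattern.compile("[0-9]*?[1-9][0-9]*");
    private static final Pattern ZEROS_FIRST = Pattern.compile("0*[1-9][0-9]*");

    static boolean validLazy(String s) {
        return LAZY.matcher(s).matches();
    }

    static boolean validZerosFirst(String s) {
        return ZEROS_FIRST.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1230456", "888822200000", "00000000", "fff", ""};
        for (String s : samples) {
            // Print the sample followed by the verdict of each pattern.
            System.out.println("'" + s + "' " + validLazy(s) + " " + validZerosFirst(s));
        }
    }
}
```

Both variants need at least one non-zero digit somewhere, so an empty string and an all-zero string are rejected.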
If you wanted to do it without regex, you could use (among many other ways):
string.chars().allMatch(c -> c >= '0' && c <= '9')
&& string.chars().anyMatch(c -> c != '0')
The regex pattern \d*[1-9]\d* as given by @AndyTurner is a good way to do this. Another approach would be to try to parse the string input to a long, and then check that it is greater than zero:
private boolean isArticleHasValidFormat(String article) {
try {
if (Long.parseLong(article) > 0) return true;
}
catch (NumberFormatException e) {
}
return false;
}
This solution assumes that you are only concerned with finding positive numbers. If not, and you want to cater to negatives, then check num != 0 instead.
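Two edge cases of the parsing approach are worth knowing about. The sketch below wraps the same logic in a hypothetical helper (isPositiveLong in a made-up class ParseLongCaveats): on Java 7 and later, Long.parseLong accepts a leading plus sign, and a digit string that does not fit into a long is rejected even though it consists only of digits.

```java
class ParseLongCaveats {
    // Same idea as above: parse the input, then require a positive value.
    static boolean isPositiveLong(String article) {
        try {
            return Long.parseLong(article) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPositiveLong("1230456"));              // true
        System.out.println(isPositiveLong("00000000"));             // false
        System.out.println(isPositiveLong("+5"));                   // true, despite the sign
        System.out.println(isPositiveLong("12345678901234567890")); // false, too large for a long
    }
}
```

If either case matters for the article numbers, the regex check does not have these limitations.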
Try this condition:
Integer.parseInt("0" + article.replaceAll("[^0-9]", "0")) != 0
The "0" + prefix avoids a NumberFormatException for an empty string.
You don't need to create a Pattern object. Just call the matches() method of the String class:
article.matches("\\d*[1-9]\\d*");
It's the same regex as Andy Turner suggested.
I got a problem and I think it is in comparing a char with a number.
String FindCountry = "BB";
Map<String, String> Cont = new HashMap <> ();
Cont.put("BA-BE", "Angola");
Cont.put("9X-92", "Trinidad & Tobago");
for ( String key : Cont.keySet()) {
if (key.charAt(0) == FindCountry.charAt(0) && FindCountry.charAt(1) >= key.charAt(1) && FindCountry.charAt(1) <= key.charAt(4)) {
System.out.println("Country: "+ Cont.get(key));
}
}
In this case the code prints "Angola", but if
String FindCountry = "9Z";
it doesn't print anything. I think the problem is that the comparison does not treat '2' as greater than 'Z'. This example has only two Cont.put() calls, but my file has many more, and a lot of the keys contain digits as well as letters. Those are the ones I have problems with.
What is the smartest and best way to compare char with a number ? Actually, if I set a rule like "1" is greater than "Z" it will be okay because I need this way of greater: A-Z-9-0.
Thanks!
You can use a lookup "table", I used a String:
private static final String LOOKUP = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
And then compare the chars with indexOf(). It seems messy and could probably be done more simply, but I can't come up with anything easier at the moment:
String FindCountry = "9Z";
Map<String, String> Cont = new HashMap<>();
Cont.put("BA-BE", "Angola");
Cont.put("9X-92", "Trinidad & Tobago");
for (String key : Cont.keySet()) {
if (LOOKUP.indexOf(key.charAt(0)) == LOOKUP.indexOf(FindCountry.charAt(0)) &&
LOOKUP.indexOf(FindCountry.charAt(1)) >= LOOKUP.indexOf(key.charAt(1)) &&
LOOKUP.indexOf(FindCountry.charAt(1)) <= LOOKUP.indexOf(key.charAt(4))) {
System.out.println("Country: " + Cont.get(key));
}
}
If you only use the characters A-Z and 0-9, you could add a conversion method in between which will increase the values of the 0-9 characters so they'll be after A-Z:
int applyCharOrder(char c){
// If the character is a digit:
if(c < 58){
// Add 43 to put it after the 'Z' in terms of decimal unicode value:
return c + 43;
}
// If it's an uppercase letter instead: simply return it as is
return c;
}
Which can be used like this:
if(applyCharOrder(key.charAt(0)) == applyCharOrder(findCountry.charAt(0))
&& applyCharOrder(findCountry.charAt(1)) >= applyCharOrder(key.charAt(1))
&& applyCharOrder(findCountry.charAt(1)) <= applyCharOrder(key.charAt(4))){
System.out.println("Country: "+ cont.get(key));
}
Note: Here is a table with the decimal unicode values. Characters '0'-'9' will have the values 48-57 and 'A'-'Z' will have the values 65-90. So the < 58 is used to check if it's a digit-character, and the + 43 will increase the 48-57 to 91-100, putting their values above the 'A'-'Z' so your <= and >= checks will work as you'd want them to.
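Put together, the shifted ordering can be wrapped in a small range check. This is a sketch under the assumption that every key has the form "XA-XB" (two characters, a dash, two characters) and that only 'A'-'Z' and '0'-'9' occur; CallsignOrder and inRange are names invented here.

```java
class CallsignOrder {
    // Maps '0'-'9' (48-57) to 91-100 so that digits sort after 'A'-'Z' (65-90).
    static int applyCharOrder(char c) {
        return c < 58 ? c + 43 : c;
    }

    // True if code shares the key's first character and its second character
    // lies within the key's range under the shifted ordering.
    static boolean inRange(String code, String key) {
        int second = applyCharOrder(code.charAt(1));
        return key.charAt(0) == code.charAt(0)
                && second >= applyCharOrder(key.charAt(1))
                && second <= applyCharOrder(key.charAt(4));
    }

    public static void main(String[] args) {
        System.out.println(inRange("BB", "BA-BE")); // true
        System.out.println(inRange("9Z", "9X-92")); // true
        System.out.println(inRange("9W", "9X-92")); // false
    }
}
```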
Alternatively, you could create a look-up String and use its index for the order:
int applyCharOrder(char c){
return "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".indexOf(c);
}
PS: As mentioned in the first comment by @Stultuske, variable names are usually written in camelCase, so they don't start with an uppercase letter.
As the others stated in the comments, such mathematical comparison operations on characters are based on the actual ASCII values of each char. So I'd suggest you refactor your logic using the ASCII table as reference.
I want to find out whether my string contains every digit from 0 to 9. I am currently using the following logic:
if (str.contains("0") && str.contains("1") && str.contains("2") && str.contains("3") && str.contains("4") && str.contains("5") && str.contains("6") && str.contains("7") && str.contains("8") && str.contains("9"))
{
return true;
}
I believe this will not be very optimized if string is too big. How can I use a pattern and find using String.matches whether it has all numbers or not through regex ?
This is not a duplicate of most other regex questions in the forum wherein 'OR' related char patterns are discussed, here we're talking about 'AND'. I need whether a string contains each of the given characters (i.e. digits) or not. Hope it clarifies.
Thanks,
Rajiv
I would not recommend a regex for this task, as it won't look elegant. It would look like this:
str.matches("(?s)(?=[^1]*1)(?=[^2]*2)(?=[^3]*3)(?=[^4]*4)(?=[^5]*5)(?=[^6]*6)(?=[^7]*7)(?=[^8]*8)(?=[^9]*9)(?=[^0]*0).*")
Instead, in Java 8, you can use
boolean result = s.chars().filter(i -> i >= '0' && i <= '9').distinct().count() == 10;
It filters all the string characters (s.chars()) that are digits (.filter(i -> i >= '0' && i <= '9')), only keeps unique occurrences (with .distinct()), and then checks their count with .count(). If the count is equal to 10, there are all ten ASCII digits.
So, the following code:
String s = "1-234-56s78===90";
System.out.println(s.chars().filter(i -> i >= '0' && i <= '9').distinct().count() == 10);
prints true.
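If the string may be very long, a plain loop that stops as soon as all ten digits have been seen is another option. This is a different technique from the stream above (a bitmask with early exit), shown only as a sketch; AllDigits and containsAllDigits are hypothetical names.

```java
class AllDigits {
    // All ten low bits set: one bit per digit 0-9.
    private static final int ALL_TEN = 0x3FF;

    static boolean containsAllDigits(String s) {
        int seen = 0; // bit d is set once the digit d has been seen
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                seen |= 1 << (c - '0');
                if (seen == ALL_TEN) {
                    return true; // stop early, the rest of the string is irrelevant
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAllDigits("1-234-56s78===90")); // true
        System.out.println(containsAllDigits("123"));              // false
    }
}
```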
I made a method to remove some punctuation from a String, but it's just returning the word passed in the parameter with the punctuation included. Can anyone spot what's wrong?
public static String removePunctuation(String word) {
if (word.charAt(0) >= 32 && word.charAt(0) <= 46) {
if(word.length() == 1){
return "";
} else {
return removePunctuation(word.substring(1));
}
} else {
if(word.length() == 1){
return "" + word.charAt(0);
} else {
return word.charAt(0) + removePunctuation(word.substring(1));
}
}
}//end method
I ran the code you provided with the input:
h.ello and got the output hello
I am seeing a fully functional method here. I presume the punctuation you are trying to remove is not part of the ASCII range you provided in the if statement. Check your ASCII values against a chart.
Without including the proper values the input:
h[ello will return the output h[ello because the [ is ASCII value 91, which is outside the range you provided:
>= 32 && <= 46
There is nothing wrong with your algorithm. Most likely your range (32-46) doesn't include all the punctuation you're trying to remove. For example, ? is 63, so it will not get removed.
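If the goal is to strip all ASCII punctuation rather than a hand-picked numeric range, one alternative (not the recursive method from the question) is the predefined \p{Punct} character class of java.util.regex. A minimal sketch, with PunctuationStripper as a made-up class name:

```java
class PunctuationStripper {
    // \p{Punct} matches the ASCII punctuation characters, including [ and ?.
    static String removePunctuation(String word) {
        return word.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        System.out.println(removePunctuation("h.ello")); // hello
        System.out.println(removePunctuation("h[ello")); // hello
        System.out.println(removePunctuation("what?"));  // what
    }
}
```

Unlike the 32-46 range, this does not remove spaces (ASCII 32), so that character would need separate handling if it should go as well.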
Specification for a syllable:
Each group of adjacent vowels (a, e, i, o, u, y) counts as one syllable (for example, the "ea" in "real" contributes one syllable, but the "e...a" in "regal" counts as two syllables). However, an "e" at the end of a word doesn't count as a syllable. Also each word has at least one syllable, even if the previous rules give a count of zero.
My countSyllables method:
public int countSyllables(String word) {
int count = 0;
word = word.toLowerCase();
for (int i = 0; i < word.length(); i++) {
if (word.charAt(i) == '\"' || word.charAt(i) == '\'' || word.charAt(i) == '-' || word.charAt(i) == ',' || word.charAt(i) == ')' || word.charAt(i) == '(') {
word = word.substring(0,i)+word.substring(i+1, word.length());
}
}
boolean isPrevVowel = false;
for (int j = 0; j < word.length(); j++) {
if (word.contains("a") || word.contains("e") || word.contains("i") || word.contains("o") || word.contains("u")) {
if (isVowel(word.charAt(j)) && !((word.charAt(j) == 'e') && (j == word.length()-1))) {
if (isPrevVowel == false) {
count++;
isPrevVowel = true;
}
} else {
isPrevVowel = false;
}
} else {
count++;
break;
}
}
return count;
}
The isVowel method which determines if a letter is a vowel:
public boolean isVowel(char c) {
if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u') {
return true;
} else {
return false;
}
}
According to a colleague, this should result in 528 syllables when used on this text, but I can't seem to get it to equal that and I don't know which of us is correct. Please help me develop my method into the correct algorithm or help show this is correct. Thank you.
One of the problems might be that you call the toLowerCase() method on the input but do not assign the result.
So if you change
word.toLowerCase();
to
word = word.toLowerCase();
that will help for sure.
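The reason is that Java strings are immutable: toLowerCase() returns a new String and leaves the original untouched. A tiny sketch (LowerCaseDemo and its two methods are illustrative names only):

```java
class LowerCaseDemo {
    static String ignoredResult(String word) {
        word.toLowerCase(); // the lowercased copy is discarded, word is unchanged
        return word;
    }

    static String assignedResult(String word) {
        word = word.toLowerCase(); // word now refers to the lowercased copy
        return word;
    }

    public static void main(String[] args) {
        System.out.println(ignoredResult("Regal"));  // Regal
        System.out.println(assignedResult("Regal")); // regal
    }
}
```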
I have just invented a new way to count syllables in Java.
My new library, The Lawrence Style Checker, can be viewed here: https://github.com/troywatson/Lawrence-Style-Checker
I counted your syllables for every word using my program and displayed the results here: http://pastebin.com/LyiBTcbb
With my dictionary method of counting syllables I got: 528 syllables total.
This is exactly the number the questioner gave as the correct syllable count. Yet I still dispute this number for reasons described below:
Strike rate: 99.4% correct
Words wrong: 2 / 337 words
Words wrong and wrong syllable counts: {resinous: 4, aardwolf: 3}
Here is my code:
Lawrence lawrence = new Lawrence();
// Turn the text into an array of sentences.
String sentences = "";
String[] sentences2 = sentences.split("(?<=[a-z])\\.\\s+");
int count = 0;
for (String sentence : sentences2) {
sentence = sentence.replace("-", " "); // split double words
for (String word : sentence.split(" ")) {
// Get rid of punctuation marks and spaces.
word = lawrence.cleanWord(word);
// If the word is empty, skip it.
if (word.length() < 1)
continue;
// Print out the word and its syllable count on one line.
System.out.print(word + ",");
System.out.println(lawrence.getSyllable(word));
count += lawrence.getSyllable(word);
}
}
System.out.println(count);
bam!
This should be easily doable with some Regex:
Pattern p = Pattern.compile("[aeiouy]+?\\w*?[^e]");
String[] result = p.split(WHAT_EVER_THE_INPUT_IS);
result.length
Please note, that it is untested.
Not a direct answer (I would give you one if I thought it was constructive; my count was about 538 on the last try), but I will give you a few hints that are fundamental to creating the answer:
Divide up your problem: Read lines, then split the lines up into words, then count the syllables for each word. Afterwards, count them up for all the lines.
Think about the order of things: first find all the syllables, and count each one by "walking" through the word. Factor in the special cases afterwards.
During design, use a debugger to step through your code. Chances are pretty high that you will make common mistakes, like the one with the toUpperCase() method. Better to find those errors early; nobody writes perfect code the first time around.
Print to console (advanced users use a log and keep the silenced log lines in the final program). Make sure to mark the println's using comments and remove them from the final implementation. Print things like line numbers and syllable counts so you can visually compare them with the text.
If you have advanced a bit, you may use Matcher.find (regular expressions) with a Pattern to find the syllables. Regular expressions are difficult beasts to master. One common mistake is to have them do too much in one go.
This way you can quickly scan the text. One of the things you quickly will find out is that you will have to deal with the numbers in the text. So you need to check if a word is actually a word, otherwise, by your rules, it will have at least a single syllable.
If you have the feeling you are repeating things, like the isVowel and String.contains() methods using the same set of characters, you are probably doing something wrong. Repetition in source code is code smell.
Using regexps, I counted about 538 (on the 4th go), but I haven't really checked each and every syllable (of course).
1 14
2 17
3 17
4 15
5 15
6 14
7 16
8 19
9 17
10 17
11 16
12 19
13 18
14 15
15 18
16 15
17 16
18 17
19 16
20 17
21 17
22 19
23 17
24 16
25 17
26 17
27 16
28 17
29 15
30 17
31 19
32 23
33 0
--- total ---
538
I would strongly suggest that you use Java's String API to its full ability. For example, consider String.split(String regex):
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29
This takes a String and a regular expression, then returns an array of all the substrings, using your regular expression as a delimiter. If you make your regular expression match all consonants or whitespace, then you will end up with an array of Strings which are either empty (and therefore do not represent a syllable) or a sequence of vowels (which do represent a syllable). Count up the latter, and you will have a solution.
Another alternative which also takes advantage of the String API and regular expressions is replaceAll:
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29
In this case, you want a regular expression which takes the form [optional anything which isn't a vowel][one or more vowels][optional anything which isn't a vowel]. Run this regular expression on your String, and replace it with a single character (eg "1"). The end result is that each syllable will be replaced by a single character. Then all you need to do is String.length() and you'll know how many syllables you had.
Depending on the requirements of your solution, these may not work. If this is a homework question relating to algorithm design, this is almost certainly not the preferred answer, but it does have the benefit of being concise and makes good use of the built-in (and therefore highly optimized) Java APIs.
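As a rough illustration of the split idea, here is a sketch for a single word. It assumes the input is one non-empty word made of letters, treats a, e, i, o, u and y as vowels as in the specification, and handles the final "e" and the minimum of one syllable separately. SplitSyllables is a hypothetical class name.

```java
class SplitSyllables {
    static int countSyllables(String word) {
        String w = word.toLowerCase();
        // Rule: an "e" at the end of a word does not count as a syllable.
        if (w.endsWith("e")) {
            w = w.substring(0, w.length() - 1);
        }
        int count = 0;
        // Splitting on runs of non-vowels leaves the vowel groups,
        // possibly preceded by an empty string that must be skipped.
        for (String piece : w.split("[^aeiouy]+")) {
            if (!piece.isEmpty()) {
                count++;
            }
        }
        // Rule: every word has at least one syllable.
        return Math.max(count, 1);
    }

    public static void main(String[] args) {
        System.out.println(countSyllables("real"));  // 1
        System.out.println(countSyllables("regal")); // 2
        System.out.println(countSyllables("the"));   // 1
    }
}
```

Whether this reproduces the expected total for a whole text still depends on how punctuation and numbers are removed before each word reaches the method.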
private static int countSyllables(String word)
{
//System.out.print("Counting syllables in " + word + "...");
int numSyllables = 0;
boolean newSyllable = true;
String vowels = "aeiouy";
char[] cArray = word.toCharArray();
for (int i = 0; i < cArray.length; i++)
{
if (i == cArray.length-1 && Character.toLowerCase(cArray[i]) == 'e'
&& newSyllable && numSyllables > 0) {
numSyllables--;
}
if (newSyllable && vowels.indexOf(Character.toLowerCase(cArray[i])) >= 0) {
newSyllable = false;
numSyllables++;
}
else if (vowels.indexOf(Character.toLowerCase(cArray[i])) < 0) {
newSyllable = true;
}
}
//System.out.println( "found " + numSyllables);
return numSyllables;
}
Another implementation can be found at below pastebin link:
https://pastebin.com/q6rdyaEd
This is my implementation for counting syllables
protected int countSyllables(String word)
{
// getNumSyllables method in BasicDocument (module 1) and
// EfficientDocument (module 2).
int syllables = 0;
word = word.toLowerCase();
if(word.contains("the ")){
syllables ++;
}
String[] split = word.split("e!$|e[?]$|e,|e |e[),]|e$");
ArrayList<String> tokens = new ArrayList<String>();
Pattern tokSplitter = Pattern.compile("[aeiouy]+");
for (int i = 0; i < split.length; i++) {
String s = split[i];
Matcher m = tokSplitter.matcher(s);
while (m.find()) {
tokens.add(m.group());
}
}
syllables += tokens.size();
return syllables;
}
It works fine for me.