This question already has answers here:
String replace method is not replacing characters
(5 answers)
Closed 2 years ago.
I'm struggling to understand what's wrong with my code for this Leetcode problem.
Problem: Given a string, determine if it is a palindrome, considering only alphanumeric characters and ignoring cases.
Right now, I am passing 108/476 cases, and I am failing this test: "A man, a plan, a canal: Panama".
Here is my code, please help me identify the problem!
class Solution {
public boolean isPalindrome(String s) {
if (s.isEmpty()) return true;
s.replaceAll("\\s+","");
int i = 0;
int j = s.length() - 1;
while (i <= j) {
if (Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j))) {
return false;
}
i++;
j--;
}
return true;
}
}
Your replaceAll method is incorrect
Your replaceAll call currently only removes whitespace, and its result is discarded because Java strings are immutable. It should remove every character you want to ignore, and the result must be assigned back to s. If you take the regex route as you do, this is one reasonable pattern for keeping only ASCII letters:
s = s.replaceAll("[^a-zA-Z]+","");
You could be tempted to use \W (or [^\w]) instead, but \w is [a-zA-Z0-9_], so \W also keeps digits and the underscore character. Since the problem statement says alphanumeric, keeping digits is probably what you want; [^a-zA-Z0-9] does that without keeping the underscore. If you only want letters, stick to [^a-zA-Z].
If you want to match all the letters, no matter the language, use the following:
s = s.replaceAll("\\P{L}", "");
Note that you could shorten drastically your code like this, although it's definitely not the fastest:
class Solution {
public boolean isPalindrome(String s) {
s = s.replaceAll("\\P{L}", "");
return new StringBuilder(s).reverse().toString().equalsIgnoreCase(s);
}
}
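Your regex only matches whitespace, and the result of replaceAll is never assigned back to s. Try this:
s = s.replaceAll("[\\W]+", "");
\W matches any character that is not a word character, that is, anything other than letters, digits, and the underscore.
With s.replaceAll("\\s+",""); you are only removing whitespace, but you also have to remove everything else that is not alphanumeric, such as punctuation (in this case the commas and the colon).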
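As an alternative to the regex answers above, here is a minimal sketch that skips non-alphanumeric characters while scanning, so no cleaned-up copy of the string is needed. The class name PalindromeSketch is made up for illustration, and Character.isLetterOrDigit is assumed to be an acceptable definition of "alphanumeric" for this problem (it also accepts non-ASCII letters and digits).

```java
class PalindromeSketch {
    // Two-pointer check that ignores case and skips anything
    // that is not a letter or digit.
    static boolean isPalindrome(String s) {
        int i = 0;
        int j = s.length() - 1;
        while (i < j) {
            if (!Character.isLetterOrDigit(s.charAt(i))) {
                i++; // skip punctuation/whitespace on the left
            } else if (!Character.isLetterOrDigit(s.charAt(j))) {
                j--; // skip punctuation/whitespace on the right
            } else {
                if (Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j))) {
                    return false;
                }
                i++;
                j--;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("A man, a plan, a canal: Panama")); // true
        System.out.println(isPalindrome("race a car"));                     // false
    }
}
```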
I was going through the answers of this question asked by someone previously and I found them to be very helpful. However, I have a question about the highlighted answer but I wasn't sure if I should ask there since it's a 6 year old thread.
My question is about this snippet of code given in the answers:
private static boolean isAWord(String token)
{
//check if the token is a word
}
How would you check that the token is a word? Would you .contains("\\s+") the string and check to see if it contains characters between them? But what about when you encounter a paragraph? I'm not sure how to go about this.
EDIT: I think I should've elaborated a bit more. Usually, you'd think a word would be something surrounded by " " but, for example, if the file contains a hyphen (which is also surrounded by a blank space), you'd want the isAWord() method to return false. How can I verify that something is actually a word and not punctuation?
Since the question wasn't entirely clear, I made two methods. The first method, consistsOfLetters, goes through the whole string and returns false if it contains any digits or symbols. This should be enough to determine whether a token is a word (if you don't care whether that word exists in a dictionary).
public static boolean consistsOfLetters(String string) {
for(int i=0; i<string.length(); i++) {
if(string.charAt(i) == '.' && (i+1) == string.length() && string.length() != 1) break; // if last char of string is ., it is still word
if((string.toLowerCase().charAt(i) < 'a' || string.toLowerCase().charAt(i) > 'z')) return false;
} // toLowerCase is used to avoid having to compare it to A and Z
return true;
}
The second method splits the original String (for example a sentence of potential words) on the " " character. When that is done, we go through every element and check whether it is a word. If an element is not a word, the method returns false and skips the rest. If everything is fine, it returns true.
public static boolean isThisAWord(String string) {
String[] array = string.split(" ");
for(int i = 0; i < array.length; i++) {
if(consistsOfLetters(array[i]) == false) return false;
}
return true;
}
Also, this might not work for English as is, since English has apostrophes in words like "don't", so a bit of further tinkering is needed.
By default, the Scanner in Java splits its input on whitespace (its internal WHITESPACE_PATTERN), so splitting a string like "He's my friend" results in the tokens ["He's", "my", "friend"].
If that is sufficient, just remove that if clause and don't use that method.
If you want to turn "He's" into "He", "is", you need a different approach.
In short: the method works as a verification check. If the given token is not supposed to be in the result, return false; otherwise return true.
return token.matches("[\\pL\\pM]+('(s|t))?");
matches requires the entire string to match.
This accepts letters (\pL) and combining diacritical marks (\pM), which attach to the preceding letter as accents.
The optional suffix handles the English apostrophe, should you want to treat doesn't and let's as one term (for instance for translation purposes).
You might also consider hyphens.
Note that Unicode has several different single-quote and dash characters, so a plain ' or - will not cover all of them.
Path path = Paths.get("..../x.txt");
Charset charset = Charset.defaultCharset();
String content = Files.readString(path, charset);
Pattern wordPattern = Pattern.compile("[\\pL\\pM]+");
Matcher m = wordPattern.matcher(content);
while (m.find()) {
String word = m.group(); ...
}
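To make the apostrophe and hyphen remarks concrete, here is a rough sketch of an isAWord that accepts runs of letters joined by a single apostrophe or hyphen. The class name WordCheckSketch and the choice of joiner characters (ASCII apostrophe, the typographic apostrophe U+2019, and ASCII hyphen) are my assumptions, not something the question specifies; adjust the set to your data.

```java
import java.util.regex.Pattern;

class WordCheckSketch {
    // One or more letter runs (letters plus combining marks), where
    // consecutive runs are joined by exactly one apostrophe or hyphen.
    private static final Pattern WORD =
            Pattern.compile("[\\pL\\pM]+(?:['\u2019-][\\pL\\pM]+)*");

    static boolean isAWord(String token) {
        // matches() requires the whole token to fit the pattern
        return WORD.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAWord("don't"));      // true
        System.out.println(isAWord("well-known")); // true
        System.out.println(isAWord("-"));          // false
        System.out.println(isAWord("hello."));     // false
    }
}
```

A lone hyphen is rejected because every joiner must have letters on both sides, which covers the case described in the question's edit.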
This question already has answers here:
Regex to replace repeated characters
(2 answers)
Closed 6 years ago.
I am trying to replace all the repeated characters from a String in Java, and let only one.
For example:
aaaaa ---> a
For that, I have tried using the replaceAll method:
"aaaaa".replaceAll("a*","a") //returns "aa"
I have developed a recursive method, which is probably not very efficient:
public String recursiveReplaceAll(String original,String regex, String replacement) {
if (original.equals(original.replaceAll(regex, replacement))) return original;
return recursiveReplaceAll(original.replaceAll(regex, replacement),regex,replacement);
}
This method works, I was just wondering if there was anything using RegEx for example, which does the work with better performance.
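Your replaceAll approach was nearly right. The problem is that * also matches zero occurrences, so after consuming "aaaaa" the pattern matches once more as an empty string at the end, and that empty match is replaced with "a" as well. You want +, which means "one or more".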
"aaaaa".replaceAll("a+","a") // Returns "a"
You can do it without recursion. The regular expression "(.)\\1+" captures any character that is followed by one or more copies of itself, and the whole run is replaced with the captured character ($1). Thus, this collapses every run of repeated characters into a single one.
public static void main(String[] args) {
String str = "aaaabbbaaa";
String result = str.replaceAll("(.)\\1+", "$1");
System.out.println(result); // prints "aba".
}
With this, it works for all characters.
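If you want to see why "a*" produced "aa", a small sketch like the following lists every match the regex engine finds. The class and method names (StarVersusPlus, matchSpans) are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StarVersusPlus {
    // Returns each match as "start-end" so that empty matches are visible.
    static List<String> matchSpans(String regex, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            spans.add(m.start() + "-" + m.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(matchSpans("a*", "aaaaa")); // [0-5, 5-5]
        System.out.println(matchSpans("a+", "aaaaa")); // [0-5]
    }
}
```

The second span for "a*" (5-5) is the empty match at the end of the input; replaceAll replaces it too, which is where the extra "a" comes from.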
This question already has answers here:
How to check whether a string contains at least one alphabet in java?
(2 answers)
Closed 8 years ago.
I am very new to programming. I want to check if a string s contains a-z characters. I use:
if(s.contains("a") || s.contains("b") || ... {
}
but is there any way for this to be done in shorter code? Thanks a lot
You can use regular expressions
// to emulate contains, [a-z] will fail on more than one character,
// so you must add .* on both sides.
if (s.matches(".*[a-z].*")) {
// Do something
}
This checks whether the string contains at least one character in a-z.
To check whether all characters are in a-z, use:
if ( ! s.matches(".*[^a-z].*") ) {
// Do something
}
For more information on regular expressions in Java, see:
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
In addition to regular expressions, and assuming you actually want to know whether the String contains anything other than letters, you can use Character.isLetter(char):
boolean hasNonLetters = false;
for (char ch : s.toCharArray()) {
if (!Character.isLetter(ch)) {
hasNonLetters = true;
break;
}
}
// hasNonLetters is true only if the String contains something that isn't a letter -
From the Javadoc for Character.isLetter(char),
A character is considered to be a letter if its general category type, provided by Character.getType(ch), is any of the following:
UPPERCASE_LETTER
LOWERCASE_LETTER
TITLECASE_LETTER
MODIFIER_LETTER
OTHER_LETTER
Use regular expressions. The Pattern.matches() method can do this, but note that it must match the entire input, so the character class has to be wrapped in .* on both sides. For example:
Pattern.matches(".*[a-z].*", "TESTING STRING a");
If you need to check a large number of strings, compile the pattern once and reuse it to improve performance.
Try this
Pattern p = Pattern.compile("[a-z]");
if (p.matcher(stringToMatch).find()) {
//...
}
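For completeness, the same "contains at least one a-z character" check can be written without any regex. This is only a sketch; the class and method names are made up, and it deliberately tests the ASCII range a-z rather than Character.isLetter.

```java
class ContainsLetterSketch {
    // True if at least one char of s lies in the ASCII range 'a'..'z'.
    static boolean containsLowercaseAscii(String s) {
        return s.chars().anyMatch(c -> c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        System.out.println(containsLowercaseAscii("TESTING STRING a")); // true
        System.out.println(containsLowercaseAscii("TESTING STRING"));   // false
    }
}
```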
In Java, the String class has a method called matches. How can I use this method with a regular expression to check whether my string contains only digits? I tried the examples below, but both of them returned false.
String regex = "[0-9]";
String data = "23343453";
System.out.println(data.matches(regex));
String regex = "^[0-9]";
String data = "23343453";
System.out.println(data.matches(regex));
Try
String regex = "[0-9]+";
or
String regex = "\\d+";
As per Java regular expressions, the + means "one or more times" and \d means "a digit".
Note: the "double backslash" is an escape sequence to get a single backslash - therefore, \\d in a java String gives you the actual result: \d
References:
Java Regular Expressions
Java Character Escape Sequences
Edit: due to some confusion in other answers, I am writing a test case and will explain some more things in detail.
Firstly, if you are in doubt about the correctness of this solution (or others), please run this test case:
String regex = "\\d+";
// positive test cases, should all be "true"
System.out.println("1".matches(regex));
System.out.println("12345".matches(regex));
System.out.println("123456789".matches(regex));
// negative test cases, should all be "false"
System.out.println("".matches(regex));
System.out.println("foo".matches(regex));
System.out.println("aa123bb".matches(regex));
Question 1:
Isn't it necessary to add ^ and $ to the regex, so it won't match "aa123bb" ?
No. In java, the matches method (which was specified in the question) matches a complete string, not fragments. In other words, it is not necessary to use ^\\d+$ (even though it is also correct). Please see the last negative test case.
Please note that if you use an online "regex checker" then this may behave differently. To match fragments of a string in Java, you can use the find method instead, described in detail here:
Difference between matches() and find() in Java Regex
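A short sketch of that difference, using the same "aa123bb" input as the test case above. The helper names wholeMatch and partialMatch are invented for illustration.

```java
import java.util.regex.Pattern;

class MatchesVersusFind {
    // matches(): the regex must cover the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // find(): the regex may match any fragment of the input.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d+", "aa123bb"));       // false
        System.out.println(partialMatch("\\d+", "aa123bb"));     // true
        System.out.println(partialMatch("^\\d+$", "aa123bb"));   // false
    }
}
```

With find(), the anchors ^ and $ are what restore whole-string behaviour, which is why they matter in tools that search rather than match.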
Question 2:
Won't this regex also match the empty string, ""?
No. A regex \\d* would match the empty string, but \\d+ does not. The star * means zero or more, whereas the plus + means one or more. Please see the first negative test case.
Question 3:
Isn't it faster to compile a regex Pattern?
Yes. It is indeed faster to compile a regex Pattern once, rather than on every invocation of matches, and so if performance implications are important then a Pattern can be compiled and used like this:
Pattern pattern = Pattern.compile(regex);
System.out.println(pattern.matcher("1").matches());
System.out.println(pattern.matcher("12345").matches());
System.out.println(pattern.matcher("123456789").matches());
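You can also use NumberUtils.isCreatable(String str) (formerly NumberUtils.isNumber) from Apache Commons Lang. Note that it accepts more than plain digit strings, for example signs, decimals, and hex notation.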
Using regular expressions is costly in terms of performance. Trying to parse the string as a long value is inefficient and unreliable, and may not be what you need.
What I suggest is to simply check whether each character is a digit, which can be done efficiently using Java 8 streams and a lambda expression:
boolean isNumeric = someString.chars().allMatch(x -> Character.isDigit(x));
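Two caveats about that one-liner, shown in a sketch below: allMatch returns true for an empty string, and Character.isDigit accepts digits from other scripts, whereas [0-9] (and, by default, \d in Java) only accepts ASCII digits. The class and method names here are hypothetical.

```java
class DigitCheckSketch {
    // Accepts any Unicode decimal digit; rejects the empty string explicitly.
    static boolean allUnicodeDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }

    // Accepts only the ASCII digits '0'..'9'.
    static boolean allAsciiDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
    }

    public static void main(String[] args) {
        String arabicIndic = "\u0661\u0662\u0663"; // Arabic-Indic digits 1, 2, 3
        System.out.println(allUnicodeDigits(arabicIndic)); // true
        System.out.println(allAsciiDigits(arabicIndic));   // false
        System.out.println(allAsciiDigits("23343453"));    // true
    }
}
```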
One more solution, that hasn't been posted, yet:
String regex = "\\p{Digit}+"; // uses POSIX character class
You must allow for more than a digit (the + sign) as in:
String regex = "[0-9]+";
String data = "23343453";
System.out.println(data.matches(regex));
Long.parseLong(data)
Call this and catch the NumberFormatException; it also handles a leading minus sign.
Although the number of digits is limited to what fits in a long, this gives you the parsed value as a variable you can use, which is, I would imagine, the most common use case.
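A minimal sketch of that parse-and-catch approach. Wrapping the result in OptionalLong is my own choice for the illustration, as are the class and method names.

```java
import java.util.OptionalLong;

class ParseLongSketch {
    // Returns the parsed value, or an empty OptionalLong if s is not
    // a valid long (non-digits, out of range, empty, or null).
    static OptionalLong tryParseLong(String s) {
        try {
            return OptionalLong.of(Long.parseLong(s));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseLong("23343453").isPresent());             // true
        System.out.println(tryParseLong("-42").getAsLong());                  // -42
        System.out.println(tryParseLong("12a").isPresent());                  // false
        System.out.println(tryParseLong("99999999999999999999").isPresent()); // false, out of range
    }
}
```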
We can use either Pattern.compile("[0-9]+(\\.[0-9]+)?") or Pattern.compile("\\d+(\\.\\d+)?"). They have the same meaning.
The pattern [0-9] means a digit, the same as \d.
+ means one or more times.
(\\.[0-9]+)? is an optional fractional part. The dot must be escaped, because an unescaped . matches any character (so "1a2" would be accepted and "12" rejected).
Try the following code:
import java.util.regex.Pattern;
public class PatternSample {
public boolean containNumbersOnly(String source){
boolean result = false;
Pattern pattern = Pattern.compile("[0-9]+(\\.[0-9]+)?"); // digits with an optional fractional part
pattern = Pattern.compile("\\d+(\\.\\d+)?"); // equivalent pattern written with \d
result = pattern.matcher(source).matches();
if(result){
System.out.println("\"" + source + "\"" + " is a number");
}else
System.out.println("\"" + source + "\"" + " is a String");
return result;
}
public static void main(String[] args){
PatternSample obj = new PatternSample();
obj.containNumbersOnly("123456.a");
obj.containNumbersOnly("123456 ");
obj.containNumbersOnly("123456");
obj.containNumbersOnly("0123456.0");
obj.containNumbersOnly("0123456a.0");
}
}
Output:
"123456.a" is a String
"123456 " is a String
"123456" is a number
"0123456.0" is a number
"0123456a.0" is a String
According to Oracle's Java Documentation:
private static final Pattern NUMBER_PATTERN = Pattern.compile(
"[\\x00-\\x20]*[+-]?(NaN|Infinity|((((\\p{Digit}+)(\\.)?((\\p{Digit}+)?)" +
"([eE][+-]?(\\p{Digit}+))?)|(\\.((\\p{Digit}+))([eE][+-]?(\\p{Digit}+))?)|" +
"(((0[xX](\\p{XDigit}+)(\\.)?)|(0[xX](\\p{XDigit}+)?(\\.)(\\p{XDigit}+)))" +
"[pP][+-]?(\\p{Digit}+)))[fFdD]?))[\\x00-\\x20]*");
boolean isNumber(String s){
return NUMBER_PATTERN.matcher(s).matches();
}
Refer to org.apache.commons.lang3.StringUtils
public static boolean isNumeric(CharSequence cs) {
if (cs == null || cs.length() == 0) {
return false;
} else {
int sz = cs.length();
for(int i = 0; i < sz; ++i) {
if (!Character.isDigit(cs.charAt(i))) {
return false;
}
}
return true;
}
}
In Java, the String class has a method called matches(). With it you can check whether a string matches a regular expression.
String regex = "^[\\d]{4}$";
String value = "1234";
System.out.println(value.matches(regex));
The explanation for the above regular expression is:
^ - Anchors the match at the start of the input.
[] - A character class; list the allowed characters inside it.
\\d - Only allows digits. Inside the brackets you can use \\d or 0-9; both are the same.
{4} - Requires exactly 4 digits. You can change the number according to your need.
$ - Anchors the match at the end of the input.
Note: You can remove the {4} and specify + which means one or more times, or * which means zero or more times, or ? which means once or none.
For more reference please go through this website: https://www.rexegg.com/regex-quickstart.html
Official regex way
I would use this regex for integers:
^-?\d+$
This also works in other programming languages, because the anchors make the whole-string requirement explicit instead of relying on how a particular API applies the pattern.
Also works in Java
\\d+
Questions regarding ^ and $
As @vikingsteve has pointed out, in Java the matches method matches a complete string, not parts of a string. In other words, it is unnecessary to use ^\d+$ (even though the anchored form is the portable way to write it).
Online regex checkers usually search for a match anywhere in the input, so they behave differently from Java's matches.
Try this part of code:
void containsOnlyNumbers(String str)
{
try {
Integer num = Integer.valueOf(str);
System.out.println("is a number");
} catch (NumberFormatException e) {
// TODO: handle exception
System.out.println("is not a number");
}
}
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
JAVA: check a string if there is a special character in it
I'm trying to create a method to check if a password starts or ends with a special character. There were a few other checks that I have managed to code, but this seems a bit more complicated.
I think I need to use regex to do this efficiently. I have already created a method that checks if there are any special characters, but I can't figure out how to modify it.
Pattern p = Pattern.compile("\\p{Punct}");
Matcher m = p.matcher(password);
boolean a = m.find();
if (!a)
System.out.println("Password must contain at least one special character!");
According to the book I'm reading I need to use ^ and $ in the pattern to check if it starts or ends with a special character. Can I just add both statements to the existing pattern or how should I start solving this?
EDIT:
Alright, I think I got the non-regex method working:
for (int i = 0; i < password.length(); i++) {
if (SPECIAL_CHARACTERS.indexOf(password.charAt(i)) >= 0) // >= 0, since index 0 is a valid hit
specialCharSum++;
}
Can't you just use charAt to get the character and indexOf to check for whether or not the character is special?
final String SPECIAL_CHARACTERS = "?#"; // And others
if (SPECIAL_CHARACTERS.indexOf(password.charAt(0)) >= 0
|| SPECIAL_CHARACTERS.indexOf(password.charAt(password.length() - 1)) >= 0) {
System.out.println("password begins or ends with a special character");
}
I haven't profiled (profiling is the golden rule for performance), but I would expect iterating through a compile-time constant string to be faster than building and executing a finite-state automaton for a regular expression. Furthermore, Java's regular expressions are more complex than FSAs, so I would expect that Java regular expressions are implemented differently and are thus slower than FSAs.
The simplest approach would be an or with grouping.
Pattern p = Pattern.compile("(^\\p{Punct})|(\\p{Punct}$)");
Matcher m = p.matcher(password);
boolean a = m.find();
if (!a)
System.out.println("Password must contain at least one special character at the beginning or end!");
Use this pattern:
"^\\p{Punct}|\\p{Punct}$"
^\\p{Punct} = "start of string, followed by a punctuation character"
| = "or"
\\p{Punct}$ = "punctuation character, followed by end of string"
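Here is how that pattern might be wired into a small helper, as a sketch only. The class and method names are made up, and the pattern is compiled once as a constant so it is not rebuilt on every call.

```java
import java.util.regex.Pattern;

class PasswordEdgeCheck {
    // Punctuation at the very start, or punctuation at the very end.
    private static final Pattern EDGE_PUNCT =
            Pattern.compile("^\\p{Punct}|\\p{Punct}$");

    static boolean startsOrEndsWithPunct(String password) {
        // find() is needed here: matches() would require the whole
        // password to be a single punctuation character.
        return EDGE_PUNCT.matcher(password).find();
    }

    public static void main(String[] args) {
        System.out.println(startsOrEndsWithPunct("!secret")); // true
        System.out.println(startsOrEndsWithPunct("secret?")); // true
        System.out.println(startsOrEndsWithPunct("sec!ret")); // false
    }
}
```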