Recursive replaceAll java [duplicate] - java

This question already has answers here:
Regex to replace repeated characters
(2 answers)
Closed 6 years ago.
I am trying to replace all the repeated characters in a String in Java, leaving only one.
For example:
aaaaa ---> a
For that, I have tried using the replaceAll method:
"aaaaa".replaceAll("a*","a") //returns "aa"
I have developed a recursive method, which is probably not very efficient:
public String recursiveReplaceAll(String original,String regex, String replacement) {
if (original.equals(original.replaceAll(regex, replacement))) return original;
return recursiveReplaceAll(original.replaceAll(regex, replacement),regex,replacement);
}
This method works. I was just wondering whether there is something, using regex for example, that does the job with better performance.

Your replaceAll approach was nearly right. The problem is that * means "zero or more", so a* also matches the empty string at the end of the input, which produces the second a. You want +, which means "one or more".
"aaaaa".replaceAll("a+","a") // Returns "a"

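To see the difference between the two quantifiers side by side, here is a small sketch. The class and method names (QuantifierDemo, collapse) are made up for illustration.

```java
class QuantifierDemo {
    // Replaces every match of the given regex with a single "a".
    static String collapse(String input, String regex) {
        return input.replaceAll(regex, "a");
    }

    public static void main(String[] args) {
        // a* first matches "aaaaa", then matches the empty string at the
        // end of the input, so two replacements are made.
        System.out.println(collapse("aaaaa", "a*")); // prints "aa"

        // a+ needs at least one 'a', so only the run itself is replaced.
        System.out.println(collapse("aaaaa", "a+")); // prints "a"
    }
}
```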
You can do it without recursion. The regular expression "(.)\\1+" captures any character that is followed by one or more copies of itself, and the whole run is replaced with the captured character. This collapses every run of repeated characters.
public static void main(String[] args) {
String str = "aaaabbbaaa";
String result = str.replaceAll("(.)\\1+", "$1");
System.out.println(result); // prints "aba".
}
With this, it works for all characters.
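Since the question asks about performance, here is a rough sketch of the same idea without any regex, using a plain loop and a StringBuilder. This is a different technique from the answer above, not a drop-in equivalent: it compares char values one by one, and unlike "." it also collapses repeated line terminators. The names RunCollapser and collapseRuns are hypothetical.

```java
class RunCollapser {
    // Keeps only the first character of every run of identical characters.
    static String collapseRuns(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Append the character only when it starts a new run.
            if (i == 0 || c != input.charAt(i - 1)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapseRuns("aaaabbbaaa")); // prints "aba"
    }
}
```

Whether this is actually faster for your inputs is something you would have to measure.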

Related

Leetcode Valid Palindrome Question Problem Debugging [duplicate]

This question already has answers here:
String replace method is not replacing characters
(5 answers)
Closed 2 years ago.
I'm struggling to understand what's wrong with my code for this Leetcode problem.
Problem: Given a string, determine if it is a palindrome, considering only alphanumeric characters and ignoring cases.
Right now, I am passing 108/476 cases, and I am failing this test: "A man, a plan, a canal: Panama".
Here is my code, please help me identify the problem!
class Solution {
public boolean isPalindrome(String s) {
if (s.isEmpty()) return true;
s.replaceAll("\\s+","");
int i = 0;
int j = s.length() - 1;
while (i <= j) {
if (Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j))) {
return false;
}
i++;
j--;
}
return true;
}
}
Your replaceAll method is incorrect
Your replaceAll call only targets whitespace, and its result is discarded because it is never assigned back to s (strings are immutable). It should remove every character that is not a letter. If you stay with the regex approach, this is one of the better patterns to use:
s = s.replaceAll("[^a-zA-Z]+","");
You could be tempted to use \W (or [^\w]) instead, but \w is equivalent to [a-zA-Z0-9_], so \W would keep digits and the underscore character as well as letters. If that is what you want, use \W. If not, stick to [^a-zA-Z].
If you want to match all the letters, no matter the language, use the following:
s = s.replaceAll("\\P{L}", "");
Note that you could drastically shorten your code like this, although it's definitely not the fastest:
class Solution {
public boolean isPalindrome(String s) {
s = s.replaceAll("\\P{L}", "");
return new StringBuilder(s).reverse().toString().equalsIgnoreCase(s);
}
}
Your regex only matches whitespace. Try this:
s = s.replaceAll("[\\W]+", "");
\W matches any character that is not a word character, that is, anything other than a letter, a digit, or the underscore.
With s.replaceAll("\\s+",""); you are only targeting the spaces, but you also have to remove everything else that is not alphanumeric, such as punctuation (in this case , and :).
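As an alternative to stripping characters first, here is a sketch that keeps the two-pointer loop from the question and simply skips anything that is not a letter or digit. It assumes Character.isLetterOrDigit is an acceptable definition of "alphanumeric" for this problem; the class name PalindromeSketch is made up.

```java
class PalindromeSketch {
    // Two-pointer check that ignores case and skips non-alphanumeric characters.
    static boolean isPalindrome(String s) {
        int i = 0;
        int j = s.length() - 1;
        while (i < j) {
            if (!Character.isLetterOrDigit(s.charAt(i))) {
                i++; // skip punctuation or whitespace on the left
                continue;
            }
            if (!Character.isLetterOrDigit(s.charAt(j))) {
                j--; // skip punctuation or whitespace on the right
                continue;
            }
            if (Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j))) {
                return false;
            }
            i++;
            j--;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("A man, a plan, a canal: Panama")); // prints "true"
        System.out.println(isPalindrome("race a car")); // prints "false"
    }
}
```

This avoids building a second string, at the cost of a slightly longer loop.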

replaceAll not working as expected when escaping special characters [duplicate]

This question already has answers here:
Difference between String replace() and replaceAll()
(13 answers)
Closed 4 years ago.
class Ideone
{
public static void main (String[] args) throws java.lang.Exception
{
String searchKeyword="Legal'%_";
String specialChars[]={"_","%","'"};
for(int i=0;i<specialChars.length;i++)
searchKeyword=searchKeyword.replaceAll(specialChars[i],"\\"+specialChars[i]);
System.out.println(searchKeyword);
}
}
This snippet tries to escape some special characters, but searchKeyword does not end up holding the replaced string.
The output should be Legal\'\%\_, but I get the original string back unchanged.
Please help me with this.
replaceAll(String regex, String replacement) works with a regex:
Replaces each substring of this string that matches the given regular expression with the given replacement.
To replace a literal substring of the input String with a specific String, what you need is replace(CharSequence target, CharSequence replacement):
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
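The reason the original loop prints the string unchanged is that replaceAll treats a backslash in the replacement as an escape for the next character, so "\\_" is read as a literal _. A sketch of both ways around that follows; the class and method names are made up, and the second variant uses Pattern.quote and Matcher.quoteReplacement to make both arguments literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeSketch {
    private static final String[] SPECIAL_CHARS = {"_", "%", "'"};

    // Literal replacement: neither argument is interpreted as a regex.
    static String escapeWithReplace(String keyword) {
        for (String ch : SPECIAL_CHARS) {
            keyword = keyword.replace(ch, "\\" + ch);
        }
        return keyword;
    }

    // Regex replacement with both the pattern and the replacement quoted.
    static String escapeWithReplaceAll(String keyword) {
        for (String ch : SPECIAL_CHARS) {
            keyword = keyword.replaceAll(Pattern.quote(ch),
                    Matcher.quoteReplacement("\\" + ch));
        }
        return keyword;
    }

    public static void main(String[] args) {
        System.out.println(escapeWithReplace("Legal'%_"));    // prints Legal\'\%\_
        System.out.println(escapeWithReplaceAll("Legal'%_")); // prints Legal\'\%\_
    }
}
```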

Empty Strings within a non empty String [duplicate]

This question already has answers here:
Replace with empty string replaces newChar around all the characters in original string
(4 answers)
Closed 6 years ago.
I'm confused by this code:
public class StringReplaceWithEmptyString
{
public static void main(String[] args)
{
String s1 = "asdfgh";
System.out.println(s1);
s1 = s1.replace("", "1");
System.out.println(s1);
}
}
And the output is:
asdfgh
1a1s1d1f1g1h1
So my first thought was that every character in a String has an empty String "" on both sides. But if that were the case, there should be two '1's after 'a' in the second line of output (one for the end of 'a' and one for the start of 's').
I then checked whether a String is represented as a char[], using these links: In Java, is a String an array of chars? and String representation in Java. The answer I found was yes.
So I tried to assign an empty character '' to a char variable, but it gives me a compiler error:
Invalid character constant
The same compiler error occurs when I try it in a char[]:
char[] c = {'','a','','s'}; // CTE
So I'm confused about three things.
How is an empty String represented by a char[]?
Why am I getting that output for the above code?
How is the String s1 represented as a char[] when it is first initialized?
Sorry if I'm wrong at any part of my question.
Just adding some more explanation to Tim Biegeleisen's answer.
As of Java 8, the code of the replace method in the java.lang.String class is:
public String replace(CharSequence target, CharSequence replacement) {
return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
Here you can see that the replacement is carried out by a regex Pattern and Matcher. In a regex, "" is a zero-length match, and such a match exists at every position: before the first character, between each pair of characters, and after the last one.
So, behind the scenes, your code is executed as follows:
Pattern.compile("".toString(), Pattern.LITERAL).matcher("asdfgh").replaceAll(Matcher.quoteReplacement("1".toString()));
The output then becomes:
1a1s1d1f1g1h1
Going with Andy Turner's great comment, your call to String#replace() is actually implemented using Matcher#replaceAll(). As such, there is a regex replacement happening here. The matches occur before the first character, in between each character in the string, and after the last character.
^|a|s|d|f|g|h|$
^ this and every pipe matches to empty string ""
The match you are making is a zero length match. In Java's regex implementation used in String.replaceAll(), this behaves as the example above shows, namely matching each inter-character position and the positions before the first and after the last characters.
Here is a reference which discusses zero length matches in more detail: http://www.regexguru.com/2008/04/watch-out-for-zero-length-matches/
A zero-width or zero-length match is a regular expression match that does not match any characters. It matches only a position in the string. E.g. the regex \b matches between the 1 and , in 1,2.
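To make the zero-length matches visible, this sketch lists the position of every match of the empty pattern in a string. The names EmptyMatchSketch and emptyMatchPositions are hypothetical; only the standard Pattern and Matcher classes are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchSketch {
    // Returns the start index of every match of the empty pattern.
    static List<Integer> emptyMatchPositions(String input) {
        Matcher m = Pattern.compile("").matcher(input);
        List<Integer> positions = new ArrayList<>();
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // One match before each character, plus one after the last character.
        System.out.println(emptyMatchPositions("asdfgh")); // prints [0, 1, 2, 3, 4, 5, 6]
        System.out.println("asdfgh".replace("", "1"));     // prints 1a1s1d1f1g1h1
    }
}
```

Seven match positions for a six-character string explains why there are seven 1s in the output and never two in a row.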
This is because replace() performs a regex match using the target and replacement you pass to it:
public String replace(CharSequence target, CharSequence replacement) {
return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence. The replacement proceeds from the beginning of the string to the end, for example, replacing "aa" with "b" in the string "aaa" will result in "ba" rather than "ab".
Parameters:
target - The sequence of char values to be replaced
replacement - The replacement sequence of char values
Returns: The resulting string
Throws: NullPointerException if target or replacement is null.
Since: 1.5
Please read more at the link below ... (Also browse through the source code).
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java#String.replace%28java.lang.CharSequence%2Cjava.lang.CharSequence%29
A regex such as "" matches every possible empty string within a string. In this case, that means the position at the start, the position at the end, and the position after every character in the string.

Java Regular expressions issue - Can't match two strings in the same line [duplicate]

This question already has answers here:
What do 'lazy' and 'greedy' mean in the context of regular expressions?
(13 answers)
Closed 8 years ago.
I'm experiencing some problems with Java regular expressions.
I have a program that reads through an HTML file and replaces any string enclosed in #VR# markers, e.g. #VR#Test1 2 3 4#VR#
However, if a line contains more than one string surrounded by #VR#, they are not matched separately. Instead, the leftmost #VR# is paired with the rightmost #VR# on the line, and everything in between is taken as one match.
For example:
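<a href="#VR#URL-GOES-HERE#VR#" target="_blank" style="color:#f4f3f1; text-decoration:none;" title="ContactUs">#VR#Google#VR#</a>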
My code would match
URL-GOES-HERE#VR#" target="_blank" style="color:#f4f3f1; text-decoration:none;" title="ContactUs">#VR#Google
Here is my Java code. Would appreciate if you could help me to solve this:
Pattern p = Pattern.compile("#VR#.*#VR#");
Matcher m;
Scanner scanner = new Scanner(htmlContent);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
m = p.matcher(line);
StringBuffer sb = new StringBuffer();
while (m.find()) {
String match_found = m.group().replaceAll("#VR#", "");
System.out.println("group: " + match_found);
}
}
I tried replacing m.group() with m.group(0) and m.group(1), but neither helped. Also m.groupCount() always returns zero, even if there are two matches as in my example above.
Thanks, your help will be very much appreciated.
Your problem is that .* is "greedy"; it will try to match as long a substring as possible while still letting the overall expression match. So, for example, in #VR# 1 #VR# 2 #VR# 3 #VR#, it will match 1 #VR# 2 #VR# 3.
The simplest fix is to make it "non-greedy" (matching as little as possible while still letting the expression match), by changing the * to *?:
Pattern p = Pattern.compile("#VR#.*?#VR#");
Also m.groupCount() always returns zero, even if there are two matches as in my example above.
That's because m.groupCount() returns the number of capture groups in the underlying pattern (parenthesized subexpressions, whose matched substrings are retrieved using m.group(1), m.group(2), and so on). In your case, the pattern has no capture groups, so m.groupCount() returns 0.
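Putting the two points together, here is a sketch that combines the non-greedy quantifier with one capture group, so that m.group(1) returns the text between the markers directly. The class name VrExtractor and the sample line are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VrExtractor {
    // One capture group around the lazily matched content.
    private static final Pattern VR = Pattern.compile("#VR#(.*?)#VR#");

    // Collects the text between each pair of #VR# markers on a line.
    static List<String> extract(String line) {
        Matcher m = VR.matcher(line);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1)); // group(1) is the part inside the parentheses
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "#VR#first#VR# and #VR#second#VR#";
        System.out.println(extract(line));                   // prints [first, second]
        System.out.println(VR.matcher(line).groupCount());   // prints 1
    }
}
```

With the group in place there is no need to strip the markers afterwards with replaceAll.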
You can try the regular expression:
#VR#(((?!#VR#).)+)#VR#
Demo:
private static final Pattern REGEX_PATTERN =
Pattern.compile("#VR#(((?!#VR#).)+)#VR#");
public static void main(String[] args) {
String input = "#VR#Google#VR# ";
System.out.println(
REGEX_PATTERN.matcher(input).replaceAll("$1")
); // prints "Google "
}

regular expression to match one or two dots [duplicate]

This question already has answers here:
Java RegEx meta character (.) and ordinary dot?
(9 answers)
Closed 7 years ago.
What is the regular expression for . and .. ?
if (key.matches(".")) {
// do something
}
The matches method accepts a String, which is treated as a regular expression. I need to remove all the dots inside my map.
. matches any character so needs escaping i.e. \., or \\. within a Java string (because \ itself has special meaning within Java strings.)
You can then use \.\. or \.{2} to match exactly 2 dots.
...
[.]{1}
or
[.]{2}
?
[+*?.] Most special characters have no meaning inside the square brackets. This expression matches any of +, *, ? or the dot.
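Applied to the question, a key that is exactly one or two dots can be tested with an escaped dot and a {1,2} quantifier. This is only a sketch, and the names DotMatchSketch and isOneOrTwoDots are invented; note that String.matches has to match the whole string.

```java
class DotMatchSketch {
    // True only when the whole key is "." or "..".
    static boolean isOneOrTwoDots(String key) {
        return key.matches("\\.{1,2}");
    }

    public static void main(String[] args) {
        System.out.println(isOneOrTwoDots("."));   // prints "true"
        System.out.println(isOneOrTwoDots(".."));  // prints "true"
        System.out.println(isOneOrTwoDots("...")); // prints "false"
        System.out.println(isOneOrTwoDots("a"));   // prints "false"

        // For comparison: an unescaped dot matches any single character.
        System.out.println("a".matches("."));      // prints "true"
    }
}
```

The character-class form "[.]{1,2}" behaves the same way, since a dot inside square brackets is literal.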
Use String.replace() if you just want to remove the dots from a string. An alternative is to use Pattern and Matcher with a StringBuilder, which gives you more flexibility, as you can find the groups between dots. If you use the latter, I would recommend skipping empty entries with "\\.+".
public static int count(String str, String regex) {
int i = 0;
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
while (m.find()) {
m.group();
i++;
}
return i;
}
public static void main(String[] args) {
int i = 0, j = 0, k = 0;
String str = "-.-..-...-.-.--..-k....k...k..k.k-.-";
// this will just remove dots
System.out.println(str.replaceAll("\\.", ""));
// this will just remove sequences of ".." dots
System.out.println(str.replaceAll("\\.{2}", ""));
// this removes every run of dots, so consecutive
// dots are treated as a single match
System.out.println(str.replaceAll("\\.+", ""));
/* to make this more obvious, count the matches */
System.out.println(count(str, "\\."));
System.out.println(count(str, "\\.{2}"));
System.out.println(count(str, "\\.+"));
}
The output will be:
--------kkkkk--
-.--.-.-.---kk.kk.k-.-
--------kkkkk--
21
7
11
You should use contains, not matches:
if(nom.contains("."))
System.out.println("OK");
else
System.out.println("Bad");
