Why is replaceAll("$","") not working although replace("$","") works just fine?

import java.util.*;
import java.lang.*;
import java.io.*;
class GFG
{
public static void main (String[] args)
{
int turns;
Scanner scan=new Scanner(System.in);
turns=scan.nextInt();
while(turns-->0)
{
String pattern=scan.next();
String text=scan.next();
System.out.println(regex(pattern,text));
}
}//end of main method
static int regex(String pattern,String text)
{
if(pattern.startsWith("^"))
{
if(text.startsWith(pattern.replace("^","")))
return 1;
}
else if(pattern.endsWith("$"))
{
if(text.endsWith(pattern.replace("$","")))
return 1;
}
else
{
if(text.contains(pattern))
return 1;
}
return 0;
}
}
Input:
2
or$
hodor
or$
arya
Output:
1
0
In this program I scan two strings: the first is the pattern and the second is the text in which I have to find the pattern. The method should return 1 if the pattern matches and 0 otherwise.
With replace it works fine, but when I change replace() to replaceAll() it no longer works as expected.
How can I make replaceAll() work in this program?

Because replaceAll expects a string defining a regular expression, and $ means "end of input" (or "end of line" in multiline mode) in regular expressions. From the documentation:
public String replaceAll(String regex,
String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.
You need to escape it with a backslash (which also has to be escaped, in the string literal):
if(text.endsWith(pattern.replaceAll("\\$","")))
For complex strings that you want to replace verbatim, Pattern.quote is useful:
if(text.endsWith(pattern.replaceAll(Pattern.quote("$"),"")))
You don't need it here because your replacement is "", but if your replacement may have special characters in it (like backslashes or dollar signs), use Matcher.quoteReplacement on the replacement string as well.
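To make the difference concrete, here is a minimal sketch comparing the calls discussed above. The class and method names (DollarReplaceDemo, stripWithReplace and so on) are made up for illustration; only replace, replaceAll, Pattern.quote and Matcher.quoteReplacement come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarReplaceDemo {
    // replace treats both arguments as plain text.
    static String stripWithReplace(String s) {
        return s.replace("$", "");
    }

    // An unescaped "$" is an anchor: it matches an empty position, so no character is removed.
    static String stripWithUnescapedRegex(String s) {
        return s.replaceAll("$", "");
    }

    // Pattern.quote turns the text into a regex that matches it literally.
    static String stripWithQuotedRegex(String s) {
        return s.replaceAll(Pattern.quote("$"), "");
    }

    // Replaces each "$" with a backslash followed by "$"; the replacement is quoted
    // because backslash and dollar are special in replacement strings too.
    static String escapeDollar(String s) {
        return s.replaceAll(Pattern.quote("$"), Matcher.quoteReplacement("\\$"));
    }

    public static void main(String[] args) {
        System.out.println(stripWithReplace("or$"));        // or
        System.out.println(stripWithUnescapedRegex("or$")); // or$
        System.out.println(stripWithQuotedRegex("or$"));    // or
        System.out.println(escapeDollar("a$b"));            // a\$b
    }
}
```

The second call is the one from the question: it compiles and runs, but leaves the string unchanged.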

$ is a special character in regex (end of line). You have to escape it:
pattern.replaceAll("\\$","")

Despite the similar name, these are two very different methods.
replace replaces substrings with other substrings (*).
replaceAll uses regular expression matching, and $ is a special control character there (meaning "end of string/line").
You should not be using replaceAll here, but if you must, you have to quote the $:
pattern.replaceAll(Pattern.quote("$"),"")
(*) To make things more confusing, replace also replaces all occurrences, so the difference in the method names does not describe the difference in function at all.
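If replaceAll is not actually needed, one possible way to avoid regular expressions altogether is to cut off the anchor character with substring. This is only a sketch, under the assumption that ^ can appear only as the first character of the pattern and $ only as the last; the class name AnchorMatch and the method name match are hypothetical.

```java
class AnchorMatch {
    // Returns 1 if text satisfies the simplified pattern, 0 otherwise.
    static int match(String pattern, String text) {
        if (pattern.startsWith("^")) {
            // Drop only the leading anchor and compare against the start of the text.
            return text.startsWith(pattern.substring(1)) ? 1 : 0;
        }
        if (pattern.endsWith("$")) {
            // Drop only the trailing anchor and compare against the end of the text.
            String literal = pattern.substring(0, pattern.length() - 1);
            return text.endsWith(literal) ? 1 : 0;
        }
        return text.contains(pattern) ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(match("or$", "hodor")); // 1
        System.out.println(match("or$", "arya"));  // 0
    }
}
```

Unlike replace("$",""), substring removes only the single anchor character, so any other $ in the pattern is kept as literal text.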

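Introducing another level of complexity by replacing $ with \\$ (two backslashes followed by a dollar sign).
"$ABC$AB".replaceAll(Matcher.quoteReplacement("$"), Matcher.quoteReplacement("\\\\$"))
// Output - \\$ABC\\$AB
This worked for me.
For the issue reported here,
"$ABC$AB".replaceAll(Matcher.quoteReplacement("$"), "")
should work.
Note that Matcher.quoteReplacement is meant for the replacement argument. It works as the regex argument here only because it escapes $ with a backslash, which is also valid regex syntax; Pattern.quote is the method intended for the regex argument.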

Related

How to match two string using java Regex

String 1= abc/{ID}/plan/{ID}/planID
String 2=abc/1234/plan/456/planID
How can I match these two strings using Java regex so that it returns true? Basically {ID} can contain anything. Java regex should match abc/{anything here}/plan/{anything here}/planID
If your "{anything here}" may be empty, you can use .*. The . matches any character (except line terminators, by default), and * means "repeat the preceding element zero or more times". So .* means "match a string of any length, including zero, made of any characters". If {anything here} must contain at least one character, use + instead of *; it means almost the same, but requires at least one character.
My suggestion: abc/.+/plan/.+/planID
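If the template is not fixed, the regex could be generated from it. The sketch below assumes that each {ID} should match one non-empty path segment (no slash), which is stricter than .+; TemplateMatch and compileTemplate are hypothetical names.

```java
import java.util.regex.Pattern;

class TemplateMatch {
    // Builds a regex from a template: literal parts are quoted, each {ID} becomes [^/]+.
    static Pattern compileTemplate(String template) {
        StringBuilder regex = new StringBuilder();
        boolean first = true;
        // The limit -1 keeps a trailing empty part when the template ends with {ID}.
        for (String literal : template.split("\\{ID\\}", -1)) {
            if (!first) {
                regex.append("[^/]+");
            }
            regex.append(Pattern.quote(literal));
            first = false;
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        Pattern p = compileTemplate("abc/{ID}/plan/{ID}/planID");
        System.out.println(p.matcher("abc/1234/plan/456/planID").matches()); // true
        System.out.println(p.matcher("abc/1234/plan/planID").matches());     // false
    }
}
```

Quoting the literal parts means characters such as . or ( in the template are matched as plain text.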
If {ID} can contain anything I assume it can also be empty.
So this regex should work :
str.matches("^abc.*plan.*planID$");
^abc at the beginning
.* Zero or more of any Character
planID$ at the end
Here is a small program; check it and adapt it to your requirements. It works for this input; if it fails on one of your other test cases, please comment with that test case. I am using a regex specifically because you asked how to match with a Java regex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class MatchUsingRejex
{
public static void main(String args[])
{
// Create a pattern to be searched
Pattern pattern = Pattern.compile("abc/.+/plan/.+/planID");
// checking, Is pattern match or not
Matcher isMatch = pattern.matcher("abc/1234/plan/456/planID");
if (isMatch.find())
System.out.println("Yes");
else
System.out.println("No");
}
}
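One thing to keep in mind with the program above: find() succeeds if the pattern occurs anywhere in the input, while matches() requires the whole input to match. A small sketch of the difference, with the hypothetical names FindVsMatches, containsMatch and wholeMatch:

```java
import java.util.regex.Pattern;

class FindVsMatches {
    private static final Pattern PLAN = Pattern.compile("abc/.+/plan/.+/planID");

    // True if the pattern occurs somewhere inside the input.
    static boolean containsMatch(String input) {
        return PLAN.matcher(input).find();
    }

    // True only if the entire input matches the pattern.
    static boolean wholeMatch(String input) {
        return PLAN.matcher(input).matches();
    }

    public static void main(String[] args) {
        String padded = "xx/abc/1/plan/2/planID/yy";
        System.out.println(containsMatch(padded)); // true
        System.out.println(wholeMatch(padded));    // false
    }
}
```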
If the line always starts with 'abc' and ends with 'planID' (compared case-insensitively because of the (?i) flag), then the following will work:
String s1 = "abc/{ID}/plan/{ID}/planID";
String s2 = "abc/1234/plan/456/planID";
String pattern = "(?i)abc(?:/\\S+)+planID$";
boolean b1 = s1.matches(pattern);
boolean b2 = s2.matches(pattern);

Regular Expression for ")" matching parentheses

Every smiling face must have a smiling mouth that should be marked with either ) or D.
I tried to do this using the following code:
import java.util.*;
import java.util.regex.Pattern;
public class SmileFaces {
public static int countSmileys(List<String> arr) {
String regx = "/^((:|;)(-|~)?|D|//))$/";
int count=0;
ListIterator<String> itr=arr.listIterator();
while(itr.hasNext()){
if(Pattern.matches(regx,itr.next())){
count++;
}
}
return count;
}
}
I have tried this regex for smiling checking: /^((:|;)(-|~)?|D|//))$/
You could just patch your current regex by escaping the closing parenthesis correctly as \\) (two backslashes in a Java string literal), but I think character classes are easier to read here:
String regx = "^[;:][~-]?[D)]$";
Note that Java regex patterns do not take delimiters as they would in a language such as PHP or JavaScript, so I removed the slashes from your pattern. Also, with methods that already require the whole input to match, such as String#matches or Pattern.matches, you can drop the ^ and $ anchors.
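Putting this together, a counting method could look roughly like the sketch below. It assumes the rules implied by the pattern: the eyes are : or ;, the nose - or ~ is optional, and the mouth is ) or D. The class name SmileyCount is made up, and the pattern is compiled once instead of on every call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SmileyCount {
    // No anchors needed: Matcher.matches() already requires the whole string to match.
    private static final Pattern SMILEY = Pattern.compile("[;:][~-]?[D)]");

    static int countSmileys(List<String> faces) {
        int count = 0;
        for (String face : faces) {
            if (SMILEY.matcher(face).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSmileys(Arrays.asList(":)", ";(", ";}", ":-D"))); // 2
    }
}
```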

java regular expression split pattern

I want to split the following string:
String line ="DOB,1234567890,11,07/05/12,\"first,last\",100,\"is,a,good,boy\"";
into following tokens:
DOB
1234567890
11
07/05/12
first,last
100
is,a,good,boy
I tried using following regular expression:
import java.util.*;
import java.lang.*;
import java.util.regex.*;
import org.apache.commons.lang.StringUtils;
class SplitString{
public static final String quotes = "\".[[((a-z)|(A-Z))]+( ((a-z)|(A-Z)).,)*.((a-z)|(A-Z))].\"" ;
public static final String ISSUE_UPLOAD_FILE_PATTERN = "((a-z)|(A-Z))+ [(((a-z)|(A-Z)).,)* + ("+quotes+".,) ].((a-z)|(A-Z)) + ("+quotes+")";
public static void main(String[] args){
String line ="DOB,1234567890,11,07/05/12,\"first,last\",100,\"is,a,good,boy\"";
String delimiter = ",";
Pattern p = Pattern.compile(ISSUE_UPLOAD_FILE_PATTERN);
Pattern pattern = Pattern.compile(ISSUE_UPLOAD_FILE_PATTERN);
String[] output = pattern.split(line);
System.out.println(" pattern: "+pattern);
for(String a:output){
System.out.println(" output: "+a);
}
}
}
Am I missing anything in the regular expression?
This is an updated version of your code that gives you your expected output:
public static final String ISSUE_UPLOAD_FILE_PATTERN = "(?<=(^|,))(([^\",]+)|\"([^\"]*)\")(?=($|,))";
public static void main(String[] args) {
String line = "DOB,1234567890,11,07/05/12,\"first,last\",100,\"is,a,good,boy\"";
Matcher matcher = Pattern.compile(ISSUE_UPLOAD_FILE_PATTERN).matcher(line);
while (matcher.find()) {
if (matcher.group(3) != null) {
System.out.println(matcher.group(3));
} else {
System.out.println(matcher.group(4));
}
}
}
The regex works like this:
(?<=(^|,)): Check that the character before the match is start of string or a ,
(([^\",]+)|\"([^\"]*)\"): Match either one or more characters that are neither " nor , (group 3), or "<any number of characters other than ">" (the text between the quotes is group 4)
(?=($|,)): Check that the character after the match is end of string or a ,
The result will be in either group 3 or group 4, depending on which part matched.
Your regular expressions do some weird stuff with [ and ]: the use of these doesn't look at all like character ranges. For this reason, I didn't bother to decipher and fix all of your expression.
As a second note, you should be clear about what your regular expressions are meant to describe: do you want them to match the delimiter between tokens, or each individual non-delimiter token? Use of the split method implies the former, but I guess for your application, the latter is easier to achieve. In fact, in a recent answer of mine I came up with a regular expression matching the tokens of a CSV file:
String tokenPattern = "\"[^\"]*(\"\"[^\"]*)*\"|[^,]*";
This will match
unquoted strings up to but not including the next comma
quoted strings up to the closing quote, including embedded commas
quoted strings containing doubled ("") quotes
You can use this: create a matcher for your line, iterate over all matches using find, and extract each token using group(). Note that the unquoted alternative can also match the empty string, so the loop has to cope with empty matches. You could also use that loop to strip the surrounding quotes and turn doubled quotes into single ones, if you need the semantic value of the column.
As an alternative, you could of course also use a CSV reader as suggested in comments to your question.
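A rough sketch of such a loop follows. Because the unquoted alternative [^,]* can match the empty string, the sketch does not call find repeatedly (that would also report empty matches at the commas); instead it matches one token at the current position with lookingAt and then steps over the comma. CsvLine, parseLine and unquote are hypothetical names, and the code assumes a single line with no line breaks inside quoted fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsvLine {
    // Token pattern from the answer: a quoted field ("" is an escaped quote) or an unquoted run.
    private static final Pattern TOKEN = Pattern.compile("\"[^\"]*(\"\"[^\"]*)*\"|[^,]*");

    static List<String> parseLine(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        int pos = 0;
        while (true) {
            m.region(pos, line.length());
            // [^,]* accepts the empty string, so a token always matches at pos.
            if (!m.lookingAt()) {
                throw new IllegalStateException("No token at index " + pos);
            }
            tokens.add(unquote(m.group()));
            pos = m.end();
            if (pos >= line.length()) {
                break;
            }
            if (line.charAt(pos) != ',') {
                throw new IllegalArgumentException("Expected ',' at index " + pos);
            }
            pos++; // step over the delimiter
        }
        return tokens;
    }

    // Strips surrounding quotes and turns doubled quotes into single ones.
    static String unquote(String token) {
        if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
            return token.substring(1, token.length() - 1).replace("\"\"", "\"");
        }
        return token;
    }

    public static void main(String[] args) {
        String line = "DOB,1234567890,11,07/05/12,\"first,last\",100,\"is,a,good,boy\"";
        for (String token : parseLine(line)) {
            System.out.println(token);
        }
    }
}
```

For real CSV files a dedicated CSV reader is still likely to be the more robust choice.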

How to replace last dot in a string using a regular expression?

I'm trying to replace the last dot in a String using a regular expression.
Let's say I have the following String:
String string = "hello.world.how.are.you!";
I want to replace the last dot with an exclamation mark such that the result is:
"hello.world.how.are!you!"
I have tried various expressions using the method String.replaceAll(String, String) without any luck.
One way would be:
string = string.replaceAll("^(.*)\\.(.*)$","$1!$2");
Alternatively you can use negative lookahead as:
string = string.replaceAll("\\.(?!.*\\.)","!");
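Both variants can be checked side by side with a small sketch like the one below. LastDot, viaGroups and viaLookahead are hypothetical names, and the input is assumed to contain no line breaks (by default . does not match line terminators).

```java
class LastDot {
    // Greedy (.*) consumes everything up to the last dot; the groups are put back around "!".
    static String viaGroups(String s) {
        return s.replaceAll("^(.*)\\.(.*)$", "$1!$2");
    }

    // Matches a dot only if no further dot follows it.
    static String viaLookahead(String s) {
        return s.replaceAll("\\.(?!.*\\.)", "!");
    }

    public static void main(String[] args) {
        String input = "hello.world.how.are.you!";
        System.out.println(viaGroups(input));    // hello.world.how.are!you!
        System.out.println(viaLookahead(input)); // hello.world.how.are!you!
    }
}
```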
Although you can use a regex, it's sometimes best to step back and just do it the old-fashioned way. I've always been of the belief that, if you can't think of a regex to do it in about two minutes, it's probably not suited to a regex solution.
No doubt you'll get some wonderful regex answers here. Some of them may even be readable :-)
You can use lastIndexOf to get the last occurrence and substring to build a new string. This complete program shows how:
public class testprog {
public static String morph (String s) {
int pos = s.lastIndexOf(".");
if (pos >= 0)
return s.substring(0,pos) + "!" + s.substring(pos+1);
return s;
}
public static void main(String args[]) {
System.out.println (morph("hello.world.how.are.you!"));
System.out.println (morph("no dots in here"));
System.out.println (morph(". first"));
System.out.println (morph("last ."));
}
}
The output is:
hello.world.how.are!you!
no dots in here
! first
last !
The regex you need is \\.(?=[^.]*$), where ?= is a lookahead assertion. Use it with replaceAll, since replace would treat the pattern as literal text:
"hello.world.how.are.you!".replaceAll("\\.(?=[^.]*$)", "!")
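Try this (note that it only handles a dot that is the very last character of the string, and it removes that dot rather than replacing it):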
string = string.replaceAll("[.]$", "");

Regular Expression problem in Java

I am trying to create a regular expression for the replaceAll method in Java. The test string is abXYabcXYZ and the pattern is abc. I want to replace any symbol except the pattern with +. For example the string abXYabcXYZ and pattern [^(abc)] should return ++++abc+++, but in my case it returns ab++abc+++.
public static String plusOut(String str, String pattern) {
pattern= "[^("+pattern+")]" + "".toLowerCase();
return str.toLowerCase().replaceAll(pattern, "+");
}
public static void main(String[] args) {
String text = "abXYabcXYZ";
String pattern = "abc";
System.out.println(plusOut(text, pattern));
}
When I try to replace the pattern with + there is no problem - abXYabcXYZ with pattern (abc) returns abxy+xyz. Pattern (^(abc)) returns the string without replacement.
Is there any other way to write NOT(regex) or group symbols as a word?
What you are trying to achieve is pretty tough with regular expressions, since there is no way to express “replace strings not matching a pattern”. You will have to use a “positive” pattern, telling what to match instead of what not to match.
Furthermore, you want to replace every character with a replacement character, so you have to make sure that your pattern matches exactly one character. Otherwise, you will replace whole strings with a single character, returning a shorter string.
For your toy example, you can use negative lookaheads and lookbehinds to achieve the task, but this may be more difficult for real-world examples with longer or more complex strings, since you will have to consider each character of your string separately, along with its context.
Here is the pattern for “not ‘abc’”:
[^abc]|a(?!bc)|(?<!a)b|b(?!c)|(?<!ab)c
It consists of five sub-patterns, connected with “or” (|), each matching exactly one character:
[^abc] matches every character except a, b or c
a(?!bc) matches a if it is not followed by bc
(?<!a)b matches b if it is not preceded with a
b(?!c) matches b if it is not followed by c
(?<!ab)c matches c if it is not preceded with ab
The idea is to match every character that is not in your target word abc, plus every word character that, according to the context, is not part of your word. The context can be examined using negative lookaheads (?!...) and lookbehinds (?<!...).
You can imagine that this technique will fail once you have a target word containing one character more than once, like example. It is pretty hard to express “match e if it is not followed by x and not preceded by l”.
Especially for dynamic patterns, it is by far easier to do a positive search and then replace every character that did not match in a second pass, as others have suggested here.
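Applied to the lower-cased example from the question, the five-part pattern can be tried out with a sketch like this one. NotAbc and plusOut are hypothetical names, and the pattern is hard-wired for the word abc, so it does not generalize to other target words.

```java
class NotAbc {
    // Each alternative matches exactly one character that is not part of an "abc" occurrence.
    private static final String NOT_ABC = "[^abc]|a(?!bc)|(?<!a)b|b(?!c)|(?<!ab)c";

    static String plusOut(String s) {
        return s.replaceAll(NOT_ABC, "+");
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abxyabcxyz")); // ++++abc+++
    }
}
```

Since every alternative consumes exactly one character, the output always has the same length as the input.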
[^ ... ] will match one character that is not any of ...
So your pattern "[^(abc)]" is saying "match one character that is not a, b, c or the left or right bracket"; and indeed that is what happens in your test.
It is hard to say "replace all characters that are not part of the string 'abc'" in a single trivial regular expression. What you might do instead to achieve what you want could be some nasty thing like
while the input string still contains "abc"
    find the next occurrence of "abc"
    append to the output a string containing as many "+"s as there are characters before the "abc"
    append "abc" to the output string
    skip, in the input string, to a position just after the "abc" found
append to the output a string containing as many "+"s as there are characters left in the input
or possibly, if the input alphabet is restricted, you could use regular expressions to do something like
replace all occurrences of "abc" with a single character that does not occur anywhere in the existing string
replace all other characters with "+"
replace all occurrences of the target character with "abc"
which will be more readable but may not perform as well.
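The first of these two procedures, the indexOf loop, might be written roughly as follows. PlusOutLoop and plusOut are hypothetical names; the sketch is case-sensitive and rejects an empty word, because indexOf would otherwise never advance.

```java
class PlusOutLoop {
    static String plusOut(String text, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        int hit;
        // While the rest of the input still contains the word...
        while ((hit = text.indexOf(word, pos)) >= 0) {
            // ...emit one '+' per character before the occurrence,
            for (int i = pos; i < hit; i++) {
                out.append('+');
            }
            // then the word itself, and continue just after it.
            out.append(word);
            pos = hit + word.length();
        }
        // Everything left after the last occurrence becomes '+'.
        for (int i = pos; i < text.length(); i++) {
            out.append('+');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abxyabcxyz", "abc")); // ++++abc+++
    }
}
```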
Negating regexps is usually troublesome. I think you might want to use negative lookahead. Something like this might work:
String pattern = "(?<!ab).(?!abc)";
I didn't test it, so it may not really work for degenerate cases. And the performance might be horrible too. It is probably better to use a multistep algorithm.
Edit: No, I think this won't work for every case. You will probably spend more time debugging a regexp like this than doing it algorithmically with some extra code.
Try to solve it without regular expressions:
String out = "";
int i;
for(i=0; i<text.length() - pattern.length() + 1; ) {
if (text.substring(i, i + pattern.length()).equals(pattern)) {
out += pattern;
i += pattern.length();
}
else {
out += "+";
i++;
}
}
for(; i<text.length(); i++) {
out += "+";
}
Rather than a single replaceAll, you could always try something like:
@Test
public void testString() {
final String in = "abXYabcXYabcHIH";
final String expected = "xxxxabcxxabcxxx";
String result = replaceUnwanted(in);
assertEquals(expected, result);
}
private String replaceUnwanted(final String in) {
final Pattern p = Pattern.compile("(.*?)(abc)([^a]*)");
final Matcher m = p.matcher(in);
final StringBuilder out = new StringBuilder();
while (m.find()) {
out.append(m.group(1).replaceAll(".", "x"));
out.append(m.group(2));
out.append(m.group(3).replaceAll(".", "x"));
}
return out.toString();
}
Instead of using replaceAll(...), I'd go for a Pattern/Matcher approach:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static String plusOut(String str, String pattern) {
StringBuilder builder = new StringBuilder();
String regex = String.format("((?:(?!%s).)++)|%s", pattern, pattern);
Matcher m = Pattern.compile(regex).matcher(str.toLowerCase());
while(m.find()) {
builder.append(m.group(1) == null ? pattern : m.group().replaceAll(".", "+"));
}
return builder.toString();
}
public static void main(String[] args) {
String text = "abXYabcXYZ";
String pattern = "abc";
System.out.println(plusOut(text, pattern));
}
}
Note that you'll need to use Pattern.quote(...) if your String pattern contains regex meta-characters.
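A variant with the quoting applied might look like the sketch below. It differs from the code above in a few labeled ways: the regex is built by concatenation rather than String.format (so a % in the word cannot break the format string), DOTALL is enabled so line breaks are replaced too, and the input is not lower-cased. QuotedPlusOut is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPlusOut {
    static String plusOut(String str, String word) {
        // Quote the word so that metacharacters such as '.' or '$' are matched literally.
        String quoted = Pattern.quote(word);
        // Group 1: a run of characters at which the word does not start; otherwise the word itself.
        String regex = "((?:(?!" + quoted + ").)++)|" + quoted;
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(str);
        StringBuilder builder = new StringBuilder();
        while (m.find()) {
            builder.append(m.group(1) == null ? word : m.group().replaceAll("(?s).", "+"));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Without quoting, "a.c" would also match "abc".
        System.out.println(plusOut("a.c abc a.c", "a.c")); // a.c+++++a.c
    }
}
```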
Edit: I didn't see a Pattern/Matcher approach was already suggested by toolkit (although slightly different)...
