Java (Eclipse) regex: can't use "\+"

I need to check whether a String matches "\++?", which should match something like +6014456.
But I get this error message: invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\). Why?

It's giving you an error because "\++?" isn't a valid Java literal - you need to escape the backslash. Try this:
Pattern pattern = Pattern.compile("\\++?");
However, I don't think that's actually the regular expression you want. Don't you actually mean something like:
Pattern pattern = Pattern.compile("\\+\\d+");
That corresponds to a regular expression of \+\d+, i.e. a plus followed by at least one digit.
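As a minimal sketch of the suggestion above, the following checks whole strings against \+\d+. The class and method names (PlusNumberCheck, isPlusNumber) are made up for illustration, and using matches() rather than find() is an assumption about what the asker wants.

```java
import java.util.regex.Pattern;

public class PlusNumberCheck {
    // The Java literal "\\+\\d+" is the regex \+\d+ :
    // a literal plus sign followed by one or more digits.
    private static final Pattern PLUS_NUMBER = Pattern.compile("\\+\\d+");

    // matches() requires the entire input to fit the pattern.
    static boolean isPlusNumber(String input) {
        return PLUS_NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPlusNumber("+6014456")); // true
        System.out.println(isPlusNumber("6014456"));  // false (no leading plus)
        System.out.println(isPlusNumber("+"));        // false (no digits)
    }
}
```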

I think you should use two backslashes. One for escaping the second (because it's a java string), the second for escaping the + (because it's a special character for regex).

shouldn't it be more like "\\+?" ?
Pattern pattern = Pattern.compile("\\++?");
System.out.println(pattern.matcher("+9970").find());
works for me

Related

Regex in java: error for str.replace("\s+", " ")

Why does Java (1.7) give me an error for the following line?
String str2 = str.replace("\s+", " ");
Error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
As far as I know "\s+" is a valid regex. Isn't it?
String.replace() will only replace literals, that's the first problem.
The second problem is that \s is not a valid escape sequence in a Java string literal, by definition.
Which means what you wanted was probably "\\s+".
But even then, .replace() won't take that as a regex. You have to use .replaceAll() instead:
s.replaceAll("\\s+", " ");
There is one more problem, though. If you do this often, compile a Pattern once and reuse it instead:
private static final Pattern SPACES = Pattern.compile("\\s+");
// In code...
SPACES.matcher(input).replaceAll(" ");
FURTHER NOTES:
If you only want to replace the first occurrence, use .replaceFirst(); String has it, and so does Matcher.
When you call .replace{First,All}() on a String, a new Pattern is compiled on each and every invocation. Use a Pattern if you have to do repetitive matches!
It's a valid regular expression pattern, but \s is not a valid String literal escape sequence. Escape the \.
String str2 = str.replace("\\s+", " ");
As suggested, String#replace(CharSequence, CharSequence) doesn't consider the arguments you provide as regular expressions. So even if you got the program to compile, it wouldn't do what you seem to want it to do. Check out String#replaceAll(String, String).
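To make the difference concrete, here is a small sketch comparing the literal replace() with the regex-based replaceAll() and a precompiled Pattern. The names WhitespaceCollapse and collapse are hypothetical, and the sample input is invented.

```java
import java.util.regex.Pattern;

public class WhitespaceCollapse {
    // Compiled once and reused, as recommended for repeated matching.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // Collapses every run of whitespace into a single space.
    static String collapse(String input) {
        return SPACES.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        String str = "a  b \t c";
        // replace() is literal: it searches for the three characters \ s +,
        // finds none, and returns the string unchanged.
        System.out.println(str.replace("\\s+", " ").equals(str)); // true
        // replaceAll() treats its first argument as a regex.
        System.out.println(str.replaceAll("\\s+", " "));          // a b c
        System.out.println(collapse(str));                        // a b c
    }
}
```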

Regex to match \a574322 in Java

I have long string looking like this: \c53\e59\c9\e28\c20140326\a4095\c8\c15\a546\c11 and I need to find expressions starting with \a and followed by digits. For example: \a574322
And I have no idea how to build it. I can't use:
Pattern p = Pattern.compile("\\a\\d*");
because \a is special character in regex.
When I try to group it like this:
Pattern p = Pattern.compile("(\\)(a)(\\d)*");
I get unclosed group error even though there is even number of brackets.
Can you help me with this?
Thank you all very much for solution.
You can use this regex:
\\\\a\\d+
This is because in Java you need to escape the backslash twice: once for the String literal and a second time for the regex engine.
You have to change your regex to:
Pattern p = Pattern.compile("(\\\\a\\d+)");
The regex is:
(\\a\d+)
The idea is to escape a backslash and then also escape the backslash for \a, and match digits too.
You need 4 \.
2 to indicate to regex that it is not a special character, but a plain \, and 2 for each to tell the Java String that these are not special characters either. So you need to represent it in code this way:
"\\\\a\\d*"
Which is actually the regex \\a\d*
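The four-backslash pattern can be tried on the string from the question. This is only a sketch: the names BackslashATokens and findATokens are illustrative, and it uses \d+ (at least one digit) rather than \d*.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackslashATokens {
    // Java literal "\\\\a\\d+" -> regex \\a\d+ ->
    // a literal backslash, the letter a, then one or more digits.
    private static final Pattern A_TOKEN = Pattern.compile("\\\\a\\d+");

    // Collects every \a<digits> token found in the input, in order.
    static List<String> findATokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = A_TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Each \\ in this literal is a single backslash in the actual string.
        String input = "\\c53\\e59\\c9\\e28\\c20140326\\a4095\\c8\\c15\\a546\\c11";
        System.out.println(findATokens(input)); // [\a4095, \a546]
    }
}
```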
\\(a)[0-9]+ should work.
You can try your regexes on this page or a similar one:
http://regex101.com/

Pattern Matching failed when "\" is input

My pattern is something like this:
"^[a-zA-Z0-9_'^&/+-\\.]{1,}#{1,1}[a-zA-Z0-9_'^&/+-.]{1,}$"
But when I try to match something with a backslash in it, like this:
"abc\\#abc"
...it does not match. Can anyone explain why?
try with below pattern
"^[a-zA-Z0-9_'^&/+-\\\\.]{1,}#{1,1}[a-zA-Z0-9_'^&/+-.]{1,}$";
or
"^[a-zA-Z0-9_'^&/+-\\{0,}}.]{1,}#{1,1}[a-zA-Z0-9_'^&/+-.]{1,}$";
The expression \\ matches a single backslash \
Try escaping each backslash of your test string with an additional backslash: e.g.
"abc\\\\#abc" becomes "abc\\\\\\\\#abc"
You need to write "\\\\" in your source if you want the regex to match a single "\".
Why, you ask?
The Java compiler sees the string literal "\\\\" and turns it into the two-character string \\, because "\" is the escape character in string literals.
The regex engine then sees \\ and treats it as a single literal \, because "\" is also the escape character in regular expressions.
So to match a single backslash you must write four.
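The two layers of escaping can be checked directly. In this sketch the class and method names (BackslashLayers, containsBackslash) are invented for illustration.

```java
import java.util.regex.Pattern;

public class BackslashLayers {
    // Source text "\\\\" -> two-character string \\ -> regex for one backslash.
    private static final Pattern ONE_BACKSLASH = Pattern.compile("\\\\");

    static boolean containsBackslash(String input) {
        return ONE_BACKSLASH.matcher(input).find();
    }

    public static void main(String[] args) {
        // Four backslashes in source are only two characters at runtime.
        System.out.println("\\\\".length());                // 2
        // The test string from the question holds a single backslash.
        System.out.println("abc\\#abc".length());           // 8
        System.out.println(containsBackslash("abc\\#abc")); // true
        System.out.println(containsBackslash("abc#abc"));   // false
    }
}
```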
I'm assuming you're writing the regex in your Java source code, like this:
Pattern p = Pattern.compile(
"^[a-zA-Z0-9_'^&/+-\\.]{1,}#{1,1}[a-zA-Z0-9_'^&/+-.]{1,}$"
);
I'm also assuming you meant \\. as a backslash followed by a dot, not as an escaped dot.
Because it's in a string literal, you have to escape backslashes one more time. That means you have to use four backslashes in the regex to match one in the target string. You also need to escape the - (hyphen), or move it to the end of the character class as I did below, so the regex compiler doesn't think (for example) that [+-.] is meant to be a range expression like [0-9] or [a-z].
"^[a-zA-Z0-9_'^&/+\\\\.-]+#[a-zA-Z0-9_'^&/+.-]+$"
I also changed your {1,} to + because it means the same thing, and got rid of the {1,1} because it doesn't do anything. And I changed your &amp; to &. I don't know how that got in there, but if you wrote it that way in your source code, it's wrong.
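A quick way to see both effects (the missing backslash and the accidental +-. range) is to run the question's pattern and the corrected one side by side. This is a sketch; the class name and the comma test input are my own additions.

```java
import java.util.regex.Pattern;

public class HyphenAndBackslashClass {
    // The question's pattern: inside the class, +-\. is a range from '+' to '.',
    // so it also admits ',' and '-', but it does not admit a backslash.
    static final Pattern ORIGINAL = Pattern.compile(
        "^[a-zA-Z0-9_'^&/+-\\.]{1,}#{1,1}[a-zA-Z0-9_'^&/+-.]{1,}$");

    // The corrected pattern: four backslashes for one literal backslash,
    // and the hyphen moved to the end of the class so it is a plain character.
    static final Pattern FIXED = Pattern.compile(
        "^[a-zA-Z0-9_'^&/+\\\\.-]+#[a-zA-Z0-9_'^&/+.-]+$");

    public static void main(String[] args) {
        String withBackslash = "abc\\#abc"; // the runtime string is abc\#abc
        System.out.println(ORIGINAL.matcher(withBackslash).matches()); // false
        System.out.println(FIXED.matcher(withBackslash).matches());    // true

        String withComma = "a,b#c"; // ',' lies between '+' and '.'
        System.out.println(ORIGINAL.matcher(withComma).matches());     // true
        System.out.println(FIXED.matcher(withComma).matches());        // false
    }
}
```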

How to escape characters in a regular expression

When I use the following code I've got an error:
Matcher matcher = pattern.matcher("/Date\(\d+\)/");
The error is :
invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
I have also tried changing the value in the brackets to ('/Date\(\d+\)/'); without any success.
How can I avoid this error?
You need to double-escape your \ character, like this: \\.
Otherwise your String is interpreted as if you were trying to escape (.
Same with the other round bracket and the d.
In fact it seems you are trying to initialize a Pattern here, while pattern.matcher references a text you want your Pattern to match.
Finally, note that in a Pattern, escaped characters require a double escape, as such:
\\(\\d+\\)
Also, as Rohit says, Patterns in Java do not need to be surrounded by forward slashes (/).
In fact if you initialize a Pattern like that, it will interpret your Pattern as starting and ending with literal forward slashes.
Here's a small example of what you probably want to do:
// your input text
String myText = "Date(123)";
// your Pattern initialization
Pattern p = Pattern.compile("Date\\(\\d+\\)");
// your matcher initialization
Matcher m = p.matcher(myText);
// printing the output of the match...
System.out.println(m.find());
Output:
true
Your regex is correct by itself, but in Java, the backslash character itself needs to be escaped.
Thus, this regex:
/Date\(\d+\)/
Must turn into this:
/Date\\(\\d+\\)/
One backslash is for escaping the parenthesis or d. The other one is for escaping the backslash itself.
The error message you are getting arises because Java thinks you're trying to use \( as a single escape character, like \n, or any of the other examples. However, \( is not a valid escape sequence, and so Java complains.
In addition, the logic of your code is probably incorrect. The argument to matcher should be the text to search (for example, "/Date(234)/Date(6578)/"), whereas the variable pattern should contain the pattern itself. Try this:
String textToMatch = "/Date(234)/Date(6578)/";
Pattern pattern = Pattern.compile("/Date\\(\\d+\\)/");
Matcher matcher = pattern.matcher(textToMatch);
Finally, the regex character class \d means "one single digit." If you are trying to match the literal text \d instead, you would have to write \\\\d in the Java string to escape it. However, in that case your regex would be a constant, and you could use textToMatch.indexOf or textToMatch.contains more easily.
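If the goal is to pull the numbers out of such text, a capturing group around \d+ is one option. This goes slightly beyond the answers above and is only a sketch: the names DateTokens and extractNumbers are hypothetical, and the surrounding slashes are deliberately left out of the pattern so that adjacent tokens sharing a slash are both found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateTokens {
    // Regex: Date\((\d+)\) with the parentheses escaped as literals
    // and an unescaped pair forming capture group 1 around the digits.
    private static final Pattern DATE = Pattern.compile("Date\\((\\d+)\\)");

    // Returns the digit strings captured from every Date(...) occurrence.
    static List<String> extractNumbers(String text) {
        List<String> numbers = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            numbers.add(m.group(1));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractNumbers("/Date(234)/Date(6578)/")); // [234, 6578]
    }
}
```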
To escape text for use in a regex in Java, you can also use Pattern.quote().
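A brief sketch of Pattern.quote() in use follows. The helper name containsLiteral is invented, and for a plain substring check String.contains would normally be simpler; the point is only to show that quoted text loses its special meaning.

```java
import java.util.regex.Pattern;

public class QuoteLiteral {
    // Pattern.quote() wraps the text so every character is matched literally.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        // Unquoted, "Date(" would not even compile as a regex (unclosed group).
        System.out.println(containsLiteral("/Date(123)/", "Date(")); // true
        // Unquoted, a+b means "one or more a, then b" and misses the text a+b.
        System.out.println(Pattern.compile("a+b").matcher("a+b").find()); // false
        System.out.println(containsLiteral("a+b", "a+b"));               // true
    }
}
```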

Is there another way to do a regex without a String escaping all characters?

I have this line of code to remove some punctuation:
str.replaceAll("[\\-\\!\\?\\.\\,\\;\\:\\\"\\']", "");
I don't know if all the chars in this regex need to be escaped, but I escaped only for safety.
Is there some way to build a regex like this in a more clear way?
Inside [...] you don't need to escape the characters. [.] for instance wouldn't make sense anyway!
The exceptions to the rule are
] since it would close the whole [...] expression prematurely.
^ if it is the first character, since [^abc] matches everything except a, b, and c.
- unless it's the first/last character, since [a-z] matches all characters from a to z.
\ since it remains the escape character inside [...].
Thus, you could write
str.replaceAll("[-!?.,;:\"']", "")
To quote a string into a regular expression, you could also use Pattern.quote which escapes the characters in the string as necessary.
Demo:
String str = "abc-!?.,;:\"'def";
System.out.println(str.replaceAll("[-!?.,;:\"']", "")); // prints abcdef
You might need to escape the double-quotes because you have the string in double-quotes; but as aioobe says, don't escape the rest. Put the - at the end of the group, however.
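The effect of the hyphen's position is easy to check. This sketch uses a made-up helper name (stripPunctuation) and invented sample strings.

```java
public class ClassHyphen {
    // Hyphen first in the class, so it is a literal character, not a range.
    static String stripPunctuation(String s) {
        return s.replaceAll("[-!?.,;:\"']", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuation("well-known, isn't it?")); // wellknown isnt it

        // In the middle of the class the hyphen forms a range: [a-c] is a, b, or c.
        System.out.println("a-c b".replaceAll("[a-c]", "")); // prints "- "
        // At the start it is literal: [-ac] is '-', 'a', or 'c'.
        System.out.println("a-c b".replaceAll("[-ac]", "")); // prints " b"
    }
}
```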
