What is the regular expression for matching '(' in a string?
The scenario is as follows:
I have a string
str = "abc(efg)";
I want to split the string at '(' using a regular expression. For that I am using
Arrays.asList(Pattern.compile("/(").split(str))
But I am getting the following exception:
java.util.regex.PatternSyntaxException: Unclosed group near index 2
/(
Escaping '(' doesn't seem to work.
Two options:
Firstly, you can escape it using a backslash (not the forward slash used in the question) -- \( which is written "\\(" in a Java string literal
Alternatively, since it's a single character, you can put it in a character class, where it doesn't need to be escaped -- [(]
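A minimal sketch of both options applied to the string from the question. The class name SplitAtParen and its two helper methods are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitAtParen {
    // Option 1: escape the parenthesis. The regex is \( and the Java literal is "\\(".
    static List<String> splitEscaped(String s) {
        return Arrays.asList(Pattern.compile("\\(").split(s));
    }

    // Option 2: a character class, where ( is an ordinary character.
    static List<String> splitCharClass(String s) {
        return Arrays.asList(Pattern.compile("[(]").split(s));
    }

    public static void main(String[] args) {
        System.out.println(splitEscaped("abc(efg)"));   // [abc, efg)]
        System.out.println(splitCharClass("abc(efg)")); // [abc, efg)]
    }
}
```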
The solution is a regex pattern that matches both the opening and the closing parenthesis:
String str = "Your(String)";
// The argument to split is a character class that matches either parenthesis.
// Inside [ ] the parentheses need no escaping, so "[()]" works as well as "[\\(\\)]".
String[] parts = str.split("[\\(\\)]");
for (String part : parts) {
// Prints "Your" on the first iteration and "String" on the second
System.out.println(part);
}
Written in Java 8 style, the same thing looks like this:
Arrays.asList("Your(String)".split("[\\(\\)]"))
.forEach(System.out::println);
I hope it is clear.
You can escape any meta-character by using a backslash, so you can match ( with the pattern
\(.
Many languages come with a built-in escaping function, for example, .NET's Regex.Escape or Java's Pattern.quote
Some flavors support \Q and \E, with literal text between them.
Some flavors (VIM, for example) match ( literally, and require \( for capturing groups.
See also: Regular Expression Basic Syntax Reference
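As a rough illustration of the last two points in Java, the sketch below splits on a literal ( once with Pattern.quote and once with a hand-written \Q...\E block. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteLiteral {
    // Pattern.quote wraps the text so that every character is matched literally.
    static String[] splitWithQuote(String s, String literal) {
        return s.split(Pattern.quote(literal));
    }

    // The same idea written by hand: everything between \Q and \E is literal.
    static String[] splitWithQE(String s) {
        return s.split("\\Q(\\E");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWithQuote("abc(efg)", "("))); // [abc, efg)]
        System.out.println(Arrays.toString(splitWithQE("abc(efg)")));         // [abc, efg)]
    }
}
```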
Escape any special character with a backslash ('\').
So, to match an opening parenthesis: /\(/
Because ( is special in regex, you should escape it \( when matching. However, depending on what language you are using, you can easily match ( with string methods like index() or other methods that enable you to find at what position the ( is in. Sometimes, there's no need to use regex.
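In Java the string-method route could look like the sketch below, which uses indexOf and substring instead of a regex. The helper name beforeParen is an assumption for illustration only.

```java
public class IndexOfParen {
    // Returns the text before the first '(' or the whole string if there is none.
    static String beforeParen(String s) {
        int i = s.indexOf('(');
        return i < 0 ? s : s.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println("abc(efg)".indexOf('('));  // 3
        System.out.println(beforeParen("abc(efg)"));  // abc
    }
}
```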
I want to fetch
http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png
from
url(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)
I have tried this code:
String a = "";
Pattern pattern = Pattern.compile("url(.*)");
Matcher matcher = pattern.matcher(imgpath);
if (matcher.find()) {
a = (matcher.group(1));
}
return a;
but a == (http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_639_o_4746_precious_image_1419867529.png)
how can I fine tune it?
Why use a regular expression to begin with?
Given
final String s = "url(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)";
If the string is always the same format a simple substring(4,s.length()-1) would be better.
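A small sketch of the substring approach, assuming the input always has the exact form url(...). The class name, method name and the example.com address are placeholders, not taken from the question.

```java
public class UrlSubstring {
    // Drops the 4-character prefix "url(" and the trailing ")".
    // Assumes the input always starts with "url(" and ends with ")".
    static String stripUrlWrapper(String s) {
        return s.substring(4, s.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(stripUrlWrapper("url(http://example.com/a.png)")); // http://example.com/a.png
    }
}
```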
That said, if you insist on a regular expression:
You have to escape the ( as \(, and because the backslash itself must be escaped in a Java string literal, it becomes \\(. The same goes for the ).
Then you can capture the contents with url\\((.+)\\).
Learn to use RegEx101.com before coming here, it will point out errors like this immediately.
As you already seem to know, ( and ) delimit groups, which means that in the regex
url(.*)
(.*) will place everything after url in group 1, which in case of
url(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)
will be
(http://d1oiazdc2hzjcz.cloudfront.net/promotions/precious/2x/p_608_o_6288_precious_image_1419866866.png)
If you want to exclude ( and ) from the match you need to add their literals to the regex, which means you need to escape them. There are many ways to do it, like adding \ before each of them, or surrounding each of them with [ ].
Another problem with your regex is that .* finds the longest possible match, and since . represents any character (except line separators) it can also include ( and ). To solve this you can make the * quantifier reluctant by adding ? after it, so your final regex can be written as the string
"url\\((.*?)\\)"
---------------
url
\\( - ( literal
(.*?) - group 1
\\) - ) literal
or, instead of ., you can use a character class that accepts every character except ), like
"url\\(([^)]*)\\)"
Try this regex:
url\((.*?)\)
The outermost parentheses are escaped so they will be matched literally. The inner parentheses are for capturing a group. The question mark after the .* is to make the match lazy, so the first closing parenthesis found will end the group.
Note that to use this regex in Java, you'll have to additionally escape the backslashes in order to express the above regex as a string literal:
String regex = "url\\((.*?)\\)";
You need to escape the () to match the parenthesis in the string, and then add another set of () around the part you want to pull out in group 1, the actual url. I also changed the part inside the parenthesis to [^)]*, which will match everything until it finds a ). See below:
url\(([^)]*)\)
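To see how the greedy, reluctant and negated-class variants from the answers above differ, here is a sketch that runs all three on a string containing two url(...) entries. The class name, helper name and sample input are invented for the demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlGroup {
    // Returns group 1 of the first match, or "" when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String input = "url(a.png) url(b.png)";
        // Greedy: .* runs to the last ) in the input.
        System.out.println(firstGroup("url\\((.*)\\)", input));    // a.png) url(b.png
        // Reluctant: .*? stops at the first ).
        System.out.println(firstGroup("url\\((.*?)\\)", input));   // a.png
        // Negated class: [^)]* can never run past a ).
        System.out.println(firstGroup("url\\(([^)]*)\\)", input)); // a.png
    }
}
```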
I have a string from which I want to remove the brackets.
This is my string: "(name)"
and I want to get "name",
the same thing without the brackets.
I had String s = "(name)";
and I wrote
s = s.replaceAll("(","");
s = s.replaceAll(")","");
and I get an exception for that:
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed group near index 1
(
How do I get rid of the brackets?
Parenthesis characters ( and ) delimit the bounds of a capturing group in a regular expression which is used as the first argument in replaceAll. The characters need to be escaped.
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
Better yet, you could simply place the parentheses in a character class to prevent them from being interpreted as meta-characters:
s = s.replaceAll("[()]","");
s = s.replace("(", "").replace(")", "");
Regex isn't needed here.
If you wanted to use Regex (not sure why you would) you could do something like this:
s = s.replaceAll("\\(", "").replaceAll("\\)", "");
The problem was that ( and ) are meta characters so you need to escape them (assuming you want them to be interpreted as how they appear).
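For comparison, a short sketch of the literal String.replace approach next to the character-class replaceAll approach. Both should give the same result for the string from the question. The class and method names are made up.

```java
public class StripParens {
    // String.replace treats its arguments as plain text, so no escaping is needed.
    static String withReplace(String s) {
        return s.replace("(", "").replace(")", "");
    }

    // String.replaceAll takes a regex; inside [ ] the parentheses are literal.
    static String withCharClass(String s) {
        return s.replaceAll("[()]", "");
    }

    public static void main(String[] args) {
        System.out.println(withReplace("(name)"));   // name
        System.out.println(withCharClass("(name)")); // name
    }
}
```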
String#replaceAll takes a regular expression as its argument.
You are using grouping meta-characters as the regular expression argument. That is why you are getting the error.
Meta-characters are used to group, divide, and perform special operations in patterns.
\ Escape the next meta-character (it becomes a normal/literal character)
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation (‘or’ statement)
() Grouping
[] Custom character class
So use
1. \\( instead of (
2. \\) instead of )
You'll need to escape the brackets like this:
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
You need two backslashes: the regex engine needs to see \( to treat the bracket as a literal bracket (and not as part of the regex syntax), and in a Java string literal the backslash itself must be escaped so that the engine actually receives a backslash.
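The two levels of escaping can be checked directly: the Java literal "\\(" is a two-character string, a backslash followed by a parenthesis, and that is what the regex engine receives. The sketch below uses a hypothetical helper named openParenRegex.

```java
public class EscapeLevels {
    // The Java compiler turns the literal "\\(" into the two characters \ and (.
    static String openParenRegex() {
        return "\\(";
    }

    public static void main(String[] args) {
        String regex = openParenRegex();
        System.out.println(regex);                          // \(
        System.out.println(regex.length());                 // 2
        System.out.println("(name)".replaceAll(regex, "")); // name)
    }
}
```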
You need to escape the ( and the ); they have a special meaning in regular expressions.
Do it like this:
s = s.replaceAll("\\(","");
s = s.replaceAll("\\)","");
s=s.replace("(","").replace(")","");
I want to split a string whenever one of the following symbols is encountered: "+,-,*,/,="
I am using the split function, but it takes only one argument. Moreover, it does not work on "+".
I am using the following code:
Stringname.split("Symbol");
Thanks.
String.split takes a regular expression as argument.
This means you can combine several symbols or sub-patterns in a single argument in order to split your String.
See documentation here.
Here's an example in your case:
String toSplit = "a+b-c*d/e=f";
String[] splitted = toSplit.split("[-+*/=]");
for (String split: splitted) {
System.out.println(split);
}
Output:
a
b
c
d
e
f
Notes:
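Reserved characters for Patterns must normally be escaped with a backslash (written \\ in a Java string literal). That is not needed here, because + and * lose their special meaning inside a character class.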
The [] brackets in the pattern indicate a character class.
More on Patterns here.
You can use a regular expression:
String[] tokens = input.split("[+*/=-]");
Note: - should be placed in first or last position to make sure it is not considered as a range separator.
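The note about the position of - can be illustrated with a sketch like the following. The second pattern, [+-/], is a deliberately wrong variant added here to show the accidental range from + to /, which also covers the comma and the dot. The class and method names are hypothetical.

```java
import java.util.Arrays;

public class HyphenInClass {
    // Hyphen in last position: it is a literal minus sign.
    static String[] splitHyphenLast(String s) {
        return s.split("[+*/=-]");
    }

    // Hyphen between + and /: it defines the range + , - . /
    static String[] splitAccidentalRange(String s) {
        return s.split("[+-/]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitHyphenLast("a+b-c*d/e=f"))); // [a, b, c, d, e, f]
        System.out.println(Arrays.toString(splitAccidentalRange("1.5+2")));  // [1, 5, 2]
    }
}
```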
You need a regular expression. Additionally, you need the regex OR operator:
String[]tokens = Stringname.split("\\+|\\-|\\*|\\/|\\=");
For that, you need to use an appropriate regex. Some of the symbols you listed (+ and *) are reserved in regex, so you'll have to escape them with \; in Java, escaping the other symbols as well is harmless.
A very baseline expression would be \+|\-|\/|\*|\=. It is relatively easy to understand: each symbol you want is escaped with \, and the symbols are separated by the | (or) operator. If, for example, you wanted to add ^ as well, all you would need to do is append |\^ to that expression.
For testing and quick expressions, I like to use www.regexpal.com
I have a list of strings that contains tokens.
Token is:
{ARG:token_name}.
I also have a hash map of tokens, where the key is the token and the value is what I want to substitute for the token.
When I use the "replaceAll" method I get an error:
java.util.regex.PatternSyntaxException: Illegal repetition
My code is something like this:
myStr.replaceAll(valueFromHashMap , "X");
and valueFromHashMap contains { and }.
I get this hashmap as a parameter.
String.replaceAll() works on regexps. {n,m} is usually repetition in regexps.
Try to use \\{ and \\} if you want to match literal brackets.
So replacing all opening brackets with X works like this:
myString.replaceAll("\\{", "X");
See here to read about regular expressions (regexps) and why { and } are special characters that have to be escaped when using regexps.
As others already said, { is a special character used in the pattern (} too).
You have to escape it to avoid any confusion.
Escaping those manually can be dangerous (you might omit one and make your pattern go completely wrong) and tedious (if you have a lot of special characters).
The best way to deal with this is to use Pattern.quote()
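A minimal sketch of how Pattern.quote might be applied to the token from the question. The class name, method names and sample text are assumptions; Matcher.quoteReplacement is added so that a $ or \ in the replacement value is also taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenReplace {
    // Quotes the token so that { and } are not read as a repetition quantifier.
    static String replaceToken(String text, String token, String value) {
        return text.replaceAll(Pattern.quote(token), Matcher.quoteReplacement(value));
    }

    // String.replace does the same literal replacement without any regex.
    static String replaceTokenLiteral(String text, String token, String value) {
        return text.replace(token, value);
    }

    public static void main(String[] args) {
        String text = "Hello {ARG:token_name}!";
        System.out.println(replaceToken(text, "{ARG:token_name}", "X"));        // Hello X!
        System.out.println(replaceTokenLiteral(text, "{ARG:token_name}", "X")); // Hello X!
    }
}
```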
Related issues:
How to escape a square bracket for Pattern compilation
How to escape text for regular expression in Java
Resources:
Oracle.com - JavaSE tutorial - Regular Expressions
replaceAll() takes a regular expression as a parameter, and { is a special character in regular expressions. In order for the regex to treat it as a regular character, it must be escaped by a \, which must be escaped again by another \ in order for Java to accept it. So you must use \\{.
You can remove the curly brackets in one line with .replaceAll() and a character class in square brackets:
String newString = originalString.replaceAll("[{}]", "");
e.g. newString = "ARG:token_name"
If you want to further separate newString into key and value, you can use .split():
String[] arrayString = newString.split(":");
You can then pass arrayString[0] and arrayString[1] to your HashMap's .put().
I need 2 simple reg exps that will:
Match if a string is contained within square brackets ([] e.g [word])
Match if string is contained within double quotes ("" e.g "word")
\[\w+\]
"\w+"
Explanation:
The \[ and \] escape the special bracket characters to match their literals.
The \w means "any word character", usually considered same as alphanumeric or underscore.
The + means one or more of the preceding item.
The " are literal characters.
NOTE: If you want to ensure the whole string matches (not just part of it), prefix with ^ and suffix with $.
And next time, you should be able to answer this yourself, by reading regular-expressions.info
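The difference between matching part of a string and matching the whole string, as described in the NOTE above, can be sketched in Java like this. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class WholeOrPart {
    // Unanchored: true if a bracketed word appears anywhere in the string.
    static boolean containsBracketed(String s) {
        return Pattern.compile("\\[\\w+\\]").matcher(s).find();
    }

    // Anchored with ^ and $: the bracketed word must span the whole string.
    static boolean isBracketed(String s) {
        return Pattern.compile("^\\[\\w+\\]$").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBracketed("see [word] here")); // true
        System.out.println(isBracketed("see [word] here"));       // false
        System.out.println(isBracketed("[word]"));                // true
    }
}
```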
Update:
Ok, so based on your comment, what you appear to be wanting to know is if the first character is [ and the last ] or if the first and last are both " ?
If so, these will match those:
^\[.*\]$ (or ^\\[.*\\]$ in a Java String)
^".*"$ (or ^\".*\"$ in a Java String)
However, unless you need to do some special checking of the characters in between, you can simply do:
if ( MyString.startsWith("[") && MyString.endsWith("]") )
and
if ( MyString.startsWith("\"") && MyString.endsWith("\"") )
which I suspect would be faster than a regex.
Important issues that may make this hard/impossible in a regex:
Can [] be nested (e.g. [foo [bar]])? If so, then a traditional regex cannot help you. Perl's extended regexes can, but it is probably better to write a parser.
Can [, ], or " appear escaped (e.g. "foo said \"bar\"") in the string? If so, see How can I match double-quoted strings with escaped double-quote characters?
Is it possible for there to be more than one instance of these in the string you are matching? If so, you probably want to use the non-greedy quantifier modifier (i.e. ?) to get the smallest string that matches: /(".*?"|\[.*?\])/g
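For the last point, a Java version of the non-greedy pattern might look like the sketch below, which collects every quoted or bracketed piece with Matcher.find(). The class name, method name and sample sentence are invented for illustration, and nesting and escaped delimiters are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllDelimited {
    // Reluctant quantifiers make each match stop at the nearest closing delimiter.
    private static final Pattern DELIMITED = Pattern.compile("\".*?\"|\\[.*?\\]");

    static List<String> findAll(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = DELIMITED.matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("say \"hi\" to [bob] and \"eve\"")); // ["hi", [bob], "eve"]
    }
}
```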
Based on comments, you seem to want to match things like "this is a "long" word"
#!/usr/bin/perl
use strict;
use warnings;
my $s = 'The non-string "this is a crazy "string"" is bad (has own delimiter)';
print $s =~ /^.*?(".*").*?$/, "\n";
Are they two separate expressions?
\[[A-Za-z]+\]
\"[A-Za-z]+\"
If they are in a single expression:
[[\"]+[a-zA-Z]+[]\"]+
Remember that in a .NET verbatim string (@"...") you'll need to escape each double quote " as ""