Splitting a string in java on more than one symbol - java

I want to split a string whenever one of the symbols +, -, *, /, = is encountered.
I am using the split function, but it takes only one argument. Moreover, it does not work on "+".
I am using the following code:
Stringname.split("Symbol");
Thanks.

String.split takes a regular expression as argument.
This means you can combine several symbols or text patterns in a single parameter in order to split your String.
See documentation here.
Here's an example in your case:
String toSplit = "a+b-c*d/e=f";
String[] splitted = toSplit.split("[-+*/=]");
for (String split: splitted) {
System.out.println(split);
}
Output:
a
b
c
d
e
f
Notes:
Reserved characters for Patterns must be double-escaped with \\. Edit: Not needed here.
The [] brackets in the pattern indicate a character class.
More on Patterns here.

You can use a regular expression:
String[] tokens = input.split("[+*/=-]");
Note: - should be placed in first or last position to make sure it is not considered as a range separator.
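To illustrate the note about hyphen placement, here is a minimal sketch. The class and method names (OperatorSplitDemo, splitOnOperators, compiles) are made up for this example. It assumes the sample input from the earlier answer and shows that a hyphen between + and * is read as a reversed range, which the regex compiler rejects.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class OperatorSplitDemo {
    // Splits on any of + * / = - ; the hyphen is last, so it is a literal.
    static String[] splitOnOperators(String input) {
        return input.split("[+*/=-]");
    }

    // Returns true if the regex compiles, false if it throws PatternSyntaxException.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // prints [a, b, c, d, e, f]
        System.out.println(Arrays.toString(splitOnOperators("a+b-c*d/e=f")));
        // "+-*" is parsed as a range from '+' to '*', which is reversed: prints false
        System.out.println(compiles("[+-*/=]"));
        // Escaping the hyphen also works when it is in the middle: prints true
        System.out.println(compiles("[+\\-*/=]"));
    }
}
```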

You need a regular expression. Additionally, you need the regex OR operator:
String[] tokens = Stringname.split("\\+|\\-|\\*|\\/|\\=");

For that, you need to use an appropriate regex. Most of the symbols you listed are reserved in regex, so you'll have to escape them with \ (written as \\ inside a Java string literal).
A very baseline expression would be \+|\-|\/|\*|\=. It is relatively easy to understand: each symbol you want is escaped with \, and the symbols are separated by the | (or) operator. If, for example, you wanted to add ^ as well, all you would need to do is append |\^ to that expression.
For testing and quick expressions, I like to use www.regexpal.com

Related

Validating a mathematical expression in java

I am trying to validate if the string "expression" as in the code below is a formula.
String expression = request.getParameter(FORMULA);
if(!Pattern.matches("[a-zA-Z0-9+-*/()]", expression)){return new AjaxMessage(AjaxMessage.ResponseStatusEnum.FAILURE, getJsonString(, "Manager.invalid.formula" , null));
}
Example values for expression are {a+b/2, (a+b)*2, (john-Max), etc.}, just for context (the variable names in the formula may vary, and the arithmetic expression contains only the special characters +-/()*). As you can see, I tried to validate using regex (I am new to regex), but I think it's not possible as I don't know the length of the variable names.
Is there a way to achieve a validation using regex or any other library in java?
Thanks in advance.
The reason is that you are using characters with special meaning in regex. You need to escape those characters. I have just modified your regex to make it work.
Code:
List<String> expressions = new ArrayList<String>();
expressions.add("a+b/2");
expressions.add("(a+b)*2");
expressions.add("john-Max");
expressions.add("etc[");
for (String expression : expressions) {
if (!Pattern.matches("[a-zA-Z0-9\\+\\-\\*/\\(\\)]*", expression)) {
System.out.println("NOT match");
} else {
System.out.println("MATCH");
}
}
OUTPUT:
MATCH
MATCH
MATCH
NOT match
You're using a special character in your regex; you need to escape it using \.
It should look like [a-zA-Z0-9+\\-*/()]. This only tests one character, so you need to add a * at the end to test multiple characters.
Edit (thanks Toto): because [] tests a single character, it's called a character class (not like a Java class actually), so only the - is considered special here. For a regex without the brackets, you would need to escape the other special characters.
Special characters have special meaning in regex and won't be interpreted as the character they are (for example, parentheses are used to make groups, * means 0 or more of the previous element, etc.).
About character class: https://docs.oracle.com/javase/tutorial/essential/regex/char_classes.html
More info:
http://www.regular-expressions.info/characters.html and
http://www.regular-expressions.info/refcharacters.html
I use this site to test my regexes (note that regex engines may vary!):
https://regex101.com/
As said in the comments, a mathematical expression is more than just a set of allowed characters, so if you want to validate one, you'll have to do more manual checking.
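As one possible form of that extra checking, the sketch below combines the character-class test with a balanced-parentheses count. The class and method names (FormulaCheckDemo, looksLikeFormula) are hypothetical, and this is only a partial check: it does not verify operator placement, so something like "a++" would still pass.

```java
import java.util.regex.Pattern;

class FormulaCheckDemo {
    // Allowed characters only: letters, digits, + - * / ( ). The hyphen is escaped.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9+\\-*/()]+");

    // Hypothetical helper: character whitelist plus balanced parentheses.
    static boolean looksLikeFormula(String expression) {
        if (!ALLOWED.matcher(expression).matches()) {
            return false;
        }
        int depth = 0;
        for (char c : expression.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth < 0) {
                return false; // closing parenthesis with no matching opener
            }
        }
        return depth == 0; // every opener must have been closed
    }

    public static void main(String[] args) {
        System.out.println(looksLikeFormula("(a+b)*2")); // prints true
        System.out.println(looksLikeFormula("etc["));    // prints false
        System.out.println(looksLikeFormula("(a+b"));    // prints false
    }
}
```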

Why split method does not support $,* etc delimiter to split string

import java.util.StringTokenizer;
class MySplit
{
public static void main(String S[])
{
String settings = "12312$12121";
StringTokenizer splitedArray = new StringTokenizer(settings,"$");
String splitedArray1[] = settings.split("$");
System.out.println(splitedArray1[0]);
while(splitedArray.hasMoreElements())
System.out.println(splitedArray.nextToken().toString());
}
}
In the above example, splitting the string on $ does not work, while splitting on other symbols works fine.
Why is that? If split supports only regular expressions, why does it work for symbols such as :, , and ;?
$ has a special meaning in regex, and since String#split takes a regex as an argument, the $ is not interpreted as the string "$", but as the special meta character $. One sexy solution is:
settings.split(Pattern.quote("$"))
Pattern#quote:
Returns a literal pattern String for the specified String.
... The other solution would be escaping $, by adding \\:
settings.split("\\$")
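Important note: check that the resulting array actually contains elements before indexing into it.
When you do splitedArray1[0], you could get an ArrayIndexOutOfBoundsException if the string consists only of $ delimiters (for example "$"), because trailing empty strings are dropped and the array ends up empty. If the string contains no $ at all, the array simply holds the whole string. I would add: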
if (splitedArray1.length == 0) {
// return or do whatever you want
// except accessing the array
}
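A small sketch of the cases discussed above, assuming the sample string from the question. The class and method names (DollarSplitDemo, splitOnDollar) are invented for illustration. It uses Pattern.quote and prints the array for a normal input, an input without a delimiter, and an input that is only a delimiter.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DollarSplitDemo {
    // Splits on a literal dollar sign by quoting it.
    static String[] splitOnDollar(String s) {
        return s.split(Pattern.quote("$"));
    }

    public static void main(String[] args) {
        // prints [12312, 12121]
        System.out.println(Arrays.toString(splitOnDollar("12312$12121")));
        // No delimiter: the whole string comes back as one element, prints [12312]
        System.out.println(Arrays.toString(splitOnDollar("12312")));
        // Only a delimiter: both pieces are empty and are dropped, prints []
        System.out.println(Arrays.toString(splitOnDollar("$")));
        // Unescaped "$" matches the end of the input, so nothing is split: prints 1
        System.out.println("12312$12121".split("$").length);
    }
}
```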
If you take a look at the Java docs, you can see that the split method takes a regex as its parameter, so you have to write a regular expression, not a simple character.
In regex $ has a specific meaning, so you have to escape it this way:
settings.split("\\$");
The problem is that the split(String str) method expects str to be a valid regular expression. The characters you have mentioned are special characters in regular expression syntax and thus perform a special operation.
To make the regular expression engine take them literally, you would need to escape them like so:
.split("\\$")
Thus given this:
String str = "This is 1st string.$This is the second string";
for(String string : str.split("\\$"))
System.out.println(string);
You end up with this:
This is 1st string.
This is the second string
Dollar symbol $ is a special character in Java regex. You have to escape it so as to get it working like this:
settings.split("\\$");
From the String.split docs:
Splits this string around matches of the given regular expression.
This method works as if by invoking the two-argument split method with
the given expression and a limit argument of zero. Trailing empty
strings are therefore not included in the resulting array.
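The quoted behaviour around trailing empty strings can be seen with a short sketch. The class and method names (TrailingEmptyDemo, dropTrailing, keepTrailing) and the input "a$b$$" are assumptions chosen for the example. It compares the one-argument split with the two-argument form using a negative limit.

```java
import java.util.Arrays;

class TrailingEmptyDemo {
    // One-argument split: behaves like limit 0, trailing empty strings are removed.
    static String[] dropTrailing(String s) {
        return s.split("\\$");
    }

    // Negative limit: the pattern is applied as often as possible and nothing is discarded.
    static String[] keepTrailing(String s) {
        return s.split("\\$", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(dropTrailing("a$b$$"))); // prints [a, b]
        System.out.println(keepTrailing("a$b$$").length);           // prints 4
    }
}
```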
On a side note:
Have a look at the Pattern class, which will give you an idea of which characters you need to escape.
Because $ is a special character in regular expressions: it matches the end of the input (or of a line), not a literal dollar sign.
You should escape it using the escape sequence \$, and in a Java string literal it should be written as \\$.
Hope that helps.
Cheers

How to replace brackets in strings

I have a list of strings that contains tokens.
Token is:
{ARG:token_name}.
I also have hash map of tokens, where key is the token and value is the value I want to substitute the token with.
When I use "replaceAll" method I get error:
java.util.regex.PatternSyntaxException: Illegal repetition
My code is something like this:
myStr.replaceAll(valueFromHashMap , "X");
and valueFromHashMap contains { and }.
I get this hashmap as a parameter.
String.replaceAll() works on regexps. {n,m} is usually repetition in regexps.
Try to use \\{ and \\} if you want to match literal brackets.
So replacing all opening brackets by X works that way:
myString.replaceAll("\\{", "X");
See here to read about regular expressions (regexps) and why { and } are special characters that have to be escaped when using regexps.
As others already said, { is a special character used in the pattern (} too).
You have to escape it to avoid any confusion.
Escaping those manually can be dangerous (you might omit one and make your pattern go completely wrong) and tedious (if you have a lot of special characters).
The best way to deal with this is to use Pattern.quote()
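A minimal sketch of the Pattern.quote() approach applied to the token map from the question. The class name, the method name substitute, and the sample map contents are hypothetical. Matcher.quoteReplacement is added on the assumption that replacement values may contain $ or \, which replaceAll would otherwise treat specially.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenReplaceDemo {
    // Replaces every token (map key) in text with its value, treating both as literal text.
    static String substitute(String text, Map<String, String> tokens) {
        String result = text;
        for (Map.Entry<String, String> entry : tokens.entrySet()) {
            result = result.replaceAll(
                    Pattern.quote(entry.getKey()),                // literal match, so { and } are safe
                    Matcher.quoteReplacement(entry.getValue()));  // literal replacement, so $ and \ are safe
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> tokens = new LinkedHashMap<>();
        tokens.put("{ARG:token_name}", "X");
        // prints: value is X
        System.out.println(substitute("value is {ARG:token_name}", tokens));
    }
}
```

If no regex features are needed at all, String.replace(CharSequence, CharSequence) also replaces every occurrence literally.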
Related issues:
How to escape a square bracket for Pattern compilation
How to escape text for regular expression in Java
Resources:
Oracle.com - JavaSE tutorial - Regular Expressions
replaceAll() takes a regular expression as a parameter, and { is a special character in regular expressions. In order for the regex to treat it as a regular character, it must be escaped by a \, which must be escaped again by another \ in order for Java to accept it. So you must use \\{.
You can remove the curly brackets with .replaceAll() in one line, using a character class in square brackets:
String newString = originalString.replaceAll("[{}]", "");
e.g. newString = "ARG:token_name"
If you want to further separate newString into key and value, you can use .split():
String[] arrayString = newString.split(":");
You can then use arrayString[0] and arrayString[1] with your HashMap's .put().

Regular Expression for matching parentheses

What is the regular expression for matching '(' in a string?
Following is the scenario :
I have a string
str = "abc(efg)";
I want to split the string at '(' using a regular expression. For that I am using
Arrays.asList(Pattern.compile("/(").split(str))
But i am getting the following exception.
java.util.regex.PatternSyntaxException: Unclosed group near index 2
/(
Escaping '(' doesn't seem to work.
Two options:
Firstly, you can escape it using a backslash -- \(
Alternatively, since it's a single character, you can put it in a character class, where it doesn't need to be escaped -- [(]
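Both options can be sketched in Java as follows, using the string from the question. The class and method names are made up for the example. Note that the backslash has to be doubled inside a Java string literal, and that the question used a forward slash (/) where a backslash was needed.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class ParenSplitDemo {
    // Option 1: escape the parenthesis with a backslash ("\\(" in Java source).
    static String[] splitEscaped(String s) {
        return Pattern.compile("\\(").split(s);
    }

    // Option 2: put it in a character class, where it needs no escaping.
    static String[] splitCharClass(String s) {
        return s.split("[(]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.asList(splitEscaped("abc(efg)")));   // prints [abc, efg)]
        System.out.println(Arrays.asList(splitCharClass("abc(efg)"))); // prints [abc, efg)]
    }
}
```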
The solution is a regex pattern that matches opening and closing parentheses:
String str = "Your(String)";
// parameter inside split method is the pattern that matches opened and closed parenthesis,
// that means all characters inside "[ ]" escaping parenthesis with "\\" -> "[\\(\\)]"
String[] parts = str.split("[\\(\\)]");
for (String part : parts) {
// I print first "Your", in the second round trip "String"
System.out.println(part);
}
Writing in Java 8's style, this can be solved in this way:
Arrays.asList("Your(String)".split("[\\(\\)]"))
.forEach(System.out::println);
I hope it is clear.
You can escape any meta-character by using a backslash, so you can match ( with the pattern
\(.
Many languages come with a built-in escaping function, for example, .Net's Regex.Escape or Java's Pattern.quote
Some flavors support \Q and \E, with literal text between them.
Some flavors (VIM, for example) match ( literally, and require \( for capturing groups.
See also: Regular Expression Basic Syntax Reference
For any special characters you should use '\'.
So, for matching parentheses - /\(/
Because ( is special in regex, you should escape it \( when matching. However, depending on what language you are using, you can easily match ( with string methods like index() or other methods that enable you to find at what position the ( is in. Sometimes, there's no need to use regex.
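For the no-regex route mentioned above, a possible Java version uses indexOf and substring. The class and helper names (IndexOfDemo, splitAtFirst) are hypothetical, and the sketch splits only at the first occurrence of the character.

```java
import java.util.Arrays;

class IndexOfDemo {
    // Splits s at the first occurrence of c; returns {s} unchanged if c is absent.
    static String[] splitAtFirst(String s, char c) {
        int pos = s.indexOf(c);
        if (pos < 0) {
            return new String[] { s };
        }
        return new String[] { s.substring(0, pos), s.substring(pos + 1) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtFirst("abc(efg)", '('))); // prints [abc, efg)]
        System.out.println(Arrays.toString(splitAtFirst("abc", '(')));      // prints [abc]
    }
}
```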

regex to find substring between special characters

I am running into this problem in Java.
I have data strings that contain entities enclosed between & and ;, for example:
&Text.ABC;, &Links.InsertSomething;
These entities can be anything from the ini file we have.
I need to find these strings in the input string and remove them. There can be none, one or more occurrences of these entities in the input string.
I am trying to use regex to pattern match and failing.
Can anyone suggest the regex for this problem?
Thanks!
Here is the regex:
"&[A-Za-z]+(\\.[A-Za-z]+)*;"
It starts by matching the character &, followed by one or more letters (both uppercase and lower case) ([A-Za-z]+). Then it matches a dot followed by one or more letters (\\.[A-Za-z]+). There can be any number of this, including zero. Finally, it matches the ; character.
You can use this regex in java like this:
Pattern p = Pattern.compile("&[A-Za-z]+(\\.[A-Za-z]+)*;"); // java.util.regex.Pattern
String subject = "foo &Bar; baz\n";
String result = p.matcher(subject).replaceAll("");
Or just
"foo &Bar; baz\n".replaceAll("&[A-Za-z]+(\\.[A-Za-z]+)*;", "");
If you want to remove whitespaces after the matched tokens, you can use this re:
"&[A-Za-z]+(\\.[A-Za-z]+)*;\\s*" // the "\\s*" matches any number of whitespace
And there is a nice online regular expression tester which uses the java regexp library.
http://www.regexplanet.com/simple/index.html
You can try:
input=input.replaceAll("&[^.]+\\.[^;]+;(,\\s*&[^.]+\\.[^;]+;)*","");
See it
