Java regex to split a string by using different delimiters - java

Suppose I want to split a string by either space character or the %20 string, how should I write my regex?
I tried the following, but it didn't work.
String regex = "[\\s+, %20]";
String str1 = "abc%20xyz";
String str2 = "abc xyz";
str1.split(regex);
str2.split(regex);
The regex doesn't seem to work on str1.

Use the alternation operator |:
String regex = "(?:\\s+|%20)+";

String regex = "(\\s{1}+|%20{1}+)";

If you want to split by ONE space or ONE "%20", try this:
String regex = "(\\s|%20)";
If you want to split by AT LEAST ONE space or AT LEAST ONE "%20", then try this:
String regex = "(\\s+|(%20)+)";
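To make the difference between the answers and the original attempt concrete, here is a minimal sketch. The class and method names are made up for illustration. It treats any run of whitespace characters and "%20" sequences as one delimiter, and also shows what the character class from the question does to str1.

```java
import java.util.Arrays;

final class SplitOnSpaceOrPercent20 {
    // One or more consecutive delimiters, each either a whitespace character or the literal "%20".
    static final String DELIMITER = "(?:\\s|%20)+";

    static String[] split(String input) {
        return input.split(DELIMITER);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("abc%20xyz")));   // [abc, xyz]
        System.out.println(Arrays.toString(split("abc xyz")));     // [abc, xyz]
        // The character class from the question matches one character at a time,
        // so '%', '2' and '0' each act as a separate delimiter.
        System.out.println(Arrays.toString("abc%20xyz".split("[\\s+, %20]")));  // [abc, , , xyz]
    }
}
```

Inside square brackets the characters are alternatives for a single position, which is why the original pattern produced empty strings between "abc" and "xyz" instead of a clean split.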

Related

Split or tokenise a java String using a substring as delimiter

How to split or tokenise a String in java not based on regex but based on a substring?
String str = "{A={111={i=[a,b,c],ii=[e,f]}, 222={iii=[a,e]}}, B={333={i= [b,c]}}}";
Now I want to tokenise or split the string based on substring "}}," and not regex "}},".
Although the String.split(String regex) function specifies that it takes a regular expression as a parameter, that does not stop you from escaping any special characters and splitting on a literal string.
To escape special characters in a regular expression, you can make use of the Pattern.quote(String s) function, or you can escape the individual characters using backslashes \\:
String escapedStr = Pattern.quote("}},");
String alternativeEscapedStr = "\\}\\},";
For the example you have provided however, you shouldn't need to escape anything:
String str = "{A={111={i=[a,b,c],ii=[e,f]}, 222={iii=[a,e]}}, B={333={i= [b,c]}}}";
String[] splitStr = str.split(Pattern.quote("}},"));
System.out.println(Arrays.toString(splitStr));
String[] splitStr2 = str.split("}},");
System.out.println(Arrays.toString(splitStr2));
Output:
[{A={111={i=[a,b,c],ii=[e,f]}, 222={iii=[a,e],  B={333={i= [b,c]}}}]
[{A={111={i=[a,b,c],ii=[e,f]}, 222={iii=[a,e],  B={333={i= [b,c]}}}]
String str = "{A={111={i=[a,b,c],ii=[e,f]}, 222={iii=[a,e]}}, B={333={i= [b,c]}}}";
String[] split = str.trim().split("}},");
Arrays.stream(split).forEach(s-> System.out.println(s));
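If the goal is to avoid the regex engine entirely, a substring split can also be written by hand with indexOf. The following is only a sketch with hypothetical names (SubstringSplit, split). Note one behavioural difference that is a design choice here rather than something from the answers above: unlike String.split, it keeps trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;

final class SubstringSplit {
    // Splits on a literal delimiter using indexOf, so no regex is involved.
    static List<String> split(String input, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = input.indexOf(delimiter, start)) >= 0) {
            parts.add(input.substring(start, idx));   // text before the delimiter
            start = idx + delimiter.length();         // continue after the delimiter
        }
        parts.add(input.substring(start));            // remainder after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        String str = "{A={111={i=[a,b,c],ii=[e,f]}, 222={iii=[a,e]}}, B={333={i= [b,c]}}}";
        for (String part : split(str, "}},")) {
            System.out.println(part);
        }
    }
}
```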

Parse signed number from string

I have string like:
"-------5548481818fgh7hf8ghf----fgh54f4578"
I don't want to parse using Pattern and Matcher. I have code:
string.replaceAll("regex", "");
How do I write a regex that removes every symbol except a leading "-", to get a string like:
-554848181878544578
You can use this negative lookahead regex:
String s = "-------5548481818fgh7hf8ghf----fgh54f4578";
String r = s.replaceAll("(?!^[-+])\\D+", "");
//=> -554848181878544578
(?!^[-+])\D+ removes every run of non-digits, except a single sign character (- or +) at the very start of the string.
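This also works:
String str = "-------5548481818fgh7hf8ghf----fgh54f4578-";
// Collapse each run of signs to one sign and drop other non-digits, then remove any sign that follows a digit
String tmp = str.replaceAll("([-+])+|([^\\d])", "$1").replaceAll("(?<=\\d)[+-]", "");
System.out.println(tmp);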
Alternative: instead of removing the characters you don't want, grab the characters you do want. Example in JavaScript (this version assumes the number is always negative, since it prepends "-" unconditionally):
s = "-------5548481818fgh7hf8ghf----fgh54f4578"
s = '-' + s.match(/[0-9]+/g).join('')
// "-554848181878544578"
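The same "keep what you want" idea can be sketched in Java without Pattern, Matcher, or any regex at all. The class and method names below are invented for illustration, and the rule "keep a sign only if it is the first character" is an assumption about what the question intends.

```java
final class SignedDigits {
    // Keeps an optional leading sign plus every ASCII digit, dropping everything else.
    static String extract(String input) {
        StringBuilder sb = new StringBuilder();
        if (!input.isEmpty() && (input.charAt(0) == '-' || input.charAt(0) == '+')) {
            sb.append(input.charAt(0));               // keep a single leading sign
        }
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '0' && c <= '9') {
                sb.append(c);                         // keep digits in their original order
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "-------5548481818fgh7hf8ghf----fgh54f4578";
        System.out.println(extract(s));               // -554848181878544578
    }
}
```

For this input the result still fits in a long, so Long.parseLong can parse it; longer digit runs would need BigInteger.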

Eliminating Unicode Characters and Escape Characters from String

I want to remove all Unicode Characters and Escape Characters like (\n, \t) etc. In short I want just alphanumeric string.
For example :
\u2029My Actual String\u2029
\nMy Actual String\n
I want to fetch just 'My Actual String'. Is there any way to do so, either by using a built in string method or a Regular Expression ?
Try
String stg = "\u2029My Actual String\u2029 \nMy Actual String";
Pattern pat = Pattern.compile("(?!(\\\\(u|U)\\w{4}|\\s))(\\w)+");
Matcher mat = pat.matcher(stg);
String out = "";
while(mat.find()){
out+=mat.group()+" ";
}
System.out.println(out);
The regex matches everything except Unicode escapes and escape characters.
Output:
My Actual String My Actual String
Try this:
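anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.", "");
to remove escape sequences that appear literally in the text (a backslash followed by other characters). Unicode escapes use hex digits, so the pattern matches [0-9a-fA-F] rather than \d. If you also want to remove all other special characters, use this one:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.|[^a-zA-Z0-9\\s]", "");
(I assume you want to keep whitespace; if not, remove \\s from the pattern above.)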
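The question leaves open whether the input contains real control and separator characters or escape sequences typed out as literal text (a backslash followed by a letter). The sketch below handles both cases and keeps only ASCII letters, digits, and spaces. The class name, the method name, and the choice to trim the result are assumptions, not something stated in the question.

```java
final class KeepAlphanumeric {
    static String clean(String input) {
        return input
                // Literal escape sequences: backslash-u plus four hex digits, or backslash plus one character
                .replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.", "")
                // Everything that is not an ASCII letter, digit or space (real newlines, tabs, separators)
                .replaceAll("[^A-Za-z0-9 ]", "")
                .trim();
    }

    public static void main(String[] args) {
        String realSeparators = (char) 0x2029 + "My Actual String" + (char) 0x2029;
        String realNewlines = "\nMy Actual String\n";
        String literalEscapes = "\\nMy Actual String\\t";
        System.out.println(clean(realSeparators));   // My Actual String
        System.out.println(clean(realNewlines));     // My Actual String
        System.out.println(clean(literalEscapes));   // My Actual String
    }
}
```

The order of the two replacements matters: removing the backslash alone first would leave the letter of a literal escape sequence behind.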

Manipulating IP addresses - split string on '.' character

String address = "192.168.1.1";
I want to split the address and the delimiter is the point.
So I used this code:
String [] split = address.split(".");
But it didn't work. When I used this code, it worked:
String [] split = address.split("\\.");
So why does splitting on the dot in an IPv4 address have to be written as "\\."?
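You need to escape the "." because split takes a regex, and an unescaped "." matches any character. You also need to escape the backslash itself, because "\." is not a valid escape sequence in a Java string literal:
String[] split = address.split("\\.");
This is because a backslash in a Java string literal starts an escape sequence, so "\\" is needed to produce the single backslash that the regex engine sees.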
Split like this. A small tip: if the split runs often, precompile the regex with Pattern.compile.
String address = "192.168.1.1";
String[] split = address.split("\\."); // could be replaced by a private static final Pattern
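A minimal sketch of the precompiled-pattern tip, with hypothetical class and method names. It also prints what the unescaped version returns, which explains why the first attempt in the question appeared to do nothing.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class Ipv4Split {
    // Compiled once and reused; Pattern.quote(".") would be an equivalent way to get a literal dot.
    private static final Pattern DOT = Pattern.compile("\\.");

    static String[] octets(String address) {
        return DOT.split(address);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(octets("192.168.1.1")));   // [192, 168, 1, 1]
        // Unescaped, "." matches every character, so all pieces are empty and get dropped.
        System.out.println("192.168.1.1".split(".").length);          // 0
    }
}
```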

How can I split a string by two delimiters?

I know that you can split your string using myString.split("something"). But I do not know how I can split a string by two delimiters.
Example:
myString = "abc==abc++abc==bc++abc";
I need something like this:
myString.split("==|++")
What is the regular expression for this?
Use this :
myString.split("(==)|(\\+\\+)")
How I would do it if I had to split using two substrings:
String mainString = "This is a dummy string with both_spaces_and_underscores!";
String delimiter1 = " ";
String delimiter2 = "_";
mainString = mainString.replaceAll(delimiter2, delimiter1);
String[] split_string = mainString.split(delimiter1);
Replace all instances of second delimiter with first and split with first.
Note: using replaceAll allows you to use regexp for delimiter2. So, you should actually replace all matches of delimiter2 with some string that matches delimiter1's regexp.
You can use this:
String myString = "abc==abc++abc==bc++abc";
String[] splitString = myString.split("\\W+");
The regular expression \W+ splits the string on runs of non-word characters. Note that this also splits on any other punctuation or whitespace, not only on == and ++.
Try this
String str = "aa==bb++cc";
String[] split = str.split("={2}|\\+{2}");
System.out.println(Arrays.toString(split));
The answer is an array of
[aa, bb, cc]
The {2} matches exactly two occurrences of the preceding character, that is, either = or + (escaped).
The | matches either side.
The \ is escaped in the Java string, so the regex is actually ={2}|\+{2}
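When the delimiters are plain strings, the escaping can be left to Pattern.quote instead of being done by hand. The helper below is a sketch with invented names (MultiDelimiterSplit, splitOnAny); it quotes each delimiter and joins them with the alternation operator.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class MultiDelimiterSplit {
    // Builds a regex such as \Q==\E|\Q++\E so every delimiter is matched literally.
    static String[] splitOnAny(String input, String... delimiters) {
        String regex = Arrays.stream(delimiters)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return input.split(regex);
    }

    public static void main(String[] args) {
        String myString = "abc==abc++abc==bc++abc";
        System.out.println(Arrays.toString(splitOnAny(myString, "==", "++")));
        // [abc, abc, abc, bc, abc]
    }
}
```

Alternatives are tried left to right, so if one delimiter is a prefix of another, pass the longer one first.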
