Split a word by a char in Java [duplicate]

Consider the following example. I would like to divide the String into two parts by the char 'T'
// input
String toDivideStr = "RaT15544";
// output
first= "RaT";
second = "15544";
I've tried this:
String[] first = toDivideStr.split("T",0);
Output:
first = "Ra"
second = "15544"
How do I achieve this?

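What you need to do is locate the last "T", then take the substring after it:
toDivideStr.substring(toDivideStr.lastIndexOf("T") + 1)
This returns the second part ("15544"); toDivideStr.substring(0, toDivideStr.lastIndexOf("T") + 1) returns the first part ("RaT").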
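Putting both substring calls together, a minimal sketch could look like this. The class and method names (SplitAtLastT, splitAfterLast) are made up for illustration, and returning an empty second part when the delimiter is missing is an assumption, not something the question specifies.

```java
class SplitAtLastT {
    // Splits s right after the last occurrence of delimiter, keeping the
    // delimiter in the first part. If the delimiter is absent, the whole
    // string is returned as the first part and the second part is empty
    // (an assumed convention for this sketch).
    static String[] splitAfterLast(String s, String delimiter) {
        int idx = s.lastIndexOf(delimiter);
        if (idx < 0) {
            return new String[] { s, "" };
        }
        int cut = idx + delimiter.length();
        return new String[] { s.substring(0, cut), s.substring(cut) };
    }

    public static void main(String[] args) {
        String[] parts = splitAfterLast("RaT15544", "T");
        System.out.println(parts[0]); // RaT
        System.out.println(parts[1]); // 15544
    }
}
```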

You could use a positive lookahead to assert a digit and a positive lookbehind to assert RaT.
(?<=RaT)(?=\\d)
For example:
String str = "RaT15544";
for (String element : str.split("(?<=RaT)(?=\\d)"))
System.out.println(element);

You can use a positive lookahead together with the split limit parameter for this: (?=\\d)
With only T as the split pattern, the regex engine consumes the T, so neither of the two resulting strings contains it. To avoid consuming characters, we can use a non-consuming lookahead.
(?=\\d) - This matches the position immediately before a digit without consuming the digit. The limit of 2 stops the splitting after the first such position.
public static void main(String[] args) {
String s = "RaT15544";
String[] ss = s.split("(?=\\d)", 2);
System.out.println(ss[0] + " " + ss[1]);
}
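If the split should be tied to the T itself rather than to the first digit, a lookbehind works the same way. This is only a sketch: the class and method names are hypothetical, and it assumes you want to cut after every T (add a limit of 2 to cut only after the first one).

```java
import java.util.Arrays;

class KeepDelimiterSplit {
    // (?<=T) matches the empty position right after a 'T', so the 'T'
    // is not consumed and stays at the end of the preceding part.
    static String[] splitAfterT(String s) {
        return s.split("(?<=T)");
    }

    public static void main(String[] args) {
        // Plain split consumes the delimiter:
        System.out.println(Arrays.toString("RaT15544".split("T"))); // [Ra, 15544]
        // The lookbehind keeps it:
        System.out.println(Arrays.toString(splitAfterT("RaT15544"))); // [RaT, 15544]
    }
}
```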

The regex below can be used to split letters and digits into separate parts.
String StringToD = "RaT15544";
String[] parts = StringToD.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
System.out.println(parts[0]);
System.out.println(parts[1]);

Related

How to split a string every N words [duplicate]

I want to split one big string into smaller parts, so given for example:
"A B C D E F G H I J K L"
I want to get array (String []): [A,B,C,D], [E,F,G,H], [I,J,K,L]
Is there a regex for that, or do I need to do it manually, i.e. first split on every space and then concatenate every N words?
You can create a regex that describes this pattern,
e.g. "((?:\w+\s*){4})"
Or in simple words:
The \w+\s* part means one or more word characters (letters, digits, underscore) followed by zero or more whitespace characters.
It is wrapped in parentheses and followed by {4} to indicate that we want this to occur 4 times.
Finally, the whole thing is wrapped in parentheses again, because we want to capture that result.
By contrast, the inner parentheses, the ones that {4} applies to, start with the (?: prefix, which makes them a non-capturing group. We don't want to capture the individual words just yet.
You can use that pattern in Java to extract each chunk of 4 occurrences.
Then you can simply split each individual result with a second regex, \s+ (one or more whitespace characters).
Edit
One more thing, you may notice that the first matched group also contains whitespace at the end. You can get rid of that with a more advanced regex: ((?:\w+\s+){3}(?:\w+))\s*
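In Java this could be sketched roughly as follows, using the refined pattern with doubled backslashes. The class and method names are invented for the example, and it assumes that leftover words at the end (fewer than four) can simply be dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChunkWords {
    // Three "word + whitespace" repetitions followed by a fourth word.
    // Trailing whitespace is matched outside the capturing group, so it
    // does not end up in group 1.
    private static final Pattern FOUR_WORDS =
            Pattern.compile("((?:\\w+\\s+){3}\\w+)\\s*");

    // Returns one String[] of four words per match; a trailing remainder
    // of fewer than four words is not matched and therefore skipped.
    static List<String[]> chunk(String input) {
        List<String[]> chunks = new ArrayList<>();
        Matcher m = FOUR_WORDS.matcher(input);
        while (m.find()) {
            chunks.add(m.group(1).split("\\s+"));
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String[] c : chunk("A B C D E F G H I J K L")) {
            System.out.println(Arrays.toString(c));
        }
        // [A, B, C, D]
        // [E, F, G, H]
        // [I, J, K, L]
    }
}
```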
You could use regex for this:
e.g.:
String x = "AAS BASD CAFAS DAFASF EASFASF FAFSASF GA HASF IAS JAS KAS LSA";
ArrayList<String> found = new ArrayList<>();
Pattern pattern = Pattern.compile("(\\w+\\s\\w+\\s\\w+\\s\\w+)");
Matcher m = pattern.matcher(x);
while (m.find()) {
String s = m.group();
found.add(s);
}
//if you want to convert your List to an Array
String[] result = found.toArray(new String[0]);
System.out.println(Arrays.toString(result));
Result: [AAS BASD CAFAS DAFASF, EASFASF FAFSASF GA HASF, IAS JAS KAS LSA]
This pattern ("(\\w+\\s\\w+\\s\\w+\\s\\w+)") matches 4 words separated by one space. The loop iterates over every found match and adds it to your result list.
There are multiple ways you can achieve this.
For example, let your string be
String str = "A B C D E F G H I J K L";
One way to split it is with a regular expression:
java.util.Arrays.toString(str.split("(?<=\\G....)"))
Here the .... sets how many characters go into each chunk (four, which could also be written .{4}). Note that this counts characters, spaces included, not words.
Another way is Guava's Splitter, which also splits by character count:
Iterable<String> strArr = Splitter.fixedLength(3).split(str);
There may be more ways to achieve the same.
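For completeness, the manual route mentioned in the question (split on whitespace, then join every N words) needs no special regex. A small sketch, with hypothetical names and the assumption that a shorter last group is acceptable:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GroupEveryN {
    // Splits input on whitespace and joins every n consecutive words
    // with a single space. The last group may hold fewer than n words.
    static List<String> groupWords(String input, int n) {
        List<String> groups = new ArrayList<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return groups; // nothing to group
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i += n) {
            int end = Math.min(i + n, words.length);
            groups.add(String.join(" ", Arrays.copyOfRange(words, i, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(groupWords("A B C D E F G H I J K L", 4));
        // [A B C D, E F G H, I J K L]
    }
}
```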

String Split method not Giving desired results [duplicate]

I am trying to use the split method of String in Java. I have an example like this:
String number = Math.random() * 100 + "";
System.out.println("Number is : " + number);
String[] seprate = number.split(".");
System.out.println(seprate.length);
It should give me an array of 2 elements if the value is e.g. 67.90512897385857,
but it does not give that result. When I also print the second element:
String number = Math.random() * 100 + "";
System.out.println("Number is : " + number);
String[] seprate = number.split(".");
System.out.println(seprate.length);
System.out.println(seprate[1]);
it throws an ArrayIndexOutOfBoundsException.
Can someone explain why this happens?
The String#split method takes a regular expression.
The "." in there means any character.
Escape your "." as such to signal a literal dot: number.split("\\.").
As Pieter De Bie points out, you can use java.util.regex.Pattern to safely escape literals when passing them to an argument that will be interpreted as a regular expression.
In this case, you could use: number.split(Pattern.quote("."))
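To see the difference side by side, here is a short sketch with a fixed number instead of Math.random(). The helper name splitLiteral is made up for the example.

```java
import java.util.regex.Pattern;

class SplitOnDot {
    // Splits on a literal delimiter by quoting it, so regex metacharacters
    // such as "." lose their special meaning.
    static String[] splitLiteral(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String number = "67.90512897385857";
        // Unescaped: every character is a delimiter, all pieces are empty,
        // and trailing empty strings are dropped.
        System.out.println(number.split(".").length);     // 0
        // Escaped: only the literal dot is a delimiter.
        System.out.println(number.split("\\.").length);   // 2
        System.out.println(splitLiteral(number, ".")[1]); // 90512897385857
    }
}
```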
You need to escape the dot. The split method takes a regular expression. From the docs:
Parameters: regex - the delimiting regular expression
String[] seprate = number.split("\\.");
split works with a regex, so you should use it like this:
number.split("\\.")
Pay attention to the documentation:
public String[] split(String regex)
Splits this string around matches of the given regular expression.
In a regular expression, . is any character (except newlines, usually).
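So you are splitting at every character. Every piece between the matches is an empty string, and split discards trailing empty strings, so the resulting array has length 0 and seprate[1] is out of bounds.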
If you want to match only a dot, "\\." will work.
Double f = Math.random() * 100;
String number = String.valueOf(f);
System.out.println("Number is : " + number);
String[] seprate = number.split("\\.");
System.out.println(seprate.length);
See also this related question:
The split() method in Java does not work on a dot (.)

Separate numbers in math expression [duplicate]

I have a math expression stored as a String:
String math = "12+3=15";
I want to separate the string into the following:
int num1 (The first number, 12)
int num2 (The second number, 3)
String operator (+)
int answer (The answer, 15)
(num1 and num2 can be digits between 0-20, and operator can be either +,-,*,/)
What is the easiest way to achieve this? I was thinking about regular expressions, but I'm not sure how to do it.
Now, don't scowl at me.. You asked for the simplest solution :P
public static void main(String[] args) {
String math = "12+3=15";
Pattern p = Pattern.compile("(\\d+)(.)(\\d+)=(\\d+)");
Matcher m = p.matcher(math);
while (m.find()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
System.out.println(m.group(4));
}
}
O/P :
12
+
3
15
EDIT : (\\d+)(.)(\\d+)=(\\d+) -->
\\d+ matches one or more digits.
. matches any single character (so it also accepts characters that are not operators).
() --> captures whatever is inside it
(\\d+)(.)(\\d+)=(\\d+) --> captures one or more digits, then any single character (+, -, * etc.), then one or more digits again, matches the "=" without capturing it, and then captures the final digits.
Captured groups are numbered from 1 to n; group 0 represents the entire match.
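If you want the operator restricted to the four allowed symbols and the numbers as int, a slightly stricter variant might look like this. It is a sketch only: the class and method names are hypothetical, and returning null for input that does not match is an assumed convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParseEquation {
    // [-+*/] accepts only the four operators; the hyphen comes first in
    // the character class so it is taken literally.
    private static final Pattern EQUATION =
            Pattern.compile("(\\d+)([-+*/])(\\d+)=(\\d+)");

    // Returns {num1, operator, num2, answer} as strings, or null when the
    // whole input does not have the expected shape.
    static String[] parse(String math) {
        Matcher m = EQUATION.matcher(math);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] parts = parse("12+3=15");
        int num1 = Integer.parseInt(parts[0]);
        String operator = parts[1];
        int num2 = Integer.parseInt(parts[2]);
        int answer = Integer.parseInt(parts[3]);
        System.out.println(num1 + " " + operator + " " + num2 + " = " + answer);
        // 12 + 3 = 15
    }
}
```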
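\\b(\\d+)\\b|([+*\/-])
You can simply do this and grab the captures. See demo.
https://regex101.com/r/wU7sQ0/30
Or simply split by \\b. See demo.
https://regex101.com/r/wU7sQ0/31
The JavaScript snippet from the demo (with "=" added to the operator class so that it is captured too):
var re = /\b(\d+)\b|([+*\/=-])/gm;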
var str = '12+3=15';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
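Since the question is about Java, here is a rough Java version of the "split by \\b" idea. The names are made up, and the output assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;

class SplitOnWordBoundary {
    // \b matches between a digit (a word character) and an operator
    // (a non-word character), so numbers and symbols land in separate tokens.
    static String[] tokens(String math) {
        return math.split("\\b");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("12+3=15")));
        // [12, +, 3, =, 15]
    }
}
```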

Split a string in Java based on a separator unless separator is escaped with String.split [duplicate]

I got a string in java that I would like to split in parts on following criteria:
the '#' char is a separator
if '#' is escaped via backslash then it should not be considered a separator
i.e.
"abc#xyz#kml\#ijk"
should be split into
"abc", "xyz", "kml\#ijk"
I can do it easily with StringTokenizer and add some logic for the escape char but I would like to get it via one-liner String.split call with the correct regex. So far my "best" attempt is following:
public static void main(String[] args) {
String toSplit = "abc#xyz#kml\\#ijk";
String[] arr = toSplit.split("[^\\\\]#");
System.out.println(Arrays.toString(arr));
}
and the result is:
[ab, xy, kml#ijk]
The last letter of the first two parts is cut out.
Any idea how to avoid that?
Have you looked into lookbehinds?
public static void main(String[] args) {
String toSplit = "abc#xyz#kml\\#ijk";
String[] arr = toSplit.split("(?<!\\\\)#");
System.out.println(Arrays.toString(arr));
}
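The reason this works where [^\\\\]# does not: the character class consumes the character before the '#', while the negative lookbehind (?<!\\\\) only checks that the previous character is not a backslash. A small sketch with a made-up helper name; note it assumes a backslash always escapes the '#' that follows, so an escaped backslash right before a separator is not handled.

```java
import java.util.Arrays;

class SplitUnescapedHash {
    // Splits on '#' unless it is directly preceded by a backslash.
    // The lookbehind is zero-width, so no neighbouring character is lost.
    static String[] splitUnescaped(String s) {
        return s.split("(?<!\\\\)#");
    }

    public static void main(String[] args) {
        String toSplit = "abc#xyz#kml\\#ijk"; // the text abc#xyz#kml\#ijk
        System.out.println(Arrays.toString(splitUnescaped(toSplit)));
        // [abc, xyz, kml\#ijk]
    }
}
```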

Java Regular expressions issue - Can't match two strings in the same line [duplicate]

I am having some problems with Java regular expressions.
I have a program that reads through an HTML file and replaces any string enclosed in #VR# markers, e.g. #VR#Test1 2 3 4#VR#
My issue is that if a line contains more than one string surrounded by #VR#, they are not matched separately. The pattern matches from the leftmost #VR# to the rightmost #VR# in the line and takes everything in between.
For example:
#VR#Google#VR#
My code would match
URL-GOES-HERE#VR#" target="_blank" style="color:#f4f3f1; text-decoration:none;" title="ContactUs">#VR#Google
Here is my Java code. Would appreciate if you could help me to solve this:
Pattern p = Pattern.compile("#VR#.*#VR#");
Matcher m;
Scanner scanner = new Scanner(htmlContent);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
m = p.matcher(line);
StringBuffer sb = new StringBuffer();
while (m.find()) {
String match_found = m.group().replaceAll("#VR#", "");
System.out.println("group: " + match_found);
}
}
I tried replacing m.group() with m.group(0) and m.group(1), but that did not help. Also m.groupCount() always returns zero, even if there are two matches as in my example above.
Thanks, your help will be very much appreciated.
Your problem is that .* is "greedy"; it will try to match as long a substring as possible while still letting the overall expression match. So, for example, in #VR# 1 #VR# 2 #VR# 3 #VR#, it will match 1 #VR# 2 #VR# 3.
The simplest fix is to make it "non-greedy" (matching as little as possible while still letting the expression match), by changing the * to *?:
Pattern p = Pattern.compile("#VR#.*?#VR#");
"Also m.groupCount() always returns zero, even if there are two matches as in my example above."
That's because m.groupCount() returns the number of capture groups (parenthesized subexpressions, whose corresponding matched substrings are retrieved using m.group(1), m.group(2), and so on) in the underlying pattern, not the number of matches. In your case, your pattern has no capture groups, so m.groupCount() returns 0.
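Combining both points, one option is to add a capture group around the lazy part so the inner text comes back directly, without the replaceAll step. This is just a sketch with invented names and a made-up input line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractVr {
    // .*? is lazy, so each match stops at the nearest closing #VR#.
    // The parentheses make the inner text available as group 1.
    private static final Pattern VR = Pattern.compile("#VR#(.*?)#VR#");

    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = VR.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "a #VR#one#VR# b #VR#two#VR# c";
        System.out.println(extract(line));                  // [one, two]
        System.out.println(VR.matcher(line).groupCount());  // 1
    }
}
```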
You can try the regular expression:
#VR#(((?!#VR#).)+)#VR#
Demo:
private static final Pattern REGEX_PATTERN =
Pattern.compile("#VR#(((?!#VR#).)+)#VR#");
public static void main(String[] args) {
String input = "#VR#Google#VR# ";
System.out.println(
REGEX_PATTERN.matcher(input).replaceAll("$1")
); // prints "Google "
}
