Splitting a space separated string - java

String numbers = "5 1 5 1";
So, is it:
String [] splitNumbers = numbers.split();
or:
String [] splitNumbers = numbers.split("\s+");
Looking for: ["5","1","5","1"]
Any idea why neither of the .split lines will work? I tried reading answers about the regex, but I'm not getting anywhere.

In your case, split("\s+"), you need to escape the \ with another backslash:
split("\\s+");
Or
split(" "); can also do it.
Note that split("\\s+") splits on a run of whitespace of any length, including newlines (\n) and tabs (\t), while split(" ") splits on each single space only.
For example, when you have a string separated with two spaces, say "5  1 5 1",
using split("\\s+"); you will get {"5","1","5","1"}
while using split(" "); you will get {"5","","1","5","1"}
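The difference can be checked with a small sketch. The class and method names below are made up for illustration, and the input assumes two spaces after the first 5:

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Splits on runs of any whitespace (spaces, tabs, newlines).
    static String[] splitOnWhitespaceRuns(String s) {
        return s.split("\\s+");
    }

    // Splits on every single space character, so two adjacent
    // spaces produce an empty string between them.
    static String[] splitOnSingleSpace(String s) {
        return s.split(" ");
    }

    public static void main(String[] args) {
        String numbers = "5  1 5 1"; // two spaces after the first 5
        System.out.println(Arrays.toString(splitOnWhitespaceRuns(numbers))); // [5, 1, 5, 1]
        System.out.println(Arrays.toString(splitOnSingleSpace(numbers)));    // [5, , 1, 5, 1]
    }
}
```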

You must escape the regex with an additional \ since \ denotes the escape character:
public static void main(String[] args) {
String numbers = "5 1 5 1";
String[] tokens = numbers.split("\\s+");
for(String token:tokens){
System.out.println(token);
}
}
So the additional \ escapes the next \ which is then treated as the literal \.
When using \\s+ the String will be split on multiple whitespace characters (space, tab, etc).

Splitting a String by number of delimiters

I am trying to split a string into a string array; the input can come in any of several forms.
I tried:
String strExample = "A, B";
//possible option are:
1. A,B
2. A, B
3. A , B
4. A ,B
String[] parts;
parts = strExample.split("/"); //Split the string but doesnt remove the space in between them so the 2 item in the string array is space and B ( B)
parts = strExample.split("/| ");
parts = strExample.split(",|\\s+");
Any guidance would be appreciated
To split with comma enclosed with optional whitespace chars you may use
s.split("\\s*,\\s*")
The \s*,\s* pattern matches
\s* - 0+ whitespaces
, - a comma
\s* - 0+ whitespaces
In case you want to make sure there are no leading/trailing spaces, consider trim()ming the string before splitting.
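As a rough sketch of that advice, the helper below (a hypothetical name, not from the question) trims first and then splits on a comma with optional surrounding whitespace, so all four forms from the question give the same result:

```java
import java.util.Arrays;

class CommaSplitDemo {
    // trim() removes leading/trailing whitespace so the first and last
    // items are clean; the regex then swallows whitespace around each comma.
    static String[] splitOnComma(String s) {
        return s.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String[] samples = {"A,B", "A, B", "A , B", "A ,B", "  A , B  "};
        for (String sample : samples) {
            // Each line prints [A, B]
            System.out.println(Arrays.toString(splitOnComma(sample)));
        }
    }
}
```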
You can use
parts=strExample.split("\\s*,\\s*");
for your case.

Split String end with special characters - Java

I have a string which I want to first split by space, and then separate the words from the special characters.
For Example, let's say the input is:
Hi, How are you???
I already wrote the logic to split by space here:
String input = "Hi, How are you???";
String[] words = input.split("\\s+");
Now, I want to separate each word from the special character.
For example: "Hi," to {"Hi", ","} and "you???" to {"you", "???"}
If the string does not end with any special characters, just ignore it.
Can you please help me with the regular expression and code for this in Java?
Following regex should help you out:
(\s+|[^A-Za-z0-9]+)
This is written as a plain regex, not as a Java string literal, so in Java code each backslash has to be doubled (for example \\s+).
It matches on whitespaces \s+ and on strings of characters consisting not of A-Za-z0-9. This is a workaround, since there isn't (or at least I do not know of) a regex for special characters.
You can test this regex here.
If you use this regex with the split function, it will return only the words, not the special characters and whitespace it matched on.
UPDATE
According to this answer here on SO, Java has \P{Alpha}, which matches any non-alphabetic character. So you could try:
(\s|\P{Alpha})+
I want to separate each word from the special character.
For example: "Hi," to {"Hi", ","} and "you???" to {"you", "???"}
regex to achieve above behavior
String stringToSearch ="Hi, you???";
Pattern p1 = Pattern.compile("[a-z]{0}\\b");
String[] str = p1.split(stringToSearch);
System.out.println(Arrays.asList(str));
output:
[Hi, , , you, ???]
@mike is right... we need to split the sentence on special characters, leaving out the words. Here is the code:
public static void main(String[] args) {
String match = "Hi, How are you???";
String[] words = match.split("\\P{Alpha}+");
for(String word: words) {
System.out.print(word + " ");
}
}
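Splitting on the special characters discards them. If both the words and the special characters should be kept, as the question asks, one possible approach is to match tokens instead of splitting. This is only a sketch: the class and method names are invented here, and it assumes "special character" means anything that is neither alphanumeric nor whitespace:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordPunctuationDemo {
    // A token is either a run of letters/digits, or a run of characters
    // that are neither alphanumeric nor whitespace.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z0-9]+|[^A-Za-z0-9\\s]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [Hi, ,, How, are, you, ???]
        System.out.println(tokenize("Hi, How are you???"));
    }
}
```

Whitespace matches neither alternative, so it is skipped without a separate split step.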

How to split a string by space+ escaping initial spaces in Java?

I have:
String s = "Hello world";
or
String s = " Hello world ";
the result should be:
String[] splited = s.split("REGEX");
splited[0].equals(" Hello"); // true
splited[1].equals("world "); // true
I tried s.trim().split(" +"); but that loses the leading spaces in splited[0], and they should stay.
How can I do it with a regex?
You could combine a negative lookbehind with a positive lookahead
String[] array = s.split("(?<!^\\s*)\\s+(?=\\S)");
(?<!^\\s*) Not preceded by the start of the string + 0 or more whitespaces
(?=\\S) Followed by a non-whitespace character
A limited (to 1000 spaces at the beginning) way:
String[] splited = s.split("(?<!\\A\\s{0,1000})\\s+(?=\\S)");
details:
(?<!\\A\\s{0,1000}) # not preceded by white-spaces at the start of the string
\\s+ # white-spaces
(?=\\S) # followed by a non white-space character
or the same strictly for spaces (not for tabs or newlines...):
String[] splited = s.split("(?<!\\A {0,1000}) +(?=[^ ])");
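A small sketch of the bounded-lookbehind version, with several leading, inner, and trailing spaces in the input. The class and method names are placeholders for illustration:

```java
import java.util.Arrays;

class LeadingSpaceSplitDemo {
    // Splits on whitespace runs that are followed by a non-whitespace
    // character, unless the run is still part of the leading whitespace
    // (the lookbehind is bounded to 1000 characters).
    static String[] splitKeepingEdges(String s) {
        return s.split("(?<!\\A\\s{0,1000})\\s+(?=\\S)");
    }

    public static void main(String[] args) {
        String[] parts = splitKeepingEdges("  Hello   world  ");
        // Brackets make the preserved spaces visible:
        // prints [  Hello] and [world  ]
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
        System.out.println(Arrays.toString(splitKeepingEdges("Hello world"))); // [Hello, world]
    }
}
```

The trailing spaces survive because the lookahead (?=\\S) cannot succeed at the end of the string, so that run is never treated as a delimiter.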

How to replace \ followed by letters in java String?

I need to delete all tokens that are started with \ and followed by any characters.
I created such a pattern:
input.replaceAll("\\[a-zA-Z0-9]*", "");
But it doesn't work because it doesn't delete \rad from string 5 4\rad.
EDIT:
public static void main(String[] args)
{
String input="Wolf 3 3\4par";
String replaceAll = input.replaceAll("\\\\[a-zA-Z0-9]*", "");
System.out.println("replaceAll=" + replaceAll);
}
Thank you!
The \ is special both in string literals and in regular expressions. To put an actual \ in a regular expression, you have to escape it twice. You also have to assign the result somewhere, which it wasn't clear from your question you were doing. So:
input = input.replaceAll("\\\\[a-zA-Z0-9]*", "");
Complete example:
import java.util.*;
public class Temp {
public static void main(String[] args) {
String input = "4 5 \\rad";
input = input.replaceAll("\\\\[a-zA-Z0-9]*", "");
System.out.println(input);
}
}
Output:
4 5
To create \ literal in regex you need to pass \\ to regex engine. But to create \ literal in String you also have to escape it so you need to write it as "\\".
\ literal in regex engine
\\ regex pattern
"\\\\" String representing regex pattern
Right now you are using only one \ in your regex pattern, so the regex engine sees \[ which escapes the [ and turns it into a simple literal.
Try this way
input.replaceAll("\\\\[a-zA-Z0-9]*", "");
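The two escaping levels can be seen in a short sketch. The helper name is hypothetical, and the input is written as "5 4\\rad" in source so that the string really contains a backslash followed by rad:

```java
class BackslashEscapeDemo {
    // Java literal "\\\\" -> regex \\ -> matches one literal backslash,
    // followed by any run of letters or digits.
    static String stripBackslashTokens(String input) {
        return input.replaceAll("\\\\[a-zA-Z0-9]*", "");
    }

    public static void main(String[] args) {
        String input = "5 4\\rad";                       // the text is: 5 4\rad
        System.out.println(input);                       // 5 4\rad
        System.out.println(stripBackslashTokens(input)); // 5 4
    }
}
```

Strings are immutable, so the result of replaceAll has to be used or assigned; the original string is left unchanged.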
From the comments:
Sorry, but my String is exactly 5 4\rad. Indeed how to delete \rad? – Volodymyr Levytskyi
Try
String k= "5 4\rad";
System.out.println(k.replaceAll("\r\\w*", ""));
Output
5 4

Java Regex Help: Splitting String on spaces, "=>", and commas

I need to split a string on any of the following sequences:
1 or more spaces
0 or more spaces, followed by a comma, followed by 0 or more spaces,
0 or more spaces, followed by "=>", followed by 0 or more spaces
Haven't had experience doing Java regexs before, so I'm a little confused. Thanks!
Example:
add r10,r12 => r10
store r10 => r1
Just create regex matching any of your three cases and pass it into split method:
string.split("\\s*(=>|,|\\s)\\s*");
Regex here means literally
Zero or more whitespaces (\\s*)
Arrow, or comma, or whitespace (=>|,|\\s)
Zero or more whitespaces (\\s*)
You can replace whitespace \\s (detects spaces, tabs, line breaks, etc) with plain space character if necessary.
Strictly translated
For simplicity, I'm going to interpret your indication of "space" ( ) as "any whitespace" (\s).
Translating your spec more or less "word for word" means delimiting on any of:
1 or more spaces
\s+
0 or more spaces (\s*), followed by a comma (,), followed by 0 or more spaces (\s*)
\s*,\s*
0 or more spaces (\s*), followed by a "=>" (=>), followed by 0 or more spaces (\s*)
\s*=>\s*
To match any of the above: (\s+|\s*,\s*|\s*=>\s*)
Reduced form
However, your spec can be "reduced" to:
0 or more spaces
\s*
followed by either a space, comma, or "=>"
(\s|,|=>)
followed by 0 or more spaces
\s*
Put it all together: \s*(\s|,|=>)\s*
The reduced form avoids some corner cases in which the strictly translated form produces unexpected empty "matches".
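One such corner case, sketched with an input chosen here for illustration (a comma with a space on each side): the strict form matches the space before the comma on its own via \s+, then matches the comma separately, leaving an empty string between the two matches. The class and method names are hypothetical:

```java
import java.util.Arrays;

class StrictVersusReducedDemo {
    static final String STRICT = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";
    static final String REDUCED = "\\s*(\\s|=>|,)\\s*";

    static String[] splitStrict(String s) {
        return s.split(STRICT);
    }

    static String[] splitReduced(String s) {
        return s.split(REDUCED);
    }

    public static void main(String[] args) {
        String input = "four , five";
        // Strict: the first alternative \s+ wins at the space, so the
        // comma becomes a second delimiter -> [four, , five]
        System.out.println(Arrays.toString(splitStrict(input)));
        // Reduced: the leading \s* absorbs the space before the comma,
        // so the whole " , " is one delimiter -> [four, five]
        System.out.println(Arrays.toString(splitReduced(input)));
    }
}
```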
Code
Here's some code:
import java.util.regex.Pattern;
public class Temp {
// Strictly translated form:
//private static final String REGEX = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";
// "Reduced" form:
private static final String REGEX = "\\s*(\\s|=>|,)\\s*";
private static final String INPUT =
"one two,three=>four , five six => seven,=>";
public static void main(final String[] args) {
final Pattern p = Pattern.compile(REGEX);
final String[] items = p.split(INPUT);
// Shorthand for above:
// final String[] items = INPUT.split(REGEX);
for(final String s : items) {
System.out.println("Match: '"+s+"'");
}
}
}
Output:
Match: 'one'
Match: 'two'
Match: 'three'
Match: 'four'
Match: 'five'
Match: 'six'
Match: 'seven'
String[] splitArray = subjectString.split(" *(,|=>| ) *");
should do it.
