Split string against some characters except the # character - java

I want to split a string on the following characters:
~!$%^&*()_+-=<>,.?/:;"'{}|[]\, \n, \t, space
I tried the \\s regex delimiter, but I don't want # to act as a split character, so a string like this is #funny should yield this, is and #funny as the resulting values.
I have tried the following:
"this is #funny".split("\\s")
but it doesn't work. Any ideas?

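Just list the characters you want inside square brackets (a character class), which matches any one of them. Use a single escape for Java string characters (like \") and a double escape for regex special characters (like \\[). Leave # out of the class so it is not treated as a delimiter, and escape the hyphen so it is not read as a range:
@Test
public void testName() throws Exception
{
    // '#' is deliberately absent from the class, so "#funny" stays intact.
    String[] split = "this is #funny".split("[~!$%^&*()_+\\-=<>,.?/:;\"'{}|\\[\\]\\\\ \\n\\t]");
    for (String string : split)
    {
        logger.debug(string);
    }
}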
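If hand-escaping every character feels error-prone, one option is to let Pattern.quote do the escaping. The following is only a sketch: the class name DelimiterSplitDemo, the DELIMITERS constant and the helper methods are made up for illustration, and the delimiter list is assumed to be the one from the question without #.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class DelimiterSplitDemo {
    // Assumed delimiter list: the punctuation from the question, without '#'.
    static final String DELIMITERS = "~!$%^&*()_+-=<>,.?/:;\"'{}|[]\\";

    // Quotes each delimiter literally and joins them into one alternation.
    // \s covers space, tab and newline; the trailing + merges delimiter runs.
    static Pattern buildPattern(String delimiters) {
        String alternation = delimiters.chars()
                .mapToObj(c -> Pattern.quote(String.valueOf((char) c)))
                .collect(Collectors.joining("|"));
        return Pattern.compile("(?:" + alternation + "|\\s)+");
    }

    static String[] split(String input) {
        return buildPattern(DELIMITERS).split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("this is #funny")));
        System.out.println(Arrays.toString(split("a~b\\c, d #e")));
    }
}
```

The first call yields the elements this, is and #funny. If the input can start with a delimiter, expect an empty first element, as with any split.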

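Use the replaceAll(String regex, String replacement) method of String to strip the unwanted characters instead of splitting:
// '#' is left out of the class so that it is kept.
String result = "this is #funny".replaceAll("[~!$%^&*()_+\\-=<>,.?/:;\"'{}|\\[\\]\\\\\\n\\t]", "");
System.out.println(result);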

You can try to implement this:
String[] split = "this&is%a#funny^string".split("[^#\\p{Alnum}]|\\s+");
for (String string : split){
System.out.println(string);
}
Also check the java.util.regex.Pattern API documentation for more information on the supported regex syntax.

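It looks like this will work for you:
String[] split = str.split("[^a-zA-Z#]+");
This uses a negated character class to split on runs of characters that are neither letters nor the hash. (A nested form such as [^a-zA-Z&&[^#]] is best avoided here, because the way a leading ^ combines with an && intersection is not the same in every Java version.)
Here's some test code:
String str = "this is #funny";
String[] split = str.split("[^a-zA-Z#]+");
System.out.println(Arrays.toString(split));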
Output:
[this, is, #funny]
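A different technique from the split-based answers is to describe the tokens you want and collect them with Matcher.find(). This is a sketch under the assumption that a token is an optional # followed by ASCII letters or digits; the class name HashTokenDemo and the tokens method are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashTokenDemo {
    // A token is an optional '#' followed by one or more ASCII letters or digits.
    private static final Pattern TOKEN = Pattern.compile("#?[A-Za-z0-9]+");

    // Collects every token in order; anything between tokens is skipped.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("this is #funny"));
        System.out.println(tokens("  hello, #world!"));
    }
}
```

Because this matches tokens rather than delimiters, leading or repeated delimiters never produce empty strings in the result.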

Related

How to split string using regex without consuming the splitter part?

How would I split a string without consuming the delimiter?
Something like this, except that instead of : I'm using the regex #[a-fA-F0-9]{6}.
String from = "one:two:three";
String[] to = {"one", ":", "two", ":", "three"};
I already tried Apache Commons Lang, since it has StringUtils.splitPreserveAllTokens(), but that does not work with a regex.
EDIT: I guess I should have been more specific, but this is more of what I was looking for.
String string = "Some text here #58a337test #a5fadbtest #123456test as well. "
        + "#58a337Word#a5fadbwith#123456more hex codes.";
String[] parts = string.split("#[a-fA-F0-9]{6}");
/* Desired output: ["Some text here ","#58a337","test ","#a5fadb","test ","#123456","test as well. ",
"#58a337","Word","#a5fadb","with","#123456","more hex codes."] */
EDIT 2: Solution!
final String string = "Some text here #58a337test #a5fadbtest #123456test as "
        + "well. #58a337Word#a5fadbwith#123456more hex codes.";
String[] parts = string.split("(?=#.{6})|(?<=#.{6})");
for(String s: parts) {
System.out.println(s);
}
Output:
Some text here
#58a337
test
#a5fadb
test
#123456
test as well.
#58a337
Word
#a5fadb
with
#123456
more hex codes.
You could use \\b (a word boundary, with the backslash escaped for the Java string) to split in your case:
final String string = "one:two:three";
String[] parts = string.split("\\b");
for(String s: parts) {
System.out.println(s);
}
The answer given by @vrintle (+1) is probably the tightest code that can be written for your exact input. But, assuming you might have other non-word characters in the input besides :, you could also split more precisely using lookarounds:
String from = "one:two:three";
String[] parts = from.split("(?<=:)|(?=:)");
System.out.println(Arrays.toString(parts));
This prints:
[one, :, two, :, three]
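The same lookaround idea can be applied to the hex colour codes from the question. The sketch below uses the stricter #[a-fA-F0-9]{6} inside both lookarounds rather than #.{6}, on the assumption that only real hex codes should act as split points; the class and method names are invented for this example.

```java
import java.util.Arrays;

public class HexCodeSplitDemo {
    // Zero-width split points: just before a hex colour code, or just after one.
    private static final String HEX_BOUNDARY =
            "(?=#[a-fA-F0-9]{6})|(?<=#[a-fA-F0-9]{6})";

    // Splits around the codes without consuming them.
    static String[] splitKeepingCodes(String input) {
        return input.split(HEX_BOUNDARY);
    }

    public static void main(String[] args) {
        String text = "Some text here #58a337test #a5fadbtest";
        System.out.println(Arrays.toString(splitKeepingCodes(text)));
    }
}
```

Java lookbehinds need a bounded length, which the fixed {6} quantifier satisfies. A # that is not followed by six hex digits is left inside its surrounding text.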

Split String end with special characters - Java

I have a string which I want to first split by space, and then separate the words from the special characters.
For Example, let's say the input is:
Hi, How are you???
I already wrote the logic to split by space here:
String input = "Hi, How are you???";
String[] words = input.split("\\s+");
Now, I want to separate each word from the special character.
For example: "Hi," to {"Hi", ","} and "you???" to {"you", "???"}
If the string does not end with any special characters, just ignore it.
Can you please help me with the regular expression and code for this in Java?
Following regex should help you out:
(\s+|[^A-Za-z0-9]+)
This is written as a plain regex, not as a Java string literal, so in Java code you need to escape each backslash (\\s instead of \s).
It matches runs of whitespace (\s+) and runs of characters other than A-Za-z0-9. This is a workaround, since there isn't (or at least I do not know of) a predefined regex class for special characters.
You can test this regex here.
If you use this regex with the split function, it will return only the words, not the special characters and whitespace it matched on.
UPDATE
According to this answer here on SO, Java has \P{Alpha}, which matches any non-alphabetic character. So you could try:
(\s|\P{Alpha})+
I want to separate each word from the special character.
For example: "Hi," to {"Hi", ","} and "you???" to {"you", "???"}
A regex to achieve the above behavior:
String stringToSearch ="Hi, you???";
Pattern p1 = Pattern.compile("[a-z]{0}\\b");
String[] str = p1.split(stringToSearch);
System.out.println(Arrays.asList(str));
output:
[Hi, , , you, ???]
@mike is right: we need to split the sentence on special characters, leaving only the words. Here is the code:
public static void main(String[] args) {
    String match = "Hi, How are you???";
    String[] words = match.split("\\P{Alpha}+");
    for (String word : words) {
        System.out.print(word + " ");
    }
}
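If the special characters should be kept as separate tokens, as the question asks ("Hi," to {"Hi", ","}), one possible approach is to split on whitespace or at the zero-width point between a word and the punctuation that follows it. This is a sketch only; the class name, the BOUNDARY constant and the assumption that a "word" means ASCII letters and digits are mine, not the asker's.

```java
import java.util.Arrays;

public class TrailingPunctuationDemo {
    // Split on whitespace, or at the zero-width point where a letter or digit
    // is directly followed by a non-space special character.
    private static final String BOUNDARY =
            "\\s+|(?<=\\p{Alnum})(?=[^\\p{Alnum}\\s])";

    static String[] tokenize(String input) {
        return input.trim().split(BOUNDARY);
    }

    public static void main(String[] args) {
        // Elements: Hi  ,  How  are  you  ???
        System.out.println(Arrays.toString(tokenize("Hi, How are you???")));
    }
}
```

Words without trailing special characters come through unchanged, and special characters at the start of a word stay attached to it, since the question only asks about the end.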

How to replace all numbers in java string

I have strings like this: String s="ram123",d="ram varma656887"
I want strings like ram and ram varma, so how do I separate the name from the combined string?
I am trying to use a regex, but it is not working:
PersonName.setText(cursor.getString(cursor.getColumnIndex(cursor
.getColumnName(1))).replaceAll("[^0-9]+"));
The correct regex for selecting all digits would be just [0-9]. You can skip the +, since replaceAll replaces every match anyway.
However, your usage of replaceAll is wrong. It is defined as follows: replaceAll(String regex, String replacement). The correct code in your example would be: replaceAll("[0-9]", "").
You can use the regex \d to represent digits. In the regex you used, the ^ negates the character class, so it matches any character other than 0-9.
String s="ram123";
System.out.println(s);
/* You don't need the + because you are using the replaceAll method */
s = s.replaceAll("\\d", ""); // or you can also use [0-9]
System.out.println(s);
To remove the numbers, the following code will do the trick. Strings are immutable, so assign the result:
stringname = stringname.replaceAll("[0-9]","");
Do it as follows:
String name = "ram varma656887";
name = name.replaceAll("[0-9]","");
System.out.println(name);//ram varma
Alternatively, you can use \d:
String name = "ram varma656887";
name = name.replaceAll("\\d","");
System.out.println(name);//ram varma
Something like the following will also work:
String given = "ram varma656887";
String[] arr = given.split("\\d");
String data = new String();
for(String x : arr){
data = data+x;
}
System.out.println(data);//ram varma
I think you missed the second argument of replaceAll. You need to pass an empty string as the second argument instead of leaving it out.
Try:
replaceAll(<your regexp>,"")
You can use the String.replaceAll() method.
This method replaces each substring of this string that matches the given regular expression with the given replacement.
Here is the syntax of this method:
public String replaceAll(String regex, String replacement)
Here is the detail of parameters:
regex -- the regular expression to which this string is to be matched.
replacement -- the string which would replace found expression.
Return Value:
This method returns the resulting String.
For your question, use this:
String s = "ram123", d = "ram varma656887";
System.out.println("s = " + s.replaceAll("[0-9]", ""));
System.out.println("d = " + d.replaceAll("[0-9]", ""));
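To make the behaviour described above concrete, here is a small self-contained sketch. The class name DigitRemovalDemo and the stripDigits helper are invented for this example; the replaceFirst call is included only to contrast it with replaceAll.

```java
public class DigitRemovalDemo {
    // Removes every run of ASCII digits from the input.
    static String stripDigits(String input) {
        return input.replaceAll("[0-9]+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripDigits("ram123"));          // ram
        System.out.println(stripDigits("ram varma656887")); // ram varma
        // replaceFirst removes only the first match, leaving later digits.
        System.out.println("a1b22".replaceFirst("[0-9]+", "")); // ab22
    }
}
```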

How can I split a string by two delimiters?

I know that you can split a string using myString.split("something"), but I do not know how to split a string by two delimiters.
Example:
myString = "abc==abc++abc==bc++abc";
I need something like this:
myString.split("==|++")
What is the regular expression for this?
Use this :
myString.split("(==)|(\\+\\+)")
How I would do it if I had to split using two substrings:
String mainString = "This is a dummy string with both_spaces_and_underscores!";
String delimiter1 = " ";
String delimiter2 = "_";
mainString = mainString.replaceAll(delimiter2, delimiter1);
String[] split_string = mainString.split(delimiter1);
Replace all instances of second delimiter with first and split with first.
Note: using replaceAll allows you to use regexp for delimiter2. So, you should actually replace all matches of delimiter2 with some string that matches delimiter1's regexp.
You can use this:
String myString = "abc==abc++abc==bc++abc";
String[] splitString = myString.split("\\W+");
The regular expression \W+ splits the string on runs of non-word characters.
Try this
String str = "aa==bb++cc";
String[] split = str.split("={2}|\\+{2}");
System.out.println(Arrays.toString(split));
The result is the array
[aa, bb, cc]
The {2} matches exactly two occurrences of the preceding character, that is, either == or ++ (with the + escaped).
The | matches either side.
The backslash is escaped for the Java string literal, so the regex is actually ={2}|\+{2}
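When the delimiters are plain strings rather than patterns, Pattern.quote can take care of the escaping. A minimal sketch, with a hypothetical class name and helper method:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class TwoDelimiterSplitDemo {
    // Quotes both delimiters so characters like '+' are treated literally,
    // then joins them with '|' to split on either one.
    static String[] splitOnEither(String input, String first, String second) {
        String regex = Pattern.quote(first) + "|" + Pattern.quote(second);
        return input.split(regex);
    }

    public static void main(String[] args) {
        String myString = "abc==abc++abc==bc++abc";
        System.out.println(Arrays.toString(splitOnEither(myString, "==", "++")));
    }
}
```

For the example string this gives the elements abc, abc, abc, bc and abc.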

How to split a String array?

The intention is to take the current line (a String that contains commas), remove all white space by replacing it with "", and finally store the split String elements in an array.
Why doesn't this work?
String[] textLine = currentInputLine.replace("\\s", "").split(",");
On regex vs non-regex methods
The String class has the following methods:
Non-regex methods:
String replace(char oldChar, char newChar)
String replace(CharSequence target, CharSequence replacement)
boolean startsWith(String prefix)
boolean endsWith(String suffix)
boolean contains(CharSequence s)
Regex methods:
String replaceAll(String regex, String replacement)
String replaceFirst(String regex, String replacement)
String[] split(String regex)
boolean matches(String regex)
So here we see the immediate cause of your problem: you're using a regex pattern in a non-regex method. Instead of replace, you want to use replaceAll.
Other common pitfalls include:
split(".") (when a literal period is meant); the period must be escaped, as in split("\\.")
matches("pattern") is a whole-string match!
There's no contains("pattern"); use matches(".*pattern.*") instead
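The pitfalls listed above can be checked with a few lines of code. This is just an illustrative sketch; the class and helper names are made up, and Matcher.find() is shown as an alternative to wrapping the pattern in .* on both sides.

```java
import java.util.regex.Pattern;

public class RegexPitfallsDemo {
    // An unescaped dot matches every character, so every piece is empty
    // and split() drops them all as trailing empty strings.
    static String[] splitOnAnyChar(String s) {
        return s.split(".");
    }

    // Escaping the dot makes it a literal period.
    static String[] splitOnPeriod(String s) {
        return s.split("\\.");
    }

    // find() looks for a match anywhere in the string, unlike matches().
    static boolean containsMatch(String s, String regex) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(splitOnAnyChar("a.b.c").length);  // 0
        System.out.println(splitOnPeriod("a.b.c").length);   // 3
        System.out.println("foobar".matches("foo"));         // false
        System.out.println("foobar".matches(".*foo.*"));     // true
        System.out.println(containsMatch("foobar", "o+b"));  // true
    }
}
```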
On Guava's Splitter
Depending on your need, String.replaceAll and split combo may do the job adequately. A more specialized tool for this purpose, however, is Splitter from Guava.
Here's an example to show the difference:
public static void main(String[] args) {
String text = " one, two, , five (three sir!) ";
dump(text.replaceAll("\\s", "").split(","));
// prints "[one] [two] [] [five(threesir!)] "
dump(Splitter.on(",").trimResults().omitEmptyStrings().split(text));
// prints "[one] [two] [five (three sir!)] "
}
static void dump(String... ss) {
dump(Arrays.asList(ss));
}
static void dump(Iterable<String> ss) {
for (String s : ss) {
System.out.printf("[%s] ", s);
}
System.out.println();
}
Note that String.split can not omit empty strings in the beginning/middle of the returned array. It can omit trailing empty strings only. Also note that replaceAll may "trim" spaces excessively. You can make the regex more complicated, so that it only trims around the delimiter, but the Splitter solution is definitely more readable and simpler to use.
Guava also has (among many other wonderful things) a very convenient Joiner.
System.out.println(
Joiner.on("... ").skipNulls().join("Oh", "My", null, "God")
);
// prints "Oh... My... God"
I think you want replaceAll rather than replace.
And replaceAll("\\s","") will remove all spaces, not just the redundant ones. If that's not what you want, you should try replaceAll("\\s+"," ") or something like that.
What you wrote does not match the code:
Intention is to take a current line which contains commas, store trimmed values of all space and store the line into the array.
It seems, from the code, that you want to remove all spaces and split the resulting string at the commas (which is not what you described). That can be done as Paul Tomblin suggested.
String[] currentLineArray = currentInputLine.replaceAll("\\s", "").split(",");
If you want to split at the commas and remove leading and trailing spaces (trim) from the resulting parts, use:
String[] currentLineArray = currentInputLine.trim().split("\\s*,\\s*");
(trim() is needed to remove leading spaces of first part and trailing space from last part)
If you need to perform this operation repeatedly, I'd suggest using java.util.regex.Pattern and java.util.regex.Matcher instead.
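// Compile once, outside the loop, and reuse the Pattern for every input.
final Pattern pattern = Pattern.compile(regex);
final List<String> results = new ArrayList<>();
for (String input : inputs) {
    final Matcher matcher = pattern.matcher(input);
    results.add(matcher.replaceAll(replacementString));
}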
Compiling a regex is a costly operation and using String's replaceAll repeatedly is not recommended, since each invocation involves compilation of regex followed by replacement.
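Applied to the comma-splitting case from the question, a precompiled pattern could look roughly like this. The class name CommaLineSplitter, the COMMA constant and the sample lines are assumptions made for the sake of a runnable example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CommaLineSplitter {
    // Compiled once and reused: a comma with optional surrounding whitespace.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // trim() removes the outer whitespace that the pattern cannot reach.
    static String[] splitLine(String line) {
        return COMMA.split(line.trim());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(" one, two ,three ", "a,b , c");
        for (String line : lines) {
            System.out.println(Arrays.toString(splitLine(line)));
        }
    }
}
```

Pattern.split follows the same rules as String.split, so trailing empty strings are still dropped.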
