Splitting string on the basis of keyword - java

I have string
String x="http://www.allindiaflorist.com/imgs/arrangemen4.jpg***http://storyofpakistan.com/wp-content/uploads/2011/11/Rukn-AlaminMultan.jpg***" ;
I want to split the string on *** so that I get an array of size 2.
I am doing this:
String[] explode=a.split("//***");
img1=explode[0]; // throws java.util.regex.PatternSyntaxException
I also tried
String[] explode=a.split("***");
img1=explode[0]; // throws java.util.regex.PatternSyntaxException
I am OK with writing my own generic function to search for ***, but I want to know why split() is not working.
Thanks

Use Pattern#quote:
String[] explode=a.split(Pattern.quote("***"));
Now you don't have to work out which special characters need escaping. The method "returns a literal pattern String for the specified String".
(To clarify: you get the error because * is a regex quantifier with nothing to repeat here, so each * has to be escaped.)
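To make the explanation concrete, here is a minimal sketch. The class and method names are illustrative and the input is a shortened stand-in for the URLs in the question. It checks that "***" is rejected as a regex and that the quoted form splits as intended. Note that split() drops trailing empty strings, which is why the trailing *** does not produce a third element.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LiteralSplitDemo {
    // Split on the delimiter as plain text, whatever regex metacharacters it contains.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    // True when the given text cannot be compiled as a regular expression.
    static boolean isInvalidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String x = "first.jpg***second.jpg***"; // hypothetical stand-in for the question's string
        System.out.println(isInvalidRegex("***"));                   // true
        System.out.println(Arrays.toString(splitLiteral(x, "***"))); // [first.jpg, second.jpg]
    }
}
```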

Use the regex [*]{3}. Try:
String x="htt.....
String arr[] =x.split("[*]{3}");

String str = "http://www.allindiaflorist.com/imgs/arrangemen4.jpg***http://storyofpakistan.com/wp-content/uploads/2011/11/Rukn-AlaminMultan.jpg***";
String delim = "\\*\\*\\*";
String[] arr= str.split(delim);
System.out.println(arr[0]);
System.out.println(arr[1]);
output
http://www.allindiaflorist.com/imgs/arrangemen4.jpg
http://storyofpakistan.com/wp-content/uploads/2011/11/Rukn-AlaminMultan.jpg

You can try this:
String[] explode=a.split("\\Q***\\E");
\Q Start quoting the regex.
\E End quoting the regex.
Basically, between \Q and \E the metacharacter * is treated as a plain character (i.e. *) with no special meaning.

Escape each * using \\:
String[] arr=x.split("\\*\\*\\*");

Try this code. Your regex pattern is wrong: the * should be inside [] so that it is treated as a literal character, followed by {3} so that all three are matched as one delimiter.
public static void main(String args[])
{
String x="http://www.allindiaflorist.com/imgs/arrangemen4.jpg***http://storyofpakistan.com/wp-content/uploads/2011/11/Rukn-AlaminMultan.jpg***" ;
String[] explode=x.split("[*]{3}"); // "[***]" alone would match a single *
String img1=explode[0];
System.out.println(img1);
}
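A character class matches exactly one character, however many times that character is listed inside the brackets, so the quantifier after it decides what counts as one delimiter. The following sketch (illustrative names, shortened input) compares the two forms:

```java
import java.util.Arrays;

class CharClassDemo {
    // Thin wrapper so the different patterns can be compared side by side.
    static String[] split(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String x = "a***b***"; // hypothetical stand-in for the question's string

        // "[***]" is the same class as "[*]": it matches one star at a time,
        // so empty strings appear between consecutive stars.
        System.out.println(split(x, "[***]").length);  // 4
        System.out.println(split(x, "[*]").length);    // 4

        // With {3} the three stars are consumed as a single delimiter.
        System.out.println(Arrays.toString(split(x, "[*]{3}"))); // [a, b]
    }
}
```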

Related

Java split with special characters

I have the code below, which splits a string on <div>\\$\\$PZ\\$\\$</div>, but it does not work because of the special characters.
public class HelloWorld{
public static void main(String []args){
String str = "test<div>\\$\\$PZ\\$\\$</div>test";
String[] arrOfStr = str.split("<div>\\$\\$PZ\\$\\$</div>", 2);
for (String a : arrOfStr)
System.out.println(a);
}
}
The output is test<div>\$\$PZ\$\$</div>test
It works when I remove the special characters.
Can you please help?
As you already know, the parameter to split(...) is a regular expression, so some characters have special meaning. If you want the parameter to be treated literally, i.e. not as a regex, call the Pattern.quote(String s) method.
Example
String str = "test<div>\\$\\$PZ\\$\\$</div>test";
String[] arrOfStr = str.split(Pattern.quote("<div>\\$\\$PZ\\$\\$</div>"), 2);
for (String a : arrOfStr)
System.out.println(a);
Output
test
test
The quote() method simply surrounds the literal text with the regex \Q...\E quotation pattern [1], e.g. your <div>\$\$PZ\$\$</div> text becomes:
\Q<div>\$\$PZ\$\$</div>\E
For fixed text you could just do that yourself, i.e. the following 3 versions all create the same regex to split on:
str.split(Pattern.quote("<div>\\$\\$PZ\\$\\$</div>"), 2)
str.split("\\Q<div>\\$\\$PZ\\$\\$</div>\\E", 2)
str.split("<div>\\\\\\$\\\\\\$PZ\\\\\\$\\\\\\$</div>", 2)
To me, the 3rd one, using \ to escape, is the least readable/desirable version.
If there are many special characters to escape, using \Q...\E is easier than \-escaping each of them separately, but few people use it, so it is not widely known.
The quote() method is especially useful when you need to treat dynamic text literally, e.g. when the text to split on is configurable by the user.
1) quote() will correctly handle literal text containing \E.
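As a small illustration of the note about \E, the sketch below (illustrative names and made-up input) prints what quote() produces for a simple delimiter and then splits on a delimiter that itself contains \E.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class QuoteDemo {
    // Split on the delimiter taken literally, with the same limit argument as String.split.
    static String[] splitLiteral(String input, String delimiter, int limit) {
        return input.split(Pattern.quote(delimiter), limit);
    }

    public static void main(String[] args) {
        // For text without \E, quote() just wraps it in \Q...\E.
        System.out.println(Pattern.quote("***")); // \Q***\E

        // A delimiter made of the four characters a, backslash, E, b.
        String delimiter = "a\\Eb";
        String input = "left" + delimiter + "right";

        // quote() still treats the whole delimiter as literal text.
        System.out.println(Arrays.toString(splitLiteral(input, delimiter, 2))); // [left, right]
    }
}
```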
This:
String str = "test<div>\\$\\$PZ\\$\\$</div>test";
String[] arrOfStr = str.split("<div>\\\\\\$\\\\\\$PZ\\\\\\$\\\\\\$</div>", 2);
for (String a : arrOfStr) {
System.out.println(a);
}
prints:
test
test
EDIT: Why do we need all those backslashes? It's because of how we need to handle String literals representing regex expressions. This page describes the reason with examples. The essence is this:
For a backslash \...
...the pattern to match that would be \\... (to escape the escape)
... but the string literal to create that pattern would have to have one backslash to escape each of the two backslashes: \\\\.
Add to that the original need to also escape the $, that gives us our 6 backslashes in the string representation.
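The layers can be checked directly. This is a minimal sketch with illustrative names: it prints the characters the regex engine actually receives from the six-backslash literal and confirms that they match a backslash followed by a dollar sign.

```java
class BackslashLayersDemo {
    // Java string literal: six backslashes and a dollar sign.
    // After the compiler processes the escapes, the string holds 4 characters: \\\$
    static final String REGEX = "\\\\\\$";

    // The text to find: one backslash followed by a dollar sign (2 characters).
    static final String TEXT = "\\$";

    public static void main(String[] args) {
        System.out.println(REGEX.length());      // 4
        System.out.println(TEXT.length());       // 2
        // In the regex, \\ matches a literal backslash and \$ matches a literal dollar sign.
        System.out.println(TEXT.matches(REGEX)); // true
    }
}
```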

Regex to remove only special characters and not other language letters

I used a regular expression to remove special characters from a name, but it removes every character except English letters.
public static void main(String args[]) {
String name = "Özcan Sevim.";
name = name.replaceAll("[^a-zA-Z\\s]", " ").trim();
System.out.println(name);
}
Output:
zcan Sevim
Expected Output:
Özcan Sevim
I get a bad result because of the way I did it. The right way would be to remove only the special characters (for example, based on ASCII codes) so that other letters are not removed. Can someone help me with a regex that removes only special characters?
You can use \p{IsLatin} or \p{IsAlphabetic}
name = name.replaceAll("[^\\p{IsLatin}]", " ").trim();
Or, to remove only the punctuation, use \p{Punct} like this:
name = name.replaceAll("\\p{Punct}", " ").trim();
Outputs
Özcan Sevim
Take a look at the full "Summary of regular-expression constructs" list and use whichever construct helps you.
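A variation on the same idea, offered as a sketch rather than the answer's own method: \p{L} matches a letter in any script, so negating it together with whitespace keeps names in other alphabets too. The class and method names are illustrative, and the Ö is written as a Unicode escape so the example does not depend on the source file encoding.

```java
class KeepLettersDemo {
    // Replace everything that is neither a Unicode letter nor whitespace with a space, then trim.
    static String stripSpecial(String name) {
        return name.replaceAll("[^\\p{L}\\s]", " ").trim();
    }

    public static void main(String[] args) {
        // U+00D6 is the capital O with diaeresis from the question's example name.
        String name = "\u00D6zcan Sevim.";
        System.out.println(stripSpecial(name));
    }
}
```

The letter survives and only the trailing full stop is removed.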
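You could also use Guava's CharMatcher, which is easier to read and maintain. Note that this particular matcher removes every non-ASCII character, so it would strip the Ö as well: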
name = CharMatcher.ASCII.negate().removeFrom(name);
Use [\W+] or "[^a-zA-Z0-9]" as the regex to match any special character, and use String.replaceAll(regex, String) to replace the special characters with an empty string. Remember that the first argument of String.replaceAll is a regex, so you have to escape a character with a backslash to treat it as a literal character.
String string= "hjdg$h&jk8^i0ssh6";
Pattern pt = Pattern.compile("[^a-zA-Z0-9]");
Matcher match= pt.matcher(string);
while(match.find())
{
String s= match.group();
string=string.replaceAll("\\"+s, "");
}
System.out.println(string);
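Because replaceAll already applies the pattern to every match, the find() loop is not strictly needed. A shorter sketch with the same result for this input (illustrative names):

```java
class StripNonAlnumDemo {
    // Remove every character that is not an ASCII letter or digit in a single call.
    static String keepAsciiAlnum(String s) {
        return s.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepAsciiAlnum("hjdg$h&jk8^i0ssh6")); // hjdghjk8i0ssh6
    }
}
```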

Split string against some characters except the # character

I want to split a string against the following characters
~!@$%^&*()_+=<>,.?/:;"'{}|[]\, \n, \t, space
I tried to use the \\s regex delimiter, but I don't want # to be a split character, so that a string like this is #funny results in this, is, #funny as the resulting values.
I have tried the following, but it doesn't work:
"this is #funny".split("\\s")
Any ideas?
Just list the characters you want inside square brackets, which means "any of these". Escape Java string characters once (like \") and regex special characters twice (like \\[):
@Test
public void testName() throws Exception
{
String[] split = "this is #funny".split("[~!@$%^&*()_+=<>,.?/:;\"'{}|\\[\\]\\\\ \\n\\t]");
for (String string : split)
{
logger.debug(string);
}
}
Use the replaceAll(String regex, String replacement) method of String:
String result = "this is #funny".replaceAll("[~!@$%^&*()_+=<>,.?/:;\"'{}|\\[\\]\\,\\n\\t]", "");
System.out.println(result);
You can try to implement this:
String[] split = "this&is%a#funny^string".split("[^#\\p{Alnum}]|\\s+");
for (String string : split){
System.out.println(string);
}
Also check the Java API documentation for Pattern for more information on regular expressions.
It looks like this will work for you:
String[] split = str.split("[^a-zA-Z&&[^#]]+");
This uses a character class subtraction to split on non-letter chars, except the hash.
Here's some test code:
String str = "this is #funny";
String[] split = str.split("[^a-zA-Z&&[^#]]+");
System.out.println(Arrays.toString(split));
Output:
[this, is, #funny]
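Another way to express "split on everything except letters, digits and #" is a single negated character class. This is a sketch with illustrative names, not one of the patterns given above:

```java
import java.util.Arrays;

class SplitKeepHashDemo {
    // Delimiter: one or more characters that are neither ASCII letters/digits nor '#'.
    static String[] tokens(String s) {
        return s.split("[^\\p{Alnum}#]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("this is #funny")));          // [this, is, #funny]
        System.out.println(Arrays.toString(tokens("this&is%a#funny^string")));  // [this, is, a#funny, string]
    }
}
```

If the input starts with a delimiter, the first element of the result is an empty string, so callers may want to skip empty tokens.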

Split method returning blank array in Java

I'm having a problem with the split function returning a blank array. When any index is called from the array, it throws an index out of bounds exception. Here is the code:
class splitNumber {
public static void main(String[] args) {
String number = "3.84";
String[] sep = number.split(".");
System.out.println(sep[0]);
}
}
Is there any fix or workaround for this? I'm using Java SE 7.
As noted elsewhere, String#split takes a regex. One alternate way to construct that regex is to use Pattern#quote:
String number = "3.84";
String[] sep = number.split(Pattern.quote("."));
System.out.println(Arrays.toString(sep));
This saves you from typing a bunch of tedious escape chars.
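For anyone wondering why the array comes back empty rather than unsplit: as a regex, "." matches every character, so every piece between matches is an empty string, and split() discards trailing empty strings by default. The sketch below (illustrative names) makes that visible by passing a negative limit, which keeps the empty pieces.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DotSplitDemo {
    // Number of pieces produced by String.split with the given regex and limit.
    static int pieces(String s, String regex, int limit) {
        return s.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String number = "3.84";
        System.out.println(pieces(number, ".", 0));   // 0: all pieces are empty and get dropped
        System.out.println(pieces(number, ".", -1));  // 5: the same empty pieces, kept
        System.out.println(Arrays.toString(number.split("\\.")));               // [3, 84]
        System.out.println(Arrays.toString(number.split(Pattern.quote("."))));  // [3, 84]
    }
}
```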
"." is a special character in regex and String's split method accepts regex, which you need to escape like:
String[] sep = number.split("\\.");
Use "\\."
"Note that this takes a regular expression, so remember to escape special characters if necessary, e.g. if you want to split on period . which means "any character" in regex, use either split("\\.") or split(Pattern.quote("."))."
How to split a string in Java

How to replace all numbers in java string

I have strings like this: String s="ram123",d="ram varma656887"
I want strings like ram and ram varma, so how do I separate the text from the combined string?
I am trying to use a regex, but it is not working:
PersonName.setText(cursor.getString(cursor.getColumnIndex(cursor
.getColumnName(1))).replaceAll("[^0-9]+"));
The correct RegEx for selecting all numbers would be just [0-9], you can skip the +, since you use replaceAll.
However, your usage of replaceAll is wrong, it's defined as follows: replaceAll(String regex, String replacement). The correct code in your example would be: replaceAll("[0-9]", "").
You can use \d in the regex to represent digits. The regex you used has a ^ inside the brackets, which makes it match any character other than 0-9.
String s="ram123";
System.out.println(s);
/* You don't need the + because you are using the replaceAll method */
s = s.replaceAll("\\d", ""); // or you can also use [0-9]
System.out.println(s);
To remove the numbers, the following code will do the trick:
stringname.replaceAll("[0-9]","");
Please do as follows
String name = "ram varma656887";
name = name.replaceAll("[0-9]","");
System.out.println(name);//ram varma
Alternatively, you can do this:
String name = "ram varma656887";
name = name.replaceAll("\\d","");
System.out.println(name);//ram varma
Something like the following will also work for you:
String given = "ram varma656887";
String[] arr = given.split("\\d");
String data = new String();
for(String x : arr){
data = data+x;
}
System.out.println(data);//ram varma
I think you missed the second argument of replaceAll. You need to pass an empty string as the second argument instead of leaving it out.
Try:
replaceAll(<your regexp>,"")
You can use the String replaceAll() method.
This method replaces each substring of this string that matches the given regular expression with the given replacement.
Here is the syntax of this method:
public String replaceAll(String regex, String replacement)
Here is the detail of parameters:
regex -- the regular expression to which this string is to be matched.
replacement -- the string which would replace found expression.
Return Value:
This method returns the resulting String.
For your question, use this:
String s = "ram123", d = "ram varma656887";
System.out.println("s" + s.replaceAll("[0-9]", ""));
System.out.println("d" + d.replaceAll("[0-9]", ""));
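Pulling the answers together, here is a minimal sketch with illustrative names. It uses \d+ so that each run of digits is replaced in one step; the result is the same as with \d or [0-9], since the replacement is empty either way.

```java
class RemoveDigitsDemo {
    // Remove every run of digits from the input.
    static String removeDigits(String s) {
        return s.replaceAll("\\d+", "");
    }

    public static void main(String[] args) {
        System.out.println(removeDigits("ram123"));          // ram
        System.out.println(removeDigits("ram varma656887")); // ram varma
    }
}
```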
