My character string is "!,;,%,#,**,**,(,)", which I get from XML. When I split it on ',', I lose the ',' that is itself one of the characters.
How can I avoid that?
I have already tried changing the comma to 'C', but it does not work.
The result I want is "!,;,%,#,,,(,)", not "!,;,%,#,,(,)".
String::split takes a regex, so you can split with the regex ((?<!,),|,(?!,)) like this:
String string = "!,;,%,#,,,(,)";
String[] split = string.split("((?<!,),|,(?!,))");
Details
(?<!,), match a comma if not preceded by a comma
| or
,(?!,) match a comma if not followed by a comma
Outputs
!
;
%
#
,
(
)
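To try the lookaround split in isolation, here is a minimal sketch. The class name LookaroundSplitDemo and the helper splitKeepingComma are made up for this illustration; only String.split and the regex come from the answer above.

```java
import java.util.Arrays;

public class LookaroundSplitDemo {
    // Hypothetical helper: splits on a comma unless it has a comma on both sides.
    public static String[] splitKeepingComma(String input) {
        return input.split("(?<!,),|,(?!,)");
    }

    public static void main(String[] args) {
        String[] parts = splitKeepingComma("!,;,%,#,,,(,)");
        // The middle comma of ",,," survives as its own element.
        System.out.println(Arrays.toString(parts));
    }
}
```

As far as I can tell, this relies on the literal comma having a delimiter on each side, so a comma as the very first or last item would probably need separate handling.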
If you are trying to extract all characters from a string, you can do so by using String.toCharArray()[1]:
String str = "sample string here";
char[] char_array = str.toCharArray();
If you just want to iterate over the characters in the string, you can use the character array obtained from above method or do so by using a for loop and str.charAt(i)[2] to access the character at position i.
[1] https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#toCharArray()
[2] https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#charAt(int)
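Both ways of walking over the characters can be sketched as follows. CharIterationDemo and countChar are hypothetical names used only for this example; toCharArray() and charAt(int) are the String methods referenced above.

```java
public class CharIterationDemo {
    // Hypothetical helper: counts how often target occurs, using charAt(i).
    public static int countChar(String str, char target) {
        int count = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "sample string here";
        // toCharArray() returns a copy, so changing the array does not change str.
        for (char c : str.toCharArray()) {
            System.out.print(c + " ");
        }
        System.out.println();
        System.out.println(countChar(str, 's'));
    }
}
```

The charAt loop avoids copying the string, which may matter for long inputs; for short strings either approach should be fine.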
Try this; it may be helpful. First I replace the literal ',' with a placeholder string and then split. After the split, I replace the placeholder with ',' again.
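import java.util.Arrays;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String str = "!,;,%,#,**,**,(,)";
        System.out.println(str);
        // Protect the literal comma with a placeholder before splitting.
        str = str.replace("**,**", "**/!/**");
        String[] array = str.split(",");
        // Restore the comma in each piece after the split.
        System.out.println(Arrays.stream(array).map(s -> s.replace("**/!/**", ",")).collect(Collectors.toList()));
    }
}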
Output:
!,;,%,#,**,**,(,)
[!, ;, %, #, ,, (, )]
First, we need to define when the comma is an actual delimiter, and when it is part of a character sequence.
We need to assume that a sequence of commas surrounded by commas is an actual character sequence we want to capture. It can be done with lookarounds:
String s = "!,;,,,%,#,**,**,,,,(,)";
List<String> list = Arrays.asList(s.split(",(?!,)|(?<!,),"));
This regular expression splits on a comma that is either not followed by a comma or not preceded by a comma.
Note that your format string, in which every character sequence is separated by a comma, is a fragile design: you want to allow a comma as a sequence and you also want to allow sequences of more than one character. That means the two can be combined too!
What, for example, if I want to use these two character sequences:
,
,,,,
Then I construct the formatting string like this: ,,,,,,. It is now unclear whether , and ,,,, should be character sequences, or ,, and ,,,.
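The ambiguity can be checked with a few lines of code. This is only a sketch; AmbiguousFormatDemo and its join helper are hypothetical and simply mimic how the format string is built.

```java
public class AmbiguousFormatDemo {
    // Hypothetical helper: joins character sequences with a comma,
    // the same way the format string is assumed to be built.
    public static String join(String... sequences) {
        return String.join(",", sequences);
    }

    public static void main(String[] args) {
        String first = join(",", ",,,,");
        String second = join(",,", ",,,");
        // Both pairs of sequences produce the same six commas,
        // so no parser can tell them apart afterwards.
        System.out.println(first);
        System.out.println(first.equals(second));
    }
}
```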
What I need is to escape each word in a string and escape each special char like: !,?._'#. What I've tried is this:
import java.util.Scanner;
import java.util.regex.Pattern;

public class Solution
{
public static void main(String[] args)
{
Scanner scan = new Scanner(System.in);
Pattern pat = Pattern.compile("[!|,|?|.|_|'|#]");
String a = scan.nextLine();
scan.close();
String[] part = pat.split(a);
System.out.println(part.length);
for(String p: part)
System.out.println(p);
}
}
While this does escape the special characters, I can't manage to find a way to have the regex match the spaces between each word.
Also, I've tried using \s and \\s after the regex.
For input like: The dog is a very lazy dog, isn't he?
output should be:
The
dog
is
a
very
lazy
dog
isn
t
he
[..] is a character class, which describes the set of characters allowed for a single character, not for two characters (we can allow repetition with quantifiers like +, * or {min,max}, but that is not the case here).
Also, you don't need | inside [..], because there it is a literal character, not the OR operator. So [a|b] doesn't mean a OR b; it represents the characters a, | and b (so repeating | as in |c just adds another | and c).
Based on example you provided, you may be looking for:
Pattern pat = Pattern.compile("[!,?._'#\\s]+");
or since this may be more readable
Pattern pat = Pattern.compile("([!,?._'#]|\\s)+");
You would need to use the OR operator | outside of [..] and write \s as "\\s", since \ is also a special character in String literals (it can be used, for instance, to create the tab character \t), so it requires escaping.
I wrapped the entire expression in (..) to create a group that represents all your delimiters. This allowed me to use + (the quantifier for "one or more occurrences"), so now your regex sees ,. as a single delimiter for split. That ensures one split on a whole run of consecutive delimiters, rather than a split on each of them separately. So instead of "a,.b" -> ["a", "", "b"] we now get ["a", "b"].
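Putting the character-class version to work on the sample sentence from the question might look like the sketch below. DelimiterSplitDemo and words are made-up names; the pattern is the one suggested above.

```java
import java.util.regex.Pattern;

public class DelimiterSplitDemo {
    // A run of one or more listed punctuation characters or whitespace
    // counts as a single delimiter.
    private static final Pattern DELIMITERS = Pattern.compile("[!,?._'#\\s]+");

    // Hypothetical helper wrapping Pattern.split.
    public static String[] words(String line) {
        return DELIMITERS.split(line);
    }

    public static void main(String[] args) {
        for (String word : words("The dog is a very lazy dog, isn't he?")) {
            System.out.println(word);
        }
    }
}
```

One thing to watch: if the input starts with a delimiter, split returns an empty first element, so you may want to trim or filter the result.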
I need to add spaces between all punctuation in a string.
// "Hello: World." -> "Hello : World ."
// "It's 9:00?" -> "It ' s 9 : 00 ?"
// "1.B,3.D!" -> "1 . B , 3 . D !"
I think a regex is the way to go, matching all non-punctuation [a-zA-Z\\d]+, adding a space before and/or after, then extracting the remainder matching all punctuation [^a-zA-Z\\d]+.
But I don't know how to (recursively?) call this regex. Looking at the first example, the regex will only match the "Hello". I was thinking of just building a new string by continuously removing and appending the first instance of the matched regex, while the original string is not empty.
private String addSpacesBeforePunctuation(String s) {
StringBuilder builder = new StringBuilder();
final String nonpunctuation = "[a-zA-Z\\d]+";
final String punctuation = "[^a-zA-Z\\d]+";
String found;
while (!s.isEmpty()) {
// regex stuff goes here
found = ???; // found group from respective regex goes here
builder.append(found);
builder.append(" ");
s = s.replaceFirst(found, "");
}
return builder.toString().trim();
}
However this doesn't feel like the right way to go... I think I'm over complicating things...
You can use a lookaround-based regex with the punctuation class \p{Punct} in Java:
str = str.replaceAll("(?<=\\S)(?:(?<=\\p{Punct})|(?=\\p{Punct}))(?=\\S)", " ");
(?<=\\S) asserts that the previous char is not whitespace
(?<=\\p{Punct}) asserts that the previous char is a punctuation char
(?=\\p{Punct}) asserts that the next char is a punctuation char
(?=\\S) asserts that the next char is not whitespace
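Here is a small self-contained sketch that runs this replacement against the three examples from the question. PunctuationSpacingDemo and spaceOut are hypothetical names; the regex is unchanged.

```java
public class PunctuationSpacingDemo {
    // Inserts a space at every position that touches a punctuation char
    // and has non-whitespace on both sides.
    public static String spaceOut(String str) {
        return str.replaceAll(
            "(?<=\\S)(?:(?<=\\p{Punct})|(?=\\p{Punct}))(?=\\S)", " ");
    }

    public static void main(String[] args) {
        System.out.println(spaceOut("Hello: World."));
        System.out.println(spaceOut("It's 9:00?"));
        System.out.println(spaceOut("1.B,3.D!"));
    }
}
```

The pattern matches empty positions only, so nothing is removed from the input; each match is simply filled with one space.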
When you see a punctuation mark, you have four possibilities:
1) Punctuation is surrounded by spaces
2) Punctuation is preceded by a space
3) Punctuation is followed by a space
4) Punctuation is neither preceded nor followed by a space.
Here is code that does the replacement properly:
String ss = s
.replaceAll("(?<=\\S)\\p{Punct}", " $0")
.replaceAll("\\p{Punct}(?=\\S)", "$0 ");
It uses two expressions: one handles case 2 and the other handles case 3. Because they are applied one after the other, together they also take care of case 4. Case 1 requires no change.
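To see the four cases side by side, a minimal sketch follows. TwoPassSpacingDemo and spaceOut are names invented for this example, and the inputs are my own illustrations rather than part of the original answer.

```java
public class TwoPassSpacingDemo {
    public static String spaceOut(String s) {
        return s
            // Pass 1: punctuation with non-whitespace before it gets a space in front.
            .replaceAll("(?<=\\S)\\p{Punct}", " $0")
            // Pass 2: punctuation with non-whitespace after it gets a space behind.
            .replaceAll("\\p{Punct}(?=\\S)", "$0 ");
    }

    public static void main(String[] args) {
        // Spaces on both sides, only before, only after, and on neither side.
        String[] inputs = {"a , b", "a ,b", "a, b", "a,b"};
        for (String input : inputs) {
            System.out.println("\"" + input + "\" -> \"" + spaceOut(input) + "\"");
        }
    }
}
```

All four inputs should end up as "a , b", which is a quick way to convince yourself that the two passes cover every case.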
I want to remove these characters from a String:
+ - ! ( ) { } [ ] ^ ~ : \
I also want to remove these:
/*
*/
&&
||
I mean that I do not want to remove a single & or |; I only want to remove them when the second character follows the first one (/* */ && ||).
How can I do that efficiently and fast in Java?
Example:
a:b+c1|x||c*(?)
will be:
abc1|xc*?
This can be done via a long, but actually very simple regex.
String aString = "a:b+c1|x||c*(?)";
String sanitizedString = aString.replaceAll("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|", "");
System.out.println(sanitizedString);
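Since the question asks about speed, one option is to compile the pattern once and reuse it, because String.replaceAll compiles the regex again on every call. The sketch below assumes that is worthwhile for your workload; SanitizerDemo and sanitize are hypothetical names.

```java
import java.util.regex.Pattern;

public class SanitizerDemo {
    // Compiled once and reused for every call.
    private static final Pattern UNWANTED =
        Pattern.compile("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|");

    // Hypothetical helper: removes the single characters and the
    // two-character tokens /*, */, && and ||.
    public static String sanitize(String input) {
        return UNWANTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("a:b+c1|x||c*(?)"));
    }
}
```

For a one-off call the difference is negligible; the precompiled pattern mainly helps when the method runs many times.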
I think that the java.lang.String.replaceAll(String regex, String replacement) is all you need:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll(java.lang.String, java.lang.String).
There are two ways to do that:
1)
// Tokens to remove. Two-character tokens are listed explicitly, so a single
// &, |, / or * is left alone.
ArrayList<String> arrayList = new ArrayList<String>();
arrayList.add("+");
arrayList.add("-");
arrayList.add("!");
arrayList.add("||");
arrayList.add("&&");
arrayList.add("(");
arrayList.add(")");
arrayList.add("{");
arrayList.add("}");
arrayList.add("[");
arrayList.add("]");
arrayList.add("~");
arrayList.add("^");
arrayList.add(":");
arrayList.add("\\");
arrayList.add("/*");
arrayList.add("*/");
String string = "a:b+c1|x||c*(?)";
for (int i = 0; i < arrayList.size(); i++) {
    // replace() treats its argument literally, so no regex escaping is needed.
    string = string.replace(arrayList.get(i), "");
}
System.out.println(string);
2)
String string = "a:b+c1|x||c*(?)";
string = string.replaceAll("[+\\-!(){}\\[\\]^~:\\\\]|/\\*|\\*/|&&|\\|\\|", "");
System.out.println(string);
Thomas wrote on How to remove special characters from a string?:
That depends on what you define as special characters, but try
replaceAll(...):
String result = yourString.replaceAll("[-+.^:,]","");
Note that the ^ character must not be the first one in the list, since
you'd then either have to escape it or it would mean "any but these
characters".
Another note: the - character needs to be the first or last one on the
list, otherwise you'd have to escape it or it would define a range
(e.g. :-, would mean "all characters in the range : to ,").
So, in order to keep consistency and not depend on character
positioning, you might want to escape all those characters that have a
special meaning in regular expressions (the following list is not
complete, so be aware of other characters like (, {, $ etc.):
String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
If you want to get rid of all punctuation and symbols, try this regex:
[\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape
back slashes: "[\\p{P}\\p{S}]").
A third way could be something like this, if you can exactly define
what should be left in your string:
String result = yourString.replaceAll("[^\\w\\s]","");
Here's less restrictive alternative to the "define allowed characters"
approach, as suggested by Ray:
String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");
The regex matches everything that is not a letter in any language and
not a separator (whitespace, linebreak etc.). Note that you can't use
[\P{L}\P{Z}] (upper case P means not having that property), since that
would mean "everything that is not a letter or not whitespace", which
almost matches everything, since letters are not whitespace and vice
versa.
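To compare the approaches on one input, here is a small sketch. The class and method names are hypothetical, and the sample text is my own; the regexes are a character class combining \p{P} and \p{S}, plus the two "define what may stay" patterns quoted above.

```java
public class CharacterClassDemo {
    // Removes every punctuation char (\p{P}) and symbol (\p{S}).
    public static String stripPunctuationAndSymbols(String s) {
        return s.replaceAll("[\\p{P}\\p{S}]", "");
    }

    // Keeps only \w and \s; by default \w is ASCII-only in Java.
    public static String keepWordCharsAndWhitespace(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // Keeps only letters of any language and separators.
    public static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    public static void main(String[] args) {
        String text = "caf\u00e9, 1+1=2!"; // \u00e9 is the letter e with an acute accent
        System.out.println(stripPunctuationAndSymbols(text));
        System.out.println(keepWordCharsAndWhitespace(text));
        System.out.println(keepLettersAndSeparators(text));
    }
}
```

Note how the \w version drops the accented letter while the \p{L} version keeps it but drops the digits, so the right choice depends on what you count as "special".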
I want to split the string "Good^Evening". I used split, but it does not split the value. Please help me.
This is what I've been trying:
String Val = "Good^Evening";
String[] valArray = Val.split("^");
I'm assuming you did something like:
String[] parts = str.split("^");
That doesn't work because the argument to split is actually a regular expression, where ^ has a special meaning. Try this instead:
String[] parts = str.split("\\^");
The \\ is really equivalent to a single \ (the first \ is required as a Java escape sequence in string literals). It is then a special character in regular expressions which means "use the next character literally, don't interpret its special meaning".
The regex you should use is "\^" which you write as "\\^" as a Java String literal; i.e.
String[] parts = "Good^Evening".split("\\^");
The regex needs a '\' escape because the caret character ('^') is a meta-character in the regex language. The 2nd '\' escape is needed because '\' is an escape in a String literal.
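Both answers boil down to making the caret literal. A minimal sketch, with CaretSplitDemo and its two helpers as made-up names, shows the escaped form next to Pattern.quote, which is an alternative not mentioned above that avoids hand-written escaping.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class CaretSplitDemo {
    // Escapes the caret by hand: the regex is \^, written "\\^" in Java source.
    public static String[] splitEscaped(String s) {
        return s.split("\\^");
    }

    // Lets Pattern.quote turn the delimiter into a literal pattern.
    public static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("^"));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEscaped("Good^Evening")));
        System.out.println(Arrays.toString(splitQuoted("Good^Evening")));
    }
}
```

Pattern.quote is handy when the delimiter comes from a variable and you cannot know in advance whether it contains regex meta-characters.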
Try this, which removes the caret rather than splitting on it:
String str = "Good^Evening";
String newStr = str.replaceAll("[\\^]+", "");