My characters are "!,;,%,#,**,**,(,)", which I get from XML. When I split the string on ',', I lose the ',' element.
How can I avoid this?
I have already tried changing the comma to 'C', but it does not work.
The result I want is "!,;,%,#,,,(,)", not "!,;,%,#,,(,)".
String::split takes a regex, so you can split with the regex ((?<!,),|,(?!,)) like this:
String string = "!,;,%,#,,,(,)";
String[] split = string.split("((?<!,),|,(?!,))");
Details
(?<!,), match a comma if not preceded by a comma
| or
,(?!,) match a comma if not followed by a comma
Outputs
!
;
%
#
,
(
)
If you are trying to extract all the characters from a string, you can use String.toCharArray()[1]:
String str = "sample string here";
char[] charArray = str.toCharArray();
If you just want to iterate over the characters in the string, you can use the character array obtained above, or use a for loop with str.charAt(i)[2] to access the character at position i.
[1] https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#toCharArray()
[2] https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#charAt(int)
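As a minimal sketch of the two access styles mentioned above, the following counts how often a character occurs, once via charAt(i) and once via toCharArray(). The class and method names are made up for illustration.

```java
final class CharIteration {
    // Counts occurrences of target by indexing into the string with charAt(i).
    static int countWithCharAt(String str, char target) {
        int count = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    // Same count, iterating over the array returned by toCharArray().
    static int countWithArray(String str, char target) {
        int count = 0;
        for (char c : str.toCharArray()) {
            if (c == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "!,;,%,#,,,(,)";
        System.out.println(countWithCharAt(str, ',')); // 7
        System.out.println(countWithArray(str, ','));  // 7
    }
}
```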
Try this; it could be helpful. First I replace the marked comma "**,**" with a placeholder string and then split. After the split, I replace the placeholder with ','.
import java.util.Arrays;
import java.util.stream.Collectors;
public static void main(String[] args) {
String str = "!,;,%,#,**,**,(,)";
System.out.println(str);
// Hide the comma that is an element, not a delimiter.
str = str.replace("**,**","**/!/**");
String[] array = str.split(",");
// Turn the placeholder back into a plain comma.
System.out.println(Arrays.stream(array).map(s -> s.replace("**/!/**", ",")).collect(Collectors.toList()));
}
Output:
!,;,%,#,**,**,(,)
[!, ;, %, #, ,, (, )]
First, we need to define when the comma is an actual delimiter, and when it is part of a character sequence.
We need to assume that a sequence of commas surrounded by commas is an actual character sequence we want to capture. It can be done with lookarounds:
String s = "!,;,,,%,#,**,**,,,,(,)";
List<String> list = Arrays.asList(s.split(",(?!,)|(?<!,),"));
This regular expression splits by a comma that is either preceded by something that is not a comma, or followed by something that is not a comma.
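To make the effect of this regex concrete, here is a small runnable sketch that wraps the split from above and prints the result for the sample string. The class and method names are only illustrative.

```java
import java.util.Arrays;
import java.util.List;

final class LookaroundSplit {
    // A comma acts as a delimiter unless it has a comma on both sides.
    static List<String> split(String s) {
        return Arrays.asList(s.split(",(?!,)|(?<!,),"));
    }

    public static void main(String[] args) {
        System.out.println(split("!,;,,,%,#,**,**,,,,(,)"));
        // prints [!, ;, ,, %, #, **, **, ,,, (, )]
    }
}
```

The middle comma of ",,," survives as the element ",", and the two inner commas of ",,,," survive as the element ",,".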
Note that your format, in which character sequences are separated by commas, is a bad design, because it allows a comma to be a sequence and also allows sequences of several characters. That means a sequence can consist of several commas, too!
What, for example, if I want to use these two character sequences:
,
,,,,
Then I construct the formatting string like this: ,,,,,,. It is now unclear whether , and ,,,, should be character sequences, or ,, and ,,,.
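A quick way to see the ambiguity is to feed that string to the lookaround split from this answer. This is only a sketch; the class name is made up.

```java
import java.util.Arrays;

final class AmbiguousSplit {
    // Same regex as above: a comma is a delimiter unless it has a comma on both sides.
    static String[] split(String s) {
        return s.split(",(?!,)|(?<!,),");
    }

    public static void main(String[] args) {
        // Six commas: meant as "," and ",,,," joined by a delimiter comma.
        System.out.println(Arrays.toString(split(",,,,,,")));
        // prints [, ,,,,]
    }
}
```

The result is an empty string followed by ",,,,": the first and last commas are taken as delimiters, so the regex recovers neither of the two intended readings.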
I have a string below which I want to split in String array with multiple delimiters.
The delimiters are comma (,), semicolon (;), "OR" and "AND".
But I do not want to split on a comma if it's in brackets.
Example input:
device_name==device503,device_type!=GATEWAY;site_name<site3434 OR country==India AND location==BLR; new_name=in=(Rajesh,Suresh)
I am able to split the String with regex, but it doesn't handle commas in brackets correctly.
How can I fix this?
Pattern ptn = Pattern.compile("(,|;|OR|AND)");
String[] parts = ptn.split(query);
for(String p:parts){
System.out.println(p);
queryParams.add(p.trim());
}
You could use a negative look-ahead:
String[] parts = input.split(",(?![^()]*\\))|;| OR | AND ");
Or an uglier (but perhaps conceptually simpler) way you could do it would be to replace any commas within brackets with a temporary placeholder, then do the split and replace the placeholders with real commas in the results.
String input = "X,Y=((A,B),C) OR Z";
Pattern pattern = Pattern.compile("\\(.*\\)");
Matcher matcher = pattern.matcher(input);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(sb, matcher.group().replaceAll(",", "_COMMA_"));
}
matcher.appendTail(sb);
String[] parts = sb.toString().split("(,|;| OR | AND )");
for (String part : parts) {
System.out.println(part.replace("_COMMA_", ","));
}
Prints:
X
Y=((A,B),C)
Z
Alternatively, you could write your own little tokenizer that reads the input character-by-character using charAt(index) or define a grammar for an off-the-shelf parser.
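A hand-written tokenizer of that kind could look roughly like the sketch below. It assumes the delimiters are ",", ";", " OR " and " AND " (with surrounding spaces) and that parentheses are balanced; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BracketAwareSplitter {
    // Delimiters are only recognised outside parentheses.
    private static final String[] DELIMITERS = {",", ";", " OR ", " AND "};

    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // current nesting level of parentheses
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            String delimiter = (depth == 0) ? delimiterAt(input, i) : null;
            if (delimiter != null) {
                // End of a token: store it and skip over the delimiter.
                parts.add(current.toString().trim());
                current.setLength(0);
                i += delimiter.length();
            } else {
                current.append(c);
                i++;
            }
        }
        parts.add(current.toString().trim());
        return parts;
    }

    // Returns the delimiter starting at index, or null if there is none.
    private static String delimiterAt(String input, int index) {
        for (String d : DELIMITERS) {
            if (input.startsWith(d, index)) {
                return d;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(split("X,Y=((A,B),C) OR Z"));
        // prints [X, Y=((A,B),C), Z]
    }
}
```

Because it counts nesting depth instead of pattern-matching, this handles nested brackets without a placeholder step.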
You can use negative look-ahead (?!...), which looks at the following characters, and if those characters match the pattern in brackets, the overall match will fail.
String query = "device_name==device503,device_type!=GATEWAY;site_name<site3434 OR country==India AND location==BLR; new_name=in=(Rajesh,Suresh)";
String[] parts = query.split("\\s*(,(?![^()]*\\))|;|OR|AND)\\s*");
for(String part: parts)
System.out.println(part);
Output:
device_name==device503
device_type!=GATEWAY
site_name<site3434
country==India
location==BLR
new_name=in=(Rajesh,Suresh)
So in this case we check whether the characters following the , are 0 or more characters which aren't either ( or ), followed by a ), and if this is true, the , match fails.
This won't work if you can have nested brackets.
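To illustrate the nested-bracket limitation, here is a short sketch with a made-up input in which an inner bracket follows a comma. The class name is illustrative only.

```java
import java.util.Arrays;

final class NestedBracketLimit {
    // Same comma rule as above: do not split if only non-bracket characters
    // separate the comma from the next closing bracket.
    static String[] split(String input) {
        return input.split(",(?![^()]*\\))");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("X=(A,(B,C)),Y")));
        // prints [X=(A, (B,C)), Y]
    }
}
```

The comma after A is inside the outer brackets, but the next bracket after it is an opening one, so the look-ahead does not see a closing bracket and the split happens anyway.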
Note:
String also has a split method (as used above), which is useful for simplicity's sake (but would be slower than reusing the same Pattern over and over again for multiple Strings).
You can add \\s* (0 or more whitespace characters) to your regex to remove any spaces before or after a delimiter.
If you're using | without anything before or after (e.g. "a|b|c"), you don't need to put it in brackets.
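For the first note, reusing a compiled Pattern could look something like this sketch. The class and constant names are assumptions; the regex is the one from this answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class QuerySplitter {
    // Compiled once and reused for every query string.
    private static final Pattern DELIMITER =
            Pattern.compile("\\s*(,(?![^()]*\\))|;|OR|AND)\\s*");

    static String[] split(String query) {
        return DELIMITER.split(query);
    }

    public static void main(String[] args) {
        String query = "country==India AND location==BLR; new_name=in=(Rajesh,Suresh)";
        System.out.println(Arrays.toString(split(query)));
        // prints [country==India, location==BLR, new_name=in=(Rajesh,Suresh)]
    }
}
```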
String str = "9,3,5,*****,1,2,3";
I'd like to access the "5", which sits between two commas right before "*****", and replace only this "5" with another value.
How could I do this in Java?
You can try using the following regex replacement:
String input = "9,3,5,*****,1,2,3";
input = input.replaceAll("[^,]*,\\*{5}", "X,*****");
Here is an explanation of the regex:
[^,]*, match any number of non-comma characters, followed by one comma
\\*{5} followed by five asterisks
This means to match whatever CSV term plus a comma comes before the five asterisks in your string. We then replace this with what you want, along with the five stars in the original pattern.
I'd use a regular expression with a lookahead, to find a string of digits that precedes ",*****", and replace it with the new value. The regular expression you're looking for would be \d+(?=,\*{5}) - that is, one or more digits, with a lookahead consisting of a comma and five asterisks. So you'd write
newString = oldString.replaceAll("\\d+(?=,\\*{5})", "replacement");
Here is an explanation of the regex pattern used in the replacement:
\\d+ match one or more digits, but only when
(?=,\\*{5}) we can lookahead and assert that what follows immediately
is a single comma followed by five asterisks
It is important to note that the lookahead (?=,\\*{5}) asserts but does not consume. Hence, we can ignore it with regards to the replacement.
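Assume the new value is '6':
String str = "9,3,5,*****,1,2,3";
char newChar = '6';
int idx = str.indexOf(",*") - 1; // position of the character just before ",*"
str = str.substring(0, idx) + newChar + str.substring(idx + 1);
Also, if ",*" may be missing or may sit at the very start of the string, idx is negative, so check for that or handle the resulting StringIndexOutOfBoundsException.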
You could split on , and then join with a , (after replacing 5 with the desired value - say X). Like,
String[] arr = "9,3,5,*****,1,2,3".split(",");
arr[2] = "X";
System.out.println(String.join(",", arr));
Which outputs
9,3,X,*****,1,2,3
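If the position of the "*****" field is not known in advance, the same split-and-join idea can look the marker up first. This is a sketch under that assumption; the class, method and parameter names are hypothetical.

```java
final class ReplaceBeforeMarker {
    // Replaces the field immediately before the first field equal to marker.
    static String replaceBefore(String csv, String marker, String replacement) {
        String[] fields = csv.split(",", -1); // -1 keeps trailing empty fields
        for (int i = 1; i < fields.length; i++) {
            if (fields[i].equals(marker)) {
                fields[i - 1] = replacement;
                break;
            }
        }
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        System.out.println(replaceBefore("9,3,5,*****,1,2,3", "*****", "X"));
        // prints 9,3,X,*****,1,2,3
    }
}
```

If the marker is missing or is the first field, the input comes back unchanged.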
You can use split() to get at the value you want to replace:
String str = "9,3,5,*****,1,2,3";
String[] myStrings = str.split(",");
String str1 = myStrings[2];
I have a string which I want to first split by space, and then separate the words from the special characters.
For Example, let's say the input is:
Hi, How are you???
I already wrote the logic to split by space here:
String input = "Hi, How are you???";
String[] words = input.split("\\s+");
Now, I want to separate each word from the special characters.
For example: "Hi," to {"Hi", ","} and "you???" to {"you", "???"}
If the string does not end with any special characters, just ignore it.
Can you please help me with the regular expression and code for this in Java?
Following regex should help you out:
(\s+|[^A-Za-z0-9]+)
This is the plain regex; in a Java string literal you need to escape each backslash, i.e. "(\\s+|[^A-Za-z0-9]+)".
It matches runs of whitespace (\s+) and runs of characters that are not in A-Za-z0-9. This is a workaround, since there isn't (or at least I do not know of) a regex for special characters.
If you use this regex with the split function, it returns only the words, not the special characters and whitespace it matched on.
UPDATE
According to this answer on SO, Java has \P{Alpha}+, which matches a run of non-alphabetic characters. So you could try:
(\s|\P{Alpha})+
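Since split only returns the words, one way to keep the special characters as well is to match tokens instead of splitting. The sketch below is an alternative to the split approach, not part of the answer above; it treats A-Za-z0-9 as word characters and everything else except whitespace as special, and the names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordPunctTokenizer {
    // A token is a run of letters/digits, or a run of other non-space characters.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z0-9]+|[^A-Za-z0-9\\s]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hi, How are you???"));
        // prints [Hi, ,, How, are, you, ???]
    }
}
```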
I want to separate each word from the special character.
For example: "Hi," to {"Hi", ","} and "you???" to {"you", "???"}
A regex that achieves this by splitting at word boundaries:
String stringToSearch ="Hi, you???";
Pattern p1 = Pattern.compile("[a-z]{0}\\b");
String[] str = p1.split(stringToSearch);
System.out.println(Arrays.asList(str));
output:
[Hi, , , you, ???]
@mike is right... we need to split the sentence on the special characters, which leaves only the words. Here is the code:
public static void main(String[] args) {
String match = "Hi, How are you???";
String[] words = match.split("\\P{Alpha}+");
for (String word : words) {
System.out.print(word + " ");
}
}
I wish to take a string input from the user and extract words or numbers like so:
String problem = "I'm lo#o#king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";
String[] solve = {"I'm", "looking", "to", "extract", "all", "6", "substrings"};
Basically, I want to extract numbers and words with complete disregard to punctuation except apostrophes. I know how to get words and strings but I can't seem to figure out this tricky part.
You could do it like this:
String s = "I'm lo#o#king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";
String parts[] = s.replaceAll("[^\\s\\w']|(?<!\\b)'|'(?!\\b)", "").split("\\s+");
System.out.println(Arrays.toString(parts));
Output:
[I'm, looking, to, extract, all, 6, substrings]
Explanation:
[^\\s\\w'] matches any character that is not whitespace, a word character or a single quote.
(?<!\\b)'|'(?!\\b) matches a ' that lacks a word boundary on at least one side, i.e. any ' that does not have a word character both directly before and directly after it.
replaceAll function replaces all the matched characters with an empty string.
Finally we do splitting on the resultant string according to one or more space characters.
I want to remove all Unicode characters and escape characters such as \n and \t. In short, I want just the alphanumeric string.
For example :
\u2029My Actual String\u2029
\nMy Actual String\n
I want to fetch just 'My Actual String'. Is there any way to do so, either by using a built in string method or a Regular Expression ?
Try
String stg = "\u2029My Actual String\u2029 \nMy Actual String";
Pattern pat = Pattern.compile("(?!(\\\\(u|U)\\w{4}|\\s))(\\w)+");
Matcher mat = pat.matcher(stg);
String out = "";
while(mat.find()){
out+=mat.group()+" ";
}
System.out.println(out);
The regex matches runs of word characters and skips Unicode and escape characters.
Output:
My Actual String My Actual String
Try this:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.", "");
to remove escape sequences that appear literally in the text (a backslash followed by u and four hex digits, or a backslash followed by any single character). If you also want to remove all other special characters, use this one:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.|[^a-zA-Z0-9\\s]", "");
(I guess you want to keep the whitespace; if not, remove \\s from the one above.)
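The question does not say whether the input holds real control characters or the escape sequences as literal text, so the following sketch handles both. It assumes only ASCII letters, digits and plain spaces should remain; the class and method names are made up.

```java
final class EscapeStripper {
    static String clean(String s) {
        return s
                // Literal escape text: backslash, u and four hex digits, or backslash plus one character.
                .replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.", "")
                // Everything else that is not an ASCII letter, a digit or a space,
                // including real control characters and U+2029.
                .replaceAll("[^a-zA-Z0-9 ]", "")
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("\nMy Actual String\n"));        // real newlines
        System.out.println(clean("\\u2029My Actual String\\n"));  // literal escape text
        // both print My Actual String
    }
}
```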