I have this text tokenized as follows:
∅habbaz∅abdelkrim∅habbaz∅abdelkrim∅habbaz∅abdelkrim
I want to get every string between the character ∅. I have tried the following:
ArrayList<String> ta = new ArrayList<>();
String test=t2.getText();
String str = test;
Pattern pattern = Pattern.compile("∅(.*?)∅");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
ta.add(matcher.group(1));
}
t3.setText(ta.toString());
It's supposed to give me:
[habbaz, abdelkrim, habbaz, abdelkrim, habbaz, abdelkrim]
But it's giving me only:
[habbaz, habbaz, habbaz]
If you want to go with the regex solution, try this:
Pattern pattern = Pattern.compile("∅([^∅]*)");
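This pattern matches a ∅ followed by any number of non-∅ characters, which should do the trick. Unlike ∅(.*?)∅, it does not consume the closing ∅, so the next match can start at that delimiter and no token is skipped.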
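For reference, here is a minimal, self-contained sketch of the whole loop using that pattern. The class name DelimitedTokens and the method name extract are made up for illustration; in the question the input would come from t2.getText().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class DelimitedTokens {
    // Collects every run of non-delimiter characters that follows a '∅'.
    static List<String> extract(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = Pattern.compile("∅([^∅]*)").matcher(text);
        while (matcher.find()) {
            // group(1) is the text after the delimiter, up to the next one
            tokens.add(matcher.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(extract("∅habbaz∅abdelkrim∅habbaz∅abdelkrim∅habbaz∅abdelkrim"));
    }
}
```

Because the closing delimiter is never consumed, each find() call resumes exactly at the next ∅.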
Use split:
String input = "∅habbaz∅abdelkrim∅habbaz∅abdelkrim∅habbaz∅abdelkrim";
String[] tokens = input.split("∅");
This will produce an array of those strings that are between your delimiter. Note that the first string in the array will be "", the empty string, because your input string starts with the delimiter ∅. To avoid this, take a substring of the input right before you split (if (input.startsWith("∅")) {input = input.substring(1);}), or process the resulting tokens to exclude any empty strings.
To turn the tokens into your ArrayList, use the following:
ArrayList<String> ta = new ArrayList<>(Arrays.asList(tokens));
Or you could just write:
List<String> ta = Arrays.asList(input.split("∅"));
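If you prefer the "exclude any empty strings" route, something along these lines should work. This is only a sketch: the class SplitTokens and the method splitNonEmpty are hypothetical names, and Pattern.quote is used so the delimiter is treated literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class SplitTokens {
    // Splits on a literal delimiter and drops empty tokens,
    // such as the one produced by a leading delimiter.
    static List<String> splitNonEmpty(String input, String delimiter) {
        List<String> result = new ArrayList<>();
        for (String token : input.split(Pattern.quote(delimiter))) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitNonEmpty("∅habbaz∅abdelkrim∅habbaz∅abdelkrim", "∅"));
    }
}
```

The result is a regular, resizable ArrayList, whereas the list returned by Arrays.asList is fixed-size.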
Related
I can't seem to find an answer for this one. What I want to do is to split a string in Java, but I want to keep the delimiters inside each string. For example, if I had the following string:
word1{word2}[word3](word4)"word5"'word6'
The array of new strings would have to be something like this:
["word1", "{word2}", "[word3]", "(word4)", "\"word5\"", "\'word6\'"]
How can I achieve this with regex or in some other way? I'm still learning regex in Java, so I tried a few things, for example the approach discussed here: How to split a string, but also keep the delimiters?
but I'm not getting the results I expect.
I have this delimiter:
static public final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";
And then this method:
private String[] splitLine() {
    return tokenFactor.split(String.format(WITH_DELIMITER, "\\(|\\)|\\[|\\]|\\{|\\}|\"|\'"));
}
But that code splits the delimiters as individual strings, which is not what I want
Can anyone please help me?!! Thanks!
A solution using Pattern and regex:
The pattern matches either a bare word, or a word with one delimiter character immediately before and after it:
String str = "word1{word2}[word3](word4)\"word5\"'word6'";
Matcher m = Pattern.compile("(([{\\[(\"']\\w+[}\\])\"'])|(\\w+))").matcher(str);
List<String> matches = new ArrayList<>();
while (m.find())
matches.add(m.group());
String[] matchesArray = matches.toArray(new String[0]);
System.out.println(Arrays.toString(matchesArray));
I showed how to convert the result to an array, but you can stop at the list.
I am getting a value that is a list of strings in string format, like this: "["a", "b"]". I would like to convert it to a list of strings. I can do this by stripping the leading and trailing brackets and then splitting on commas. The problem is that I may also receive a single string, "a", and I want to convert that to a list of strings too. Is there a way to handle both cases generally?
One possible solution is to use Regex.
Your expression can look like this: "(.+?)"
. matches any character (except for line terminators)
+? Quantifier - Matches between one and unlimited times, as few times as possible, expanding as needed.
String tokens = "[\"a\", \"b,c\", \"test\"]";
String pattern = "\"(.+?)\"";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(tokens);
List<String> tokenList = new ArrayList<String>();
while (m.find()) {
tokenList.add(m.group(1)); // group(1) is the text between the quotes; group() would include the quotes
}
System.out.println(tokenList);
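To cover both input shapes from the question with one method, the loop above can be wrapped as follows. This is a sketch under assumptions: the names QuotedValues and parse are hypothetical, and the fallback for input without any quotes (treat the whole trimmed string as one element) is my guess at the desired behaviour. Empty quoted values ("") are not handled by this pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class QuotedValues {
    private static final Pattern QUOTED = Pattern.compile("\"(.+?)\"");

    // Returns every double-quoted value in raw; if there are none,
    // falls back to the trimmed input as a single element (assumption).
    static List<String> parse(String raw) {
        List<String> values = new ArrayList<>();
        Matcher m = QUOTED.matcher(raw);
        while (m.find()) {
            values.add(m.group(1));
        }
        String trimmed = raw.trim();
        if (values.isEmpty() && !trimmed.isEmpty()) {
            values.add(trimmed);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("[\"a\", \"b,c\", \"test\"]"));
        System.out.println(parse("\"a\""));
    }
}
```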
Alternatively, split on the double quote itself. You can generalize the following:
String str = "\"[\"a\",\"b\"]\"";
String[] splitStrs = str.split("\"",7);
System.out.println(splitStrs[0]+" "+splitStrs[1]+" "+splitStrs[2]+" "+splitStrs[3]+" "+splitStrs[4]+" "+splitStrs[5]+" "+splitStrs[6]);
My output
[ a , b ]
I want to split a sentence containing spaces or special characters into an array of words, where each space or special character is also an element of the array.
Sentence like:
aman,amit and sumit went to top-up
should be split into an array of String:
{"aman",",","amit"," ","and"," ","sumit"," ","went"," ","to"," ","top","-","up"}
Please suggest a regex or some logic to split this using Java.
I missed one thing in my question: I also need to split on numeric characters. Using split("\b") does not split a string like
abc12def
into
{"abc","12","def"} or {"abc","1","2","def"}
It seems all you need is to match either word characters (\w+) or non-word ones (\W+). Combine these with an alternation operator and - perhaps - add a Pattern.UNICODE_CHARACTER_CLASS (or its inline/embedded version (?U)) to make the pattern Unicode-aware:
String value = "aman,amit and sumit went to top-up";
String pattern = "(?U)\\w+|\\W+";
List<String> lst = new ArrayList<>();
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(value);
while (m.find())
lst.add(m.group(0));
System.out.println(lst);
I hope the below code snippet helps you solve this.
public static void main(final String[] args) {
String message = "aman,amit and sumit went to top-up";
String[] messages = message.split("\\b");
for(String string : messages) {
System.out.println(string);
}
}
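Neither \w+|\W+ nor split("\\b") separates digits from letters, because \w includes digits. For the follow-up about abc12def, one possible variation is to match letters, digits, and everything else as three separate alternatives. This is only a sketch; the class name MixedTokenizer and the method name tokenize are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class MixedTokenizer {
    // Runs of letters, runs of digits, or runs of anything else.
    private static final Pattern TOKEN =
            Pattern.compile("\\p{L}+|\\p{N}+|[^\\p{L}\\p{N}]+");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("abc12def"));
        System.out.println(tokenize("aman,amit and sumit went to top-up"));
    }
}
```

To get single digits ("1", "2") instead of digit runs, replace \\p{N}+ with \\p{N}.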
I have this code:
String s = "bla mo& lol!";
Pattern p = Pattern.compile("[a-z]");
String[] allWords = p.split(s);
I'm trying to get all the words that match this pattern into an array.
But I get the exact opposite.
I want my array to be:
allWords = {bla, mo, lol}
but I get:
allWords = { ,& ,!}
Is there a quick solution, or do I have to use the matcher and a while loop to insert the words into an array?
Pattern p = Pattern.compile("[a-z]");
p.split(s);
means every [a-z] character is treated as a separator, not as an array element. You may want to use:
Pattern p = Pattern.compile("[^a-z]+");
You are splitting s AT the letters. split treats the pattern as the delimiter, so invert your character class (the + keeps adjacent delimiters such as "& " from producing empty strings):
[^a-z]+
The split method is given a delimiter, which is your Pattern.
Pattern.split is the same mechanism as String.split with the syntax inverted: in both cases the regular expression you supply acts as the delimiter.
Since your delimiter is the character class [a-z], the result you got is the expected one.
If you only want to keep words, try this:
String s = "bla mo& lol!";
// splits on one or more non-word characters
Pattern p = Pattern.compile("\\W+");
String[] allWords = p.split(s);
System.out.println(Arrays.toString(allWords));
Output
[bla, mo, lol]
One simple way is:
String[] words = s.split("\\W+");
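If you would rather keep a pattern that describes the words themselves, as in the question, the matcher-and-loop approach looks roughly like this. It is a sketch, with WordMatcher and words as hypothetical names. One practical difference from split: input that starts with a delimiter does not produce a leading empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class WordMatcher {
    // Collects every run of lowercase ASCII letters.
    static String[] words(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("[a-z]+").matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", words("bla mo& lol!")));
    }
}
```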
How can I filter a string?
String[] filterTags = {"<A>", "<BC>", "<A", "<B"};
filterTags can contain more values. They can be strings, numbers, or anything else, and the array can grow dynamically.
String name = "<A><ABC><B><B";
What I want is to remove the filterTags values from name (the String) but keep <ABC> as it is.
if (name.contains(filterTags[i]) && ???)
I just need a simple check that removes the filterTags values if they are contained in name, but keeps <ABC> as it is.
Thank you in advance.
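Well, you can do this with a regex:
String filtered = name.replaceAll("(<A>|<BC>|<A|<B)", "");
// Caution: "<A" also matches the start of "<ABC>", so for the name above
// this yields "BC>>" rather than "<ABC>". A tag that must survive needs a guard,
// e.g. a negative lookahead such as <A(?!BC>).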
The problem is now to create that regex String. You can hardcode it, since it looks like that's what you're doing with the array anyways, or you could do something like this:
StringBuilder sb = new StringBuilder("(");
for (String token : filterTags) {
sb.append(token);
sb.append('|');
}
sb.deleteCharAt(sb.length() - 1); // Remove the last "|"
sb.append(')');
String regex = sb.toString();
Note that this will only work if your filter tags don't contain any regex special characters. If they might, escape each tag with Pattern.quote before appending it.
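A sketch of building the alternation so that special characters are safe, using Pattern.quote on each tag. The names TagFilter, buildRegex, and removeTags are hypothetical. Sorting longer tags first is an extra choice of mine, so that "<A>" is tried before "<A" regardless of the array order.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration.
class TagFilter {
    // Joins the tags into "tag1|tag2|...", each quoted so it matches literally.
    static String buildRegex(String[] tags) {
        return Arrays.stream(tags)
                .sorted((a, b) -> Integer.compare(b.length(), a.length())) // longest first
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
    }

    // Removes every occurrence of every tag from name.
    static String removeTags(String name, String[] tags) {
        return name.replaceAll(buildRegex(tags), "");
    }

    public static void main(String[] args) {
        String[] tags = {"(", ")", "."};
        System.out.println(removeTags("a(b)c.d", tags));
    }
}
```

This only addresses escaping and ordering. It does not by itself protect a longer tag such as <ABC> from being partly removed by a shorter filter like <A.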
You can simply do this with a regex:
String name = "<A><ABC><c<B<s";
Pattern pattern = Pattern.compile(".*(<ABC>).*");
Matcher matcher = pattern.matcher(name);
if (matcher.matches()) {
    System.out.println(matcher.group(1)); // prints <ABC>
}