remove chunks of word from string

In Java I have to cut the word "getenforce" out of a string.
The problem is that the word I receive is sometimes cut off. For example, I receive "etenforce" or "tenforc".
I could assume that at least 4 letters will arrive and filter it like this:
// st is the input string
st = st.replace("getenforce", "");
st = st.replace("gete", "");
st = st.replace("eten", "");
st = st.replace("tenf", "");
...
st = st.replace("orce", "");
Is there a better, more elegant way?

You can use a for loop instead of doing this line by line.
String theWord = "getenforce";
st = st.replace(theWord, "");
//check all the sequences in loop
for(int i=0; i<theWord.length()-3;i++){
st=st.replace(theWord.subSequence(i, i+4), "");
}
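
Replacing only the 4-letter windows can leave stray letters behind: with the loop above, "etenforce" loses "eten", then "forc", and a lone "e" remains. Here is a minimal sketch of one way around that, assuming it is acceptable to try every fragment of the word from longest to shortest. The names ChunkRemover, removeChunks and minLen are made up for illustration and do not come from the question.

```java
class ChunkRemover {
    // Removes every fragment of `word` that is at least `minLen` characters long.
    // Longer fragments are tried first so that a long fragment is removed whole
    // instead of being cut up by one of its shorter sub-fragments.
    static String removeChunks(String text, String word, int minLen) {
        for (int len = word.length(); len >= minLen; len--) {
            for (int start = 0; start + len <= word.length(); start++) {
                text = text.replace(word.substring(start, start + len), "");
            }
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(removeChunks("abc etenforce xyz", "getenforce", 4));
    }
}
```

Keep in mind that this also removes fragments such as "force" or "enforce" when they appear as ordinary words, so it only suits input where that cannot happen.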

I believe this will solve your problem. It splits the sentence into words and drops every word that contains the fragment:
List<String> strings = Arrays.asList("your sentence with word gete".split(" "));
List<String> filtered = strings.stream().filter(s1 -> !s1.contains("gete")).collect(Collectors.toList());

Related

Replace a set of substring in a string in more efficient way?

I have to replace a set of substrings in a String with other substrings, for example:
"^t" with "\t"
"^=" with "\u2014"
"^+" with "\u2013"
"^s" with "\u00A0"
"^?" with "."
"^#" with "\\d"
"^$" with "[a-zA-Z]"
So, I've tried with:
String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
Map<String,String> tokens = new HashMap<String,String>();
tokens.put("^t", "\t");
tokens.put("^=", "\u2014");
tokens.put("^+", "\u2013");
tokens.put("^s", "\u00A0");
tokens.put("^?", ".");
tokens.put("^#", "\\d");
tokens.put("^$", "[a-zA-Z]");
String regexp = "^t|^=|^+|^s|^?|^#|^$";
StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(oppip);
while (m.find())
m.appendReplacement(sb, tokens.get(m.group()));
m.appendTail(sb);
System.out.println(sb.toString());
But it doesn't work. tokens.get(m.group()) throws an exception.
Any idea why?

You don't have to use a HashMap. Consider using simple arrays, and a loop:
String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
String[] searchFor =
{"^t", "^=", "^+", "^s", "^?", "^#", "^$"},
// Same replacement values as in the question: a real tab and real Unicode characters.
replacement =
{"\t", "\u2014", "\u2013", "\u00A0", ".", "\\d", "[a-zA-Z]"};
for (int i = 0; i < searchFor.length; i++)
oppip = oppip.replace(searchFor[i], replacement[i]);
// Print the result.
System.out.println(oppip);
For completeness, you can use a two-dimensional array for a similar approach:
String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
String[][] tasks =
{
// Each row is {token, replacement}; the values match the question.
{"^t", "\t"},
{"^=", "\u2014"},
{"^+", "\u2013"},
{"^s", "\u00A0"},
{"^?", "."},
{"^#", "\\d"},
{"^$", "[a-zA-Z]"}
};
for (String[] replacement : tasks)
oppip = oppip.replace(replacement[0], replacement[1]);
// Print the result.
System.out.println(oppip);

In a regex, ^ means "beginning of text" (or negation when it is the first character of a character class). To match it literally you have to put a backslash before it, which becomes two backslashes in a Java string literal.
String regexp = "\\^[t=+s?#$]";
I have also condensed the alternatives into a single character class.
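
For reference, here is a sketch of the question's loop with the escaped pattern plugged in. With the unescaped pattern, m.group() can be an empty match, tokens.get("") returns null, and appendReplacement fails on the null replacement. The sketch also wraps the replacement in Matcher.quoteReplacement, because appendReplacement treats backslashes and dollar signs in the replacement specially, which would otherwise turn "\\d" into a plain "d". The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenReplacer {
    private static final Map<String, String> TOKENS = new HashMap<>();
    static {
        TOKENS.put("^t", "\t");
        TOKENS.put("^=", "\u2014");
        TOKENS.put("^+", "\u2013");
        TOKENS.put("^s", "\u00A0");
        TOKENS.put("^?", ".");
        TOKENS.put("^#", "\\d");
        TOKENS.put("^$", "[a-zA-Z]");
    }

    // A literal caret followed by exactly one of the known token characters.
    private static final Pattern TOKEN = Pattern.compile("\\^[t=+s?#$]");

    static String replaceTokens(String input) {
        Matcher m = TOKEN.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps backslashes and '$' in the replacement literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(TOKENS.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceTokens("pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^"));
    }
}
```

A caret that is not followed by one of the token characters, such as the trailing one in the sample input, is left untouched.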

String Tokenizer missing 2 values off array

I am creating a StringTokenizer like so and populating an ArrayList using the tokens:
LogUtils.log("saved emails: " + savedString);
StringTokenizer st = new StringTokenizer(savedString, ",");
mListEmailAddresses = new ArrayList<String>();
for (int i = 0; i < st.countTokens(); i++) {
String strEmail = st.nextToken().toString();
mListEmailAddresses.add(strEmail);
}
LogUtils.log("mListEmailAddresses: emails: " + mListEmailAddresses.toString());
11-20 09:56:59.518: I/test(6794): saved emails: hdhdjdjdjd,rrfed,ggggt,tfcg,
11-20 09:56:59.518: I/test(6794): mListEmailAddresses: emails: [hdhdjdjdjd, rrfed]
As you can see, mListEmailAddresses is missing 2 values off the end of the list. What should I do to fix this? To my eyes the code looks correct, but maybe I am misunderstanding something.
Thanks.

Using hasMoreTokens() is the solution:
while(st.hasMoreTokens()){
String strEmail = st.nextToken().toString();
mListEmailAddresses.add(strEmail);
}

Use the following while loop
StringTokenizer st = new StringTokenizer(savedString, ",");
mListEmailAddresses = new ArrayList<String>();
while (st.hasMoreTokens()) {
String strEmail = st.nextToken();
mListEmailAddresses.add(strEmail);
}
Note that you don't need to call toString(); nextToken() already returns a String.
Alternatively, you could use the split method:
String[] tokens = savedString.split(",");
mListEmailAddresses = new ArrayList<String>();
mListEmailAddresses.addAll(Arrays.asList(tokens));
Note, the API docs for StringTokenizer state:
StringTokenizer is a legacy class that is retained for compatibility
reasons although its use is discouraged in new code. It is recommended
that anyone seeking this functionality use the split method of String
or the java.util.regex package instead.

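The st.countTokens() method calculates the number of times that this tokenizer's nextToken() method can be called before it generates an exception. The current position is not advanced. Because that number shrinks with every call to nextToken() while i grows, the for loop in the question stops after half of the tokens.
To get all the elements into the ArrayList, you should use the following code: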
while(st.hasMoreTokens()) {
String strEmail = st.nextToken().toString();
mListEmailAddresses.add(strEmail);
}
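
To make the difference concrete, here is a small sketch that runs both loops on the string from the log. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class CountTokensDemo {
    // The loop from the question: countTokens() is re-evaluated on every
    // iteration and returns the number of tokens that are still left.
    static List<String> withCountTokens(String s) {
        StringTokenizer st = new StringTokenizer(s, ",");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < st.countTokens(); i++) {
            out.add(st.nextToken());
        }
        return out;
    }

    // The loop from the answers: keep going until the tokenizer is exhausted.
    static List<String> withHasMoreTokens(String s) {
        StringTokenizer st = new StringTokenizer(s, ",");
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String saved = "hdhdjdjdjd,rrfed,ggggt,tfcg,";
        System.out.println(withCountTokens(saved));
        System.out.println(withHasMoreTokens(saved));
    }
}
```

If you prefer to keep the for loop, storing st.countTokens() in a local variable before the loop and comparing i against that variable has the same effect.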

Android: split a string considering 2 separating characters

I have a string containing messages. The string looks like this:
bill:hello;tom:hi;bill:how are you?;tommy:hello!; ...
I need to split the string into several strings, on the characters : and ;.
For now, I have split the string on ; and I can add the results as list elements.
List<Message> listMessages = new ArrayList<Message>();
StringTokenizer tokenizer = new StringTokenizer(messages, ";");
String result = null;
String uname = "";
String umess = "";
while (tokenizer.hasMoreTokens()) {
result = tokenizer.nextToken();
listMessages.add(new Message(result, ""));
}
I still have to split on : to get the two resulting strings into my list element, and I tried something like this:
List<Message> listMessages = new ArrayList<Message>();
StringTokenizer tokenizer = new StringTokenizer(messages, ";");
String[] result = null;
String uname = "";
String umess = "";
while (tokenizer.hasMoreTokens()) {
result = tokenizer.nextToken().split(":");
uname = result[0];
umess = result[1];
listMessages.add(new Message(result[0], result[1]));
}
But I got this error, which I don't understand:
01-23 17:12:19.168: E/AndroidRuntime(711): java.lang.RuntimeException: Unable to start activity ComponentInfo{com.example.appandroid/com.example.appandroid.ListActivity}: java.lang.ArrayIndexOutOfBoundsException: length=1; index=1
Thanks in advance for looking at my problem.

Instead of using StringTokenizer, you can use String.split(regex) to split based on two delimiters like below:
String test="this: bill:hello;tom:hi;bill:how are you?;tommy:hello!;";
String[] arr = test.split("[:;]");
for(String s: arr){
System.out.println(s);
}
Output:
this
bill
hello
tom
hi
bill
how are you?
tommy
hello!
EDIT:
As @njzk2 points out in the comments, if you want to keep using StringTokenizer, you can use its two-argument constructor, whose second argument lists all the delimiter characters:
StringTokenizer str = new StringTokenizer(test, ":;");
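
The ArrayIndexOutOfBoundsException (length=1; index=1) in the question means that at least one piece produced by the ; split contained no colon, so split(":") returned a single element. Here is a hedged sketch that keeps the name/message pairing and skips such pieces. It returns plain String[] pairs because the question's Message class is not shown; the class name MessageParser and the method parse are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MessageParser {
    // Returns {name, message} pairs; entries without a colon are skipped.
    static List<String[]> parse(String messages) {
        List<String[]> result = new ArrayList<>();
        for (String entry : messages.split(";")) {
            // Limit 2 keeps any further colons inside the message text.
            String[] parts = entry.split(":", 2);
            if (parts.length < 2) {
                continue; // e.g. an empty or malformed trailing piece
            }
            result.add(new String[] { parts[0].trim(), parts[1] });
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] pair : parse("bill:hello;tom:hi;bill:how are you?;tommy:hello!;")) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

In the original code, each pair could then be passed to new Message(pair[0], pair[1]).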

Cut ':' && " " from a String with a tokenizer

Right now I am a little bit confused. I want to process this string with a tokenizer:
Bob:23456:12345 Carl:09876:54321
However, when I use a tokenizer and try:
String signature1 = tok.nextToken(":");
tok.nextToken(" ");
I get:
12345 Carl
However, I want to have the first int and the second int in variables.
Any ideas?

You have two different delimiters, so it may be easier to handle them separately.
First, split the space-separated values with split(" "), which returns a String[].
Then split each of those strings on the colon.
I believe this will work.
Code:
String input = "Bob:23456:12345 Carl:09876:54321";
String[] words = input.split(" ");
for (String word : words) {
// token[0] is the name, token[1] and token[2] are the two numbers
String[] token = word.split(":");
String name = token[0];
int value0 = Integer.parseInt(token[1]);
int value1 = Integer.parseInt(token[2]);
}

The following code should do it:
String input = "Bob:23456:12345 Carl:09876:54321";
StringTokenizer st = new StringTokenizer(input, ": ");
while(st.hasMoreTokens())
{
String name = st.nextToken();
String val1 = st.nextToken();
String val2 = st.nextToken();
}

Seeing as you have multiple patterns, you cannot handle them with only one tokenizer.
You need to first split it based on whitespace, then split based on the colon.
Something like this should help:
String[] s = "Bob:23456:12345 Carl:09876:54321".split(" ");
System.out.println(Arrays.toString(s ));
String[] so = s[0].split(":", 2);
System.out.println(Arrays.toString(so));
And you'd get this:
[Bob:23456:12345, Carl:09876:54321]
[Bob, 23456:12345]

If you must use a tokenizer, then I think you need to use it twice:
String str = "Bob:23456:12345 Carl:09876:54321";
StringTokenizer spaceTokenizer = new StringTokenizer(str, " ");
while (spaceTokenizer.hasMoreTokens()) {
StringTokenizer colonTokenizer = new StringTokenizer(spaceTokenizer.nextToken(), ":");
colonTokenizer.nextToken(); // skip the name (Bob, Carl)
while (colonTokenizer.hasMoreTokens()) {
System.out.println(colonTokenizer.nextToken());
}
}
outputs
23456
12345
09876
54321
Personally, though, I would not use a tokenizer here; I would use Claudio's answer, which splits the strings.
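
As an illustration of the split-based route, here is a minimal sketch that collects each name with its two numbers. It assumes every record has exactly the form name:int:int; the class name SignatureParser and the method parse are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SignatureParser {
    // Maps each name to its two numbers, in input order.
    static Map<String, int[]> parse(String input) {
        Map<String, int[]> result = new LinkedHashMap<>();
        for (String record : input.trim().split("\\s+")) {
            String[] parts = record.split(":");
            if (parts.length != 3) {
                continue; // not of the assumed form name:int:int
            }
            result.put(parts[0], new int[] {
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2])
            });
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, int[]> parsed = parse("Bob:23456:12345 Carl:09876:54321");
        for (Map.Entry<String, int[]> e : parsed.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue()[0] + ", " + e.getValue()[1]);
        }
    }
}
```

Note that Integer.parseInt drops leading zeros, so "09876" becomes 9876. If the zeros matter, keep the values as strings.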

String split() over white spaces and "(" and ")"

I have a String
String testString = "IN NEWYORK AND (OUT FLORIDA)" ;
I want to split this string into an array, like this:
String testArray[] = testString.split("\\s()");
I would like the result to be:
testArray[0] = "IN";
testArray[1] = "NEWYORK";
testArray[2] = "AND";
testArray[3] = "(";
testArray[4] = "OUT";
testArray[5] = "FLORIDA";
testArray[6] = ")";
However, the output I get is:
testArray[0] = "IN";
testArray[1] = "NEWYORK";
testArray[2] = "AND";
testArray[3] = "(OUT";
testArray[4] = "FLORIDA)";
It is splitting on whitespace but not on "(" and ")". I want "(" and ")" to be separate strings.

Try the following, which splits on whitespace, after every "(" and before every ")":
String testArray[] = testString.split("\\s|(?<=\\()|(?=\\))");

split() always discards the delimiter it matches. Use StringTokenizer and tell it to return the delimiters as tokens (the third constructor argument):
StringTokenizer st = new StringTokenizer("IN NEWYORK AND (OUT FLORIDA)", " ()", true);
while (st.hasMoreTokens()) {
String t = st.nextToken();
if (!t.trim().equals("")) {
System.out.println(t);
}
}

If you want to do it with String.split, then monstrous regexes like \s+|((?<=\()|(?=\())|((?<=\))|(?=\))) are pretty much inevitable. This regex is based on this question, by the way, and it almost works.
The easiest way is either to surround the parentheses with spaces, as suggested by @acerisara, or to use StringTokenizer, as suggested by @user1030723.
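
A third option, not taken from the answers here, is to stop describing the separators and describe the tokens instead: a token is either a single parenthesis or a run of characters that are neither whitespace nor parentheses. A minimal sketch follows; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenTokenizer {
    // Either one parenthesis, or a run of non-space, non-parenthesis characters.
    private static final Pattern TOKEN = Pattern.compile("[()]|[^\\s()]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("IN NEWYORK AND (OUT FLORIDA)"));
    }
}
```

Because whitespace never matches the pattern, no empty tokens have to be filtered out, and parentheses that touch each other or a word are still reported separately.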

String test = "IN NEWYORK AND (OUT FLORIDA)";
// This can surely be done better, but I hope you get the idea.
// replace() takes literal text, so the parentheses need no regex escaping.
String a = test.replace("(", "( ");
String b = a.replace(")", " )");
String array[] = b.split("\\s+");
