String split() over white spaces and "(" and ")" - java

I have a String
String testString = "IN NEWYORK AND (OUT FLORIDA)" ;
I want to split this string into an array, like this:
String testArray[] = testString.split("\\s()");
I would like the result to be:
testArray[0] = "IN";
testArray[1] = "NEWYORK";
testArray[2] = "AND";
testArray[3] = "(";
testArray[4] = "OUT";
testArray[5] = "FLORIDA";
testArray[6] = ")";
However, the output I get is:
testArray[0] = "IN";
testArray[1] = "NEWYORK";
testArray[2] = "AND";
testArray[3] = "(OUT";
testArray[4] = "FLORIDA)";
It splits on whitespace but not on "(" and ")". I want "(" and ")" to be separate strings.

Try the below:
String testArray[] = testString.split("\\s|(?<=\\()|(?=\\))");
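To see what that pattern does, here is a minimal sketch that runs it on the string from the question. The class and method names are made up for illustration. The pattern splits on a whitespace character, at the position just after "(", and at the position just before ")". Note that it relies on "(" being preceded by whitespace and ")" being followed by whitespace or the end of the string; for input like "A(B" the "A(" would stay glued together.

```java
import java.util.Arrays;

class SplitWithLookarounds {
    // Splits on whitespace, just after "(" (lookbehind) and just before ")" (lookahead).
    static String[] tokenize(String input) {
        return input.split("\\s|(?<=\\()|(?=\\))");
    }

    public static void main(String[] args) {
        String testString = "IN NEWYORK AND (OUT FLORIDA)";
        System.out.println(Arrays.toString(tokenize(testString)));
        // prints [IN, NEWYORK, AND, (, OUT, FLORIDA, )]
    }
}
```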

split() removes the delimiters it matches. Use StringTokenizer instead and instruct it to keep the delimiters.
StringTokenizer st = new StringTokenizer("IN NEWYORK AND (OUT FLORIDA)", " ()", true);
while (st.hasMoreTokens()) {
String t = st.nextToken();
if (!t.trim().equals("")) {
System.out.println(t);
}
}

If you want to do it with string split, then monstrous regexes like \s+|((?<=\()|(?=\())|((?<=\))|(?=\))) are pretty much inevitable. This regex is based on this question, btw, and it almost works.
The easiest way is to either surround the parentheses with spaces, as suggested by @acerisara, or use StringTokenizer, as suggested by @user1030723.
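If the split regex gets too unwieldy, another option (not from the answers above, just a sketch) is to match the tokens instead of the gaps between them. The pattern below treats a token as either a single parenthesis or a run of characters that are neither whitespace nor parentheses. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenTokenizer {
    // A token is one parenthesis, or a run of non-whitespace, non-parenthesis characters.
    private static final Pattern TOKEN = Pattern.compile("[()]|[^\\s()]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("IN NEWYORK AND (OUT FLORIDA)"));
        // prints [IN, NEWYORK, AND, (, OUT, FLORIDA, )]
    }
}
```

Because nothing is split here, there are no empty strings to filter out, and parentheses that touch other text, as in "f(a)(b)", still come out as separate tokens.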

String test = "IN NEWYORK AND (OUT FLORIDA)";
// this can for sure be done better, hope you get the idea
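// replace() takes literal text; replaceAll("(", ...) would throw because "(" alone is not a valid regex
String a = test.replace("(", "( ");
String b = a.replace(")", " )");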
String array[] = b.split("\\s");

Related

Replace the words in String without using String replace

Is there a way to replace words in a string without using String.replace()?
As you can see, everything below is hard-coded. Is there a way to do this dynamically? I have heard that there is a library that can do it, but I am not sure.
Can anyone suggest a solution? Thank you, and have a nice day.
for (int i = 0; i < results.size(); ++i) {
// To remove the unwanted words in the query
test = results.toString();
String testresults = test.replace("numFound=2,start=0,docs=[","");
testresults = testresults.replace("numFound=1,start=0,docs=[","");
testresults = testresults.replace("{","");
testresults = testresults.replace("SolrDocument","");
testresults = testresults.replace("numFound=4,start=0,docs=[","");
testresults = testresults.replace("SolrDocument{", "");
testresults = testresults.replace("content=[", "");
testresults = testresults.replace("id=", "");
testresults = testresults.replace("]}]}", "");
testresults = testresults.replace("]}", "");
testresults = testresults.replace("}", "");
In this case, you will need to learn regular expressions and use the built-in String.replaceAll() method to capture all the unwanted words.
For example:
test = test.replaceAll("SolrDocument|id=|content=\\[", "");
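To make the regex idea a bit more concrete, here is a rough sketch that removes the "numFound=...,start=...,docs=[" prefix for any numbers, not just 1, 2 or 4. The sample input is only a guess at what results.toString() looks like, pieced together from the strings in the question, and the class and method names are hypothetical. Be aware that a blunt pattern like this would also strip "id=" or braces that appear inside the real content.

```java
import java.util.regex.Pattern;

class SolrCleanup {
    // \d+ matches any count, so numFound=1, numFound=2, numFound=4, ... are all covered.
    private static final Pattern NOISE = Pattern.compile(
            "numFound=\\d+,start=\\d+,docs=\\[|SolrDocument|content=\\[|id=|[{}\\]]");

    static String clean(String raw) {
        return NOISE.matcher(raw).replaceAll("");
    }

    public static void main(String[] args) {
        // Assumed sample, not taken from a real Solr response.
        String raw = "{numFound=2,start=0,docs=[SolrDocument{id=1, content=[hello]}]}";
        System.out.println(clean(raw));
        // prints 1, hello
    }
}
```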
Simply create and use a custom String.replace() method which happens to use the String.replace() method within it:
public static String customReplace(String inputString, String replaceWith, String... stringsToReplace) {
if (inputString.equals("")) { return inputString; } // nothing to replace in an empty string
if (stringsToReplace.length == 0) { return inputString; }
for (int i = 0; i < stringsToReplace.length; i++) {
inputString = inputString.replace(stringsToReplace[i], replaceWith);
}
return inputString;
}
In the example method above, stringsToReplace is a varargs parameter, so you can supply as many strings to replace as you like, separated by commas. They will all be replaced with whatever you supply for the replaceWith parameter.
Here is an example of how it can be used:
String test = "This is a string which contains numFound=2,start=0,docs=[ crap and it may also "
+ "have numFound=1,start=0,docs=[ junk in it along with open curly bracket { and "
+ "the SolrDocument word which might also have ]}]} other crap in there too.";
String testResult = customReplace(test, "", "numFound=2,start=0,docs=[ ", "numFound=1,start=0,docs=[ ",
"{ ", "SolrDocument ", "]}]} ");
System.out.println(testResult);
You can also pass a single String Array which contains all your unwanted strings within its elements and pass that array to the stringsToReplace parameter, for example:
String test = "This is a string which contains numFound=2,start=0,docs=[ crap and it may also "
+ "have numFound=1,start=0,docs=[ junk in it along with open curly bracket { and "
+ "the SolrDocument word which might also have ]}]} other crap in there too.";
String[] unwantedStrings = {"numFound=2,start=0,docs=[ ", "numFound=1,start=0,docs=[ ",
"{ ", "SolrDocument ", "]}]} "};
String testResult = customReplace(test, "", unwantedStrings);
System.out.println(testResult);
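A variation on the same idea, as a sketch only: instead of looping over the strings, quote each one with Pattern.quote(), join them with "|" and do a single replaceAll(). The method name replaceLiterals is made up for this example. One difference to keep in mind is that this works in a single pass, so text produced by one replacement is never re-examined, whereas the loop above applies each replacement to the result of the previous one.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class LiteralReplace {
    static String replaceLiterals(String input, String replaceWith, String... targets) {
        if (targets.length == 0) {
            return input;
        }
        // Pattern.quote() makes characters such as [ { ] lose their regex meaning.
        String regex = Arrays.stream(targets)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // quoteReplacement() does the same for $ and \ in the replacement text.
        return input.replaceAll(regex, Matcher.quoteReplacement(replaceWith));
    }

    public static void main(String[] args) {
        System.out.println(replaceLiterals("a{b}c[d", "", "{", "}", "["));
        // prints abcd
    }
}
```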

regex to match and replace two characters between string

I have a string String a = "(3e4+2e2)sin(30)"; and I want to turn it into a = "(3e4+2e2)*sin(30)";
I am not able to write a regular expression for this.
Try this replaceAll:
a = a.replaceAll("\\) *(\\w+)", ")*$1");
You can go with this
String func = "sin";// or any function you want like cos.
String a = "(3e4+2e2)sin(30)";
a = a.replaceAll("[)]" + func, ")*" + func);
System.out.println(a);
this should work
a = a.replaceAll("\\)(\\s)*([^*+/-])", ") * $2");
String input = "(3e4+2e2)sin(30)".replaceAll("(\\(.+?\\))(.+)", "$1*$2"); //(3e4+2e2)*sin(30)
Assuming the characters within the first parenthesis will always be in similar pattern, you can split this string into two at the position where you would like to insert the character and then form the final string by appending the first half of the string, new character and second half of the string.
string a = "(3e4+2e2)sin(30)";
string[] splitArray1 = Regex.Split(a, @"^\(\w+[+]\w+\)");
string[] splitArray2 = Regex.Split(a, @"\w+\([0-9]+\)$");
string updatedInput = splitArray2[0] + "*" + splitArray1[1];
Console.WriteLine("Input = {0} Output = {1}", a, updatedInput);
I have not tried it, but the following should work:
String a = "(3e4+2e2)sin(30)";
a = a.replaceAll("[)](\\w+)", ")*$1");
System.out.println(a);
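For comparison, here is a sketch that uses only zero-width lookarounds, so nothing has to be captured and put back. It inserts "*" at every position that is preceded by ")" and followed by a word character or "(". The class and method names are invented for the example, and it assumes that a closing parenthesis directly followed by a name, number or "(" always means multiplication.

```java
class ImplicitMultiply {
    static String insertStars(String expr) {
        // (?<=\)) : the previous character is ")"
        // (?=[\w(]) : the next character is a letter, digit, underscore or "("
        return expr.replaceAll("(?<=\\))(?=[\\w(])", "*");
    }

    public static void main(String[] args) {
        System.out.println(insertStars("(3e4+2e2)sin(30)"));
        // prints (3e4+2e2)*sin(30)
    }
}
```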

Cut ':' && " " from a String with a tokenizer

Right now I am a little bit confused. I want to manipulate this string with a tokenizer:
Bob:23456:12345 Carl:09876:54321
I am using a StringTokenizer, but when I try:
String signature1 = tok.nextToken(":");
tok.nextToken(" ");
I get:
12345 Carl
However, I want to store the first int and the second int in variables.
Any ideas?
You have two different delimiters, so it may be easier to handle them separately.
First, split the space-separated values using String.split(" "), which returns a String[].
Then, for each of those strings, use a tokenizer (or split(":")).
I believe that will work.
Code:
String input = "Bob:23456:12345 Carl:09876:54321";
String[] words = input.split(" ");
for (String word : words) {
String[] token = word.split(":");
String name = token[0];
int value0 = Integer.parseInt(token[1]);
int value1 = Integer.parseInt(token[2]);
}
Following code should do:
String input = "Bob:23456:12345 Carl:09876:54321";
StringTokenizer st = new StringTokenizer(input, ": ");
while(st.hasMoreTokens())
{
String name = st.nextToken();
String val1 = st.nextToken();
String val2 = st.nextToken();
}
Seeing as you have multiple patterns, you cannot handle them with only one tokenizer.
You need to first split it based on whitespace, then split based on the colon.
Something like this should help:
String[] s = "Bob:23456:12345 Carl:09876:54321".split(" ");
System.out.println(Arrays.toString(s ));
String[] so = s[0].split(":", 2);
System.out.println(Arrays.toString(so));
And you'd get this:
[Bob:23456:12345, Carl:09876:54321]
[Bob, 23456:12345]
If you must use a tokenizer, then I think you need to use it twice:
String str = "Bob:23456:12345 Carl:09876:54321";
StringTokenizer spaceTokenizer = new StringTokenizer(str, " ");
while (spaceTokenizer.hasMoreTokens()) {
StringTokenizer colonTokenizer = new StringTokenizer(spaceTokenizer.nextToken(), ":");
colonTokenizer.nextToken(); // to ignore Bob and Carl
while (colonTokenizer.hasMoreTokens()) {
System.out.println(colonTokenizer.nextToken());
}
}
outputs
23456
12345
09876
54321
Personally though I would not use tokenizer here and use Claudio's answer which splits the strings.
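Putting the split-based approach together, a minimal sketch might look like the following. The Entry class and the parse method are hypothetical names, and the code assumes every word has exactly the form name:int:int. Note that Integer.parseInt drops the leading zero, so "09876" becomes 9876; keep the values as strings if the zero matters.

```java
import java.util.ArrayList;
import java.util.List;

class SignatureParser {
    // Small holder for one "name:int:int" group.
    static final class Entry {
        final String name;
        final int first;
        final int second;

        Entry(String name, int first, int second) {
            this.name = name;
            this.first = first;
            this.second = second;
        }
    }

    static List<Entry> parse(String input) {
        List<Entry> entries = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {   // first split on whitespace
            String[] parts = word.split(":");              // then split on the colon
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected name:int:int but got: " + word);
            }
            entries.add(new Entry(parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2])));
        }
        return entries;
    }

    public static void main(String[] args) {
        for (Entry e : parse("Bob:23456:12345 Carl:09876:54321")) {
            System.out.println(e.name + " " + e.first + " " + e.second);
        }
    }
}
```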

Reading Strings from lines in Java

I have a txt file formatted like:
Name 'Paul' 9-years old
How can I get from a "readline":
String the_name="Paul"
and
int the_age=9
in Java, discarding all the rest?
I have:
...
BufferedReader bufferedReader = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while ((line = bufferedReader.readLine()) != null) {
//put the name value in the_name
//put age value in the_age
}
...
Please suggest, thanks.
As you're using BufferedReader and everything is on the one line, you would have to split it to extract the data. Some additional formatting is then required to remove the quotes & extract the year part of age. No need for any fancy regex:
String[] strings = line.split(" ");
if (strings.length >= 3) {
String the_name= strings[1].replace("'", "");
String the_age = strings[2].substring(0, strings[2].indexOf("-"));
}
I notice you have this functionality in a while loop. For this to work, make sure that every line keeps the format:
text 'Name' digit-any other text
^^ ^^ ^
Important chars are
Spaces: min of 3 tokens needed for split array
Single quotes
- Hyphen character
Use java.util.regex.Pattern:
Pattern pattern = Pattern.compile("Name '(.*)' (\\d+)-years old");
for (String line : lines) {
Matcher matcher = pattern.matcher(line);
if (matcher.matches()) {
String theName = matcher.group(1);
int theAge = Integer.parseInt(matcher.group(2));
}
}
You can use the String.substring, String.indexOf, String.lastIndexOf, and Integer.parseInt methods as follows:
String line = "Name 'Paul' 9-years old";
String theName = line.substring(line.indexOf("'") + 1, line.lastIndexOf("'"));
String ageStr = line.substring(line.lastIndexOf("' ") + 2, line.indexOf("-years"));
int theAge = Integer.parseInt(ageStr);
System.out.println(theName + " " + theAge);
Output:
Paul 9
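To tie this back to the BufferedReader loop in the question, here is a sketch that reads line by line and collects name and age from every line that matches. A StringReader stands in for the file so the example runs on its own; the class name, method name and the choice of a Map are assumptions made for illustration. Lines that do not match the expected shape are simply skipped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PersonLines {
    // Group 1: text between single quotes. Group 2: digits right before "-years".
    private static final Pattern LINE = Pattern.compile("'([^']*)'\\s+(\\d+)-years");

    static Map<String, Integer> parseAll(String text) {
        Map<String, Integer> ages = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = LINE.matcher(line);
                if (m.find()) {
                    ages.put(m.group(1), Integer.parseInt(m.group(2)));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ages;
    }

    public static void main(String[] args) {
        System.out.println(parseAll("Name 'Paul' 9-years old\nnot a match\nName 'Ann' 12-years old"));
        // prints {Paul=9, Ann=12}
    }
}
```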

removing space before new line in java

I have a space before a newline in a string and can't remove it (in Java).
I have tried the following but nothing works:
strToFix = strToFix.trim();
strToFix = strToFix.replace(" \n", "");
strToFix = strToFix.replaceAll("\\s\\n", "");
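myString = myString.replaceAll("[ \t]+(\r\n?|\n)", "$1");
replaceAll takes a regular expression as its first argument. The [ \t]+ matches one or more spaces or tabs. The (\r\n?|\n) matches a newline and captures it as $1, so the replacement keeps the newline and drops the whitespace before it. Strings are immutable, so the result has to be assigned back.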
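The same thing can be sketched with a lookahead, which checks for the newline without consuming it, so no capture group is needed. The class and method names are made up, and the sketch only handles "\n" and "\r\n" line endings. Whitespace at the very end of the string, with no newline after it, is left alone.

```java
class TrailingBlanks {
    static String stripBeforeNewlines(String s) {
        // Remove runs of spaces/tabs only when a line break follows immediately.
        return s.replaceAll("[ \\t]+(?=\\r?\\n)", "");
    }

    public static void main(String[] args) {
        String fixed = stripBeforeNewlines("abc \ndef\t \r\nghi ");
        System.out.println(fixed.equals("abc\ndef\r\nghi "));
        // prints true
    }
}
```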
try this:
strToFix = strToFix.replaceAll(" \\n", "\n");
'\' is a special character in regex, so you need to escape it with another '\'.
I believe with this one you should try this instead:
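// replace() matches literal text, so " \\n" would look for a backslash followed by 'n'; " \n" matches a real newline
strToFix = strToFix.replace(" \n", "\n");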
Edit:
I forgot the escape in my original answer. James.Xu in his answer reminded me.
Are you sure?
String s1 = "hi ";
System.out.println("|" + s1.trim() + "|");
String s2 = "hi \n";
System.out.println("|" + s2.trim() + "|");
prints
|hi|
|hi|
Are you sure it is a space that you're trying to remove? You should print the string's bytes and check whether the value of the byte in question is actually 32 (decimal), i.e. 20 (hexadecimal).
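One way to check, as a rough sketch: print every character as a Unicode code point. The class and method names are hypothetical, and the sample string with a non-breaking space (U+00A0) is just an assumed example of a character that looks like a space but is not one. trim() only removes characters up to U+0020, so it leaves U+00A0 in place.

```java
import java.util.stream.Collectors;

class CodePointDump {
    // Renders each character as U+XXXX so invisible characters become visible.
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String s = "hi\u00A0\n";
        System.out.println(dump(s));
        // prints U+0068 U+0069 U+00A0 U+000A
        System.out.println(dump(s.trim()));
        // prints U+0068 U+0069 U+00A0
    }
}
```

A real space shows up as U+0020; anything else in that position explains why the replacements in the question did not match.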
trim() seems to do what you're asking on my system. Here's the code I used; maybe you want to try it on your system:
public class so5488527 {
public static void main(String [] args)
{
String testString1 = "abc \n";
String testString2 = "def \n";
String testString3 = "ghi \n";
String testString4 = "jkl \n";
testString3 = testString3.trim();
System.out.println(testString1);
System.out.println(testString2.trim());
System.out.println(testString3);
System.out.println(testString4.trim());
}
}
