Replacing Only Certain White Spaces In a String - java
I have a string queryInputNameString that is equal to "fir, spotted owl". I'm trying to use replaceAll() to remove the whitespace and split() to separate the elements into the inputNameArray array wherever a comma occurs.
String noSpaces = queryInputNameString.replaceAll("\\s+","");
String[] inputNameArray = noSpaces.split("\\,");
So far the above returns:
fir
spottedowl
but I would like it to remove only the whitespace that occurs immediately before or after a comma and return this:
fir
spotted owl
How can I make my code ignore white spaces that are not preceded/followed by a comma?
Thanks.
Since split() accepts a regex as argument, you can directly do this:
String[] inputNameArray = queryInputNameString.split("\\s*\\,\\s*");
Otherwise, if you really want to replace only spaces after a comma, you can use:
String noSpaces = queryInputNameString.replaceAll(",\\s+",",");
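For reference, here is a minimal, self-contained sketch of the split-on-regex idea from this answer. The class and method names (CommaSplitDemo, splitOnCommas) are made up for illustration, and the extra trim() call is an assumption added to cope with whitespace at the very start or end of the input, which the pattern alone would leave in place.

```java
import java.util.Arrays;

class CommaSplitDemo {
    // Splits on commas and absorbs any whitespace directly around each comma.
    // Whitespace inside an element (e.g. "spotted owl") is left untouched.
    static String[] splitOnCommas(String input) {
        // trim() is an added assumption: it handles leading/trailing whitespace.
        return input.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String[] names = splitOnCommas("fir, spotted owl");
        System.out.println(Arrays.toString(names)); // [fir, spotted owl]
    }
}
```

The comma does not need escaping in a Java regex, so "\\s*,\\s*" behaves the same as "\\s*\\,\\s*".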
You actually do not have to use more sophisticated regex. If you just split by comma first and then trim each array element you will get the desired result.
This approach might prove to be less efficient when dealing with a lot of data.
String[] inputArray = queryInputNameString.split(",");
for (int i = 0; i < inputArray.length; ++i) {
inputArray[i] = inputArray[i].trim();
}
Related
String.replace() not replacing all occurrences
I have a very long string which looks similar to this: 355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,.... I tried using the following code to remove the number 382 from the string: String str = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,...."; str = str.replace(",382,", ","); But it seems that not all occurrences are being replaced. The string originally had more than 3000 occurrences, and about 630 were still left after replacing. Is the capability of String.replace() limited? If so, is there a way of achieving what I need?
You need to replace the trailing comma as well (if one exists, which it won't if last in the list): str = str.replaceAll("\\b382,?", ""); Note the \b word boundary, which prevents matching the 382 inside a longer number such as 1382. The above will convert: 382,111,382,1382,222,382 to: 111,1382,222, (a trailing comma is left behind because the last 382 has no comma after it).
I think the issue is your first argument to replace(), in particular the comma (,) before and after 382. If you have "382,382,383", you will only match the inner ",382," and leave the initial one behind. Try: str = str.replace("382,", ""); Although this will fail to match "382" at the very end, as it does not have a comma after it. A full solution might entail two method calls. Strings are immutable, so the result must be assigned back each time: str = str.replace("382", ""); // Remove all instances of 382 str = str.replaceAll(",,+", ","); // Compress all duplicates, triplicates, etc. of commas This combines the two approaches: str = str.replaceAll("382,?", ""); // Remove 382 and an optional comma after it. Note: both of the last two approaches leave a trailing comma if 382 is at the end.
try this str = str.replaceAll(",382,", ",");
Firstly, remove the preceding comma in your matching string. Then, remove duplicated commas by replacing commas with a single comma using java regular expression. String input = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399"; String result = input.replace("382,", ","); // remove the preceding comma String result2 = result.replaceAll("[,]+", ","); // replace duplicate commas System.out.println(result2);
As dave already said, the problem is that your pattern overlaps. In the string "...,382,382,..." there are two occurrences of ",382,":
"...,382,382,..."
    -----          first occurrence
        -----      second occurrence
These two occurrences overlap at the comma, so Java can replace only one of them. When finding occurrences, it does not yet see what you replace the pattern with, so it does not see that a new occurrence of ",382," is created when the first occurrence is replaced by the comma.
If your data is known not to contain numbers with more than 3 digits, then you might do:
str = str.replace("382,", "");
and then handle an occurrence at the end as a special case. But if your data can contain bigger numbers, then the "382," inside "...,1382,..." is removed too, leaving a stray 1 fused to the next number, which probably is not what you want.
Here are two solutions that do not have the above problem. First, simply repeat the replacement until no more changes occur:
String oldString = str;
str = str.replace(",382,", ",");
while (!str.equals(oldString)) {
    oldString = str;
    str = str.replace(",382,", ",");
}
After that, you will have to handle possible occurrences at the start and end of the string.
Second, if you have Java 8, you can do a little more work yourself and use Java streams:
str = Arrays.stream(str.split(","))
    .filter(s -> !s.equals("382"))
    .collect(Collectors.joining(","));
This first splits the string at ",", then filters out all strings which are equal to "382", and then concatenates the remaining strings again with "," in between. (Both code snippets are untested.)
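To make the overlap problem concrete, here is a small runnable sketch that contrasts a single replace() pass with the stream-based filter described in this answer. The class name RemoveTokenDemo, the helper removeToken, and the short sample string are illustrative assumptions, not part of the original question.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class RemoveTokenDemo {
    // Removes every comma-separated element that equals token exactly.
    // Longer numbers that merely contain the token (e.g. 1382) are kept.
    static String removeToken(String csv, String token) {
        return Arrays.stream(csv.split(","))
                .filter(s -> !s.equals(token))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String str = "381,382,382,382,383,1382,382"; // hypothetical sample data

        // One pass of replace() skips matches that share a comma with the
        // previous match, and also misses the 382 at the very end.
        System.out.println(str.replace(",382,", ",")); // 381,382,383,1382,382

        // Filtering whole elements avoids both problems.
        System.out.println(removeToken(str, "382"));   // 381,383,1382
    }
}
```

Note that split(",") drops trailing empty strings, so this sketch assumes the input has no meaningful empty elements at the end.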
Traditional way: String str = ",abc,null,null,0,0,7,8,9,10,11,12,13,14"; String newStr = "", word = ""; for (int i=0; i<str.length(); i++) { if (str.charAt(i) == ',') { if (word.equals("null") || word.equals("0")) word = ""; newStr += word+","; word = ""; } else { word += str.charAt(i); if (i == str.length()-1) newStr += word; } } System.out.println(newStr); Output: ,abc,,,,,7,8,9,10,11,12,13,14
Trim() is not working
while(rs.next()) { value = rs.getString(1).trim().split(","); mineral.addAll(Arrays.asList(value)); } Here the value of rs.getString(1) is: "Dimension Stone, Kankar, River Sand, " This value is trimmed using trim(), split using split(","), and assigned to the array value. My problem is that trim() does not remove the spaces in the string. Can anyone explain the reason for this and suggest a fix?
The trim function does not remove intra-sentence spaces, it only removes the whitespace characters at either end of the string. If you want all the strings trimmed then you need to invoke the function for each one. String[] values = rs.getString(1).split(","); for(String value : values) { mineral.add(value.trim()); }
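A self-contained version of the split-then-trim loop might look like the sketch below. It leaves out the ResultSet so it can run on its own; the class and method names (TrimElementsDemo, splitAndTrim) are hypothetical. Skipping blank elements is an added assumption, prompted by the trailing ", " in the sample value, which would otherwise produce an empty entry.

```java
import java.util.ArrayList;
import java.util.List;

class TrimElementsDemo {
    // Splits on commas, trims each piece, and skips pieces that are blank.
    static List<String> splitAndTrim(String raw) {
        List<String> result = new ArrayList<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) { // assumption: blank entries are unwanted
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample value taken from the question.
        System.out.println(splitAndTrim("Dimension Stone, Kankar, River Sand, "));
        // [Dimension Stone, Kankar, River Sand]
    }
}
```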
try to split the string like this value = rs.getString(1).trim().split(" *, *");
trim() will just remove leading and trailing spaces and not the spaces within the string. Do you wish to remove space between 'Dimension Stone'?
You have two options. If you wish to use trim(), you can go with Perception's answer. Otherwise you can use replace(" ", "") to remove every space, including the one inside "Dimension Stone": String[] values = rs.getString(1).replace(" ", "").split(",");
Java regex, delete content to the left of comma
I have a string with a bunch of numbers separated by "," in the following form: 1.2223232323232323,74.00 I want them in a String[], but I only need the number to the right of the comma (74.00). The list has about 10,000 different lines like the one above. Right now I'm using String.split(","), which gives me: System.out.println(String[1]) = 1.2223232323232323 74.00 Why does it not split into two different indexes? I thought split should give: System.out.println(String[1]) = 1.2223232323232323 System.out.println(String[2]) = 74.00 But String[] array = string.split(",") produces one index with both values separated by a newline, and I only need 74.00. I assume I need to use a regex, which is kind of Greek to me. Could someone help me out :)?
If it's in a file: Scanner sc = new Scanner(new File("...")); sc.useDelimiter("(\r?\n)?.*?,"); while (sc.hasNext()) System.out.println(sc.next()); If it's all one giant string, separated by new-lines: String oneGiantString = "1.22,74.00\n1.22,74.00\n1.22,74.00"; Scanner sc = new Scanner(oneGiantString); sc.useDelimiter("(\r?\n)?.*?,"); while (sc.hasNext()) System.out.println(sc.next()); If it's just a single string for each: String line = "1.2223232323232323,74.00"; System.out.println(line.replaceFirst(".*?,", "")); Regex explanation: (\r?\n)? means an optional new-line character. . means a wildcard. .*? means 0 or more wildcards (*? as opposed to just * means non-greedy matching, but this probably doesn't mean much to you). , means, well, ..., a comma. Reference. split for file or single string: String line = "1.2223232323232323,74.00"; String value = line.split(",")[1]; split for one giant string (also needs regex) (but I'd prefer Scanner, it doesn't need all that memory): String line = "1.22,74.00\n1.22,74.00\n1.22,74.00"; String[] array = line.split("(\r?\n)?.*?,"); for (int i = 1; i < array.length; i++) // the first element is empty System.out.println(array[i]);
Just try with: String[] parts = "1.2223232323232323,74.00".split(","); String value = parts[1]; // your 74.00
String[] strings = "1.2223232323232323,74.00".split(",");
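If the input really is one big string with one "left,right" pair per line, a regex-free sketch could go line by line and keep whatever follows the first comma. The names RightOfCommaDemo and valuesAfterComma are made up for this example, and the assumptions that lines are separated by \n or \r\n and that lines without a comma should be skipped are mine.

```java
import java.util.ArrayList;
import java.util.List;

class RightOfCommaDemo {
    // For each line, keeps only the text to the right of the first comma.
    static List<String> valuesAfterComma(String text) {
        List<String> values = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            int comma = line.indexOf(',');
            if (comma >= 0) { // assumption: lines without a comma are skipped
                values.add(line.substring(comma + 1).trim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String data = "1.2223232323232323,74.00\n1.5,80.25"; // second line is made up
        System.out.println(valuesAfterComma(data)); // [74.00, 80.25]
    }
}
```

Note that Java arrays are zero-based, so after line.split(",") the left value is at index 0 and the right value at index 1.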
Java: String splitting into multiple elements
I am having a difficult time figuring out how to split a string like the following: String str = "hi=bye,hello,goodbye,pickle,noodle"; This string was read from a text file and I need to split the string into each element between the commas. So I would need to split each element into its own string no matter what the text file reads. Keep in mind, each element could be any length and there could be any number of elements which 'hi' is equal to. Any ideas? Thanks!
Use split! String[] set = str.split(","); then access each string as you need from set[...] (so let's say you want the 3rd string, you would say: set[2]). As a test, you can print them all out: for (int i = 0; i < set.length; i++) { System.out.println(set[i]); }
If you need a somewhat more advanced approach, I suggest Guava's Splitter class: Iterable<String> split = Splitter.on(',') .omitEmptyStrings() .trimResults() .split(" bye,hello,goodbye,, , pickle, noodle "); This will get rid of leading or trailing whitespace and omit blank matches. The class has some more cool stuff in it, like splitting your String into key/value pairs.
str = str.substring(str.indexOf('=') + 1); // remove "hi=" part String[] set = str.split(",");
I'm wondering: Do you mean to split it as such: "hi=bye" "hi=hello" "hi=goodbye" "hi=pickle" "hi=noodle" Because a simple split(",") will not do this. What's the purpose of having "hi=" in your given string? Probably, if you mean to chop hi= from the front of the string, do this instead: String input = "hi=bye,hello,goodbye,pickle,noodle"; String hi[] = input.split(","); hi[0] = (hi[0].split("="))[1]; for (String item : hi) { System.out.println(item); }
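Putting the two steps together (drop everything up to the '=' and then split on commas), a runnable sketch could look like this. KeyValuesDemo and valuesOf are hypothetical names, and treating a line without '=' as a plain comma-separated list is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class KeyValuesDemo {
    // Returns the comma-separated values that follow the first '='.
    // If there is no '=', indexOf returns -1 and the whole line is split.
    static List<String> valuesOf(String line) {
        String values = line.substring(line.indexOf('=') + 1);
        return Arrays.asList(values.split(","));
    }

    public static void main(String[] args) {
        System.out.println(valuesOf("hi=bye,hello,goodbye,pickle,noodle"));
        // [bye, hello, goodbye, pickle, noodle]
    }
}
```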
How do I fill a new array with split pieces from an existing one? (Java)
I'm trying to split paragraphs of information from an array into a new one which is broken into individual words. I know that I need to use the String[] split(String regex), but I can't get this to output right. What am I doing wrong? (assume that sentences[i] is the existing array) String phrase = sentences[i]; String[] sentencesArray = phrase.split(""); System.out.println(sentencesArray[i]); Thanks!
It might be just the console output going wrong. Try replacing the last line by System.out.println(java.util.Arrays.toString(sentencesArray)); The empty-string argument to phrase.split("") is suspect too. Try passing a word boundary: phrase.split("\\b");
You are using an empty expression for splitting, try phrase.split(" ") and work from there.
This does nothing useful: String[] sentencesArray = phrase.split(""); You're splitting on the empty string, which returns an array of the individual characters in the string (on Java 7 and earlier, preceded by a leading empty string). It's hard to tell from your question/code what you're trying to do, but if you want to split into words you need something like: private static final Pattern SPC = Pattern.compile("\\s+"); . . String[] words = SPC.split(phrase); The regex will split on one or more whitespace characters, which is probably what you want.
String[] sentencesArray = phrase.split(""); The regex based on which the phrase needs to be split up is nothing here. If you wish to split it based on a space character, use: String[] sentencesArray = phrase.split(" "); // ^ Give this space
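As a rough sketch of the whitespace-based split suggested in these answers, the example below breaks one phrase into words. WordSplitDemo and words are illustrative names. The trim() and the empty-input check are assumptions added so that leading whitespace or a blank phrase does not yield an empty first element.

```java
import java.util.Arrays;

class WordSplitDemo {
    // Splits a phrase into words on runs of whitespace.
    static String[] words(String phrase) {
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return new String[0]; // "".split(...) would return [""], so guard it
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String[] result = words("  The quick  brown fox ");
        // Print the whole array rather than a single index.
        System.out.println(Arrays.toString(result)); // [The, quick, brown, fox]
    }
}
```

To process every paragraph in the sentences array from the question, the same method could be called once per element inside a loop.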