Java regex, delete content to the left of comma - java

I have a string with a bunch of numbers separated by "," in the following form:
1.2223232323232323,74.00
I want them in a String[], but I only need the number to the right of the comma (74.00). The list has about 10,000 different lines like the one above. Right now I'm using String.split(","), which gives me:
System.out.println(String[1]) =
1.2223232323232323
74.00
Why does it not split into two different indexes? I thought split should give this:
System.out.println(String[1]) = 1.2223232323232323
System.out.println(String[2]) = 74.00
Instead, String[] array = string.split(",") produces a single element with both values separated by a newline.
I only need 74.00. I assume I need to use a regex, which is kind of Greek to me. Could someone help me out? :)

If it's in a file:
Scanner sc = new Scanner(new File("..."));
sc.useDelimiter("(\r?\n)?.*?,");
while (sc.hasNext())
System.out.println(sc.next());
If it's all one giant string, separated by new-lines:
String oneGiantString = "1.22,74.00\n1.22,74.00\n1.22,74.00";
Scanner sc = new Scanner(oneGiantString);
sc.useDelimiter("(\r?\n)?.*?,");
while (sc.hasNext())
System.out.println(sc.next());
If it's just a single string for each:
String line = "1.2223232323232323,74.00";
System.out.println(line.replaceFirst(".*?,", ""));
Regex explanation:
(\r?\n)? means an optional new-line character.
. means a wildcard.
.*? means 0 or more wildcards (*? as opposed to just * means non-greedy matching, but this probably doesn't mean much to you).
, means, well, ..., a comma.
Reference.
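To make the greedy versus non-greedy point above concrete, here is a small sketch. The class name GreedyDemo, the two helper names, and the three-field sample line are made up for illustration; the lines in the question have only one comma, in which case both patterns give the same result.

```java
public class GreedyDemo {
    // Non-greedy: .*? stops at the FIRST comma it can reach.
    static String dropThroughFirstComma(String line) {
        return line.replaceFirst(".*?,", "");
    }

    // Greedy: .* runs as far as it can, so it stops at the LAST comma.
    static String dropThroughLastComma(String line) {
        return line.replaceFirst(".*,", "");
    }

    public static void main(String[] args) {
        String line = "1.22,74.00,99.5"; // hypothetical line with two commas
        System.out.println(dropThroughFirstComma(line)); // 74.00,99.5
        System.out.println(dropThroughLastComma(line));  // 99.5
    }
}
```

With a single comma per line, as in the question, either form leaves just 74.00.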
split for file or single string:
String line = "1.2223232323232323,74.00";
String value = line.split(",")[1];
split for one giant string (also needs regex) (but I'd prefer Scanner, it doesn't need all that memory):
String line = "1.22,74.00\n1.22,74.00\n1.22,74.00";
String[] array = line.split("(\r?\n)?.*?,");
for (int i = 1; i < array.length; i++) // the first element is empty
System.out.println(array[i]);
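Since the question asks for a String[] holding only the right-hand values, here is a rough sketch that collects them without a regex for the comma itself. It assumes the whole list fits in one string with one pair per line; the class name RightOfComma, the method name rightValues, and the second sample line are illustrative only.

```java
import java.util.Arrays;

public class RightOfComma {
    // Returns the text after the first comma of every line in the input.
    static String[] rightValues(String text) {
        String[] lines = text.split("\\R"); // \R matches any line break (Java 8+)
        String[] values = new String[lines.length];
        for (int i = 0; i < lines.length; i++) {
            // indexOf returns -1 when there is no comma, so the whole line is kept
            values[i] = lines[i].substring(lines[i].indexOf(',') + 1);
        }
        return values;
    }

    public static void main(String[] args) {
        String text = "1.2223232323232323,74.00\n1.22,75.50"; // made-up sample data
        System.out.println(Arrays.toString(rightValues(text))); // [74.00, 75.50]
    }
}
```

For a large file, the Scanner approach above avoids holding everything in memory at once; this version trades that for simplicity.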

Just try with:
String[] parts = "1.2223232323232323,74.00".split(",");
String value = parts[1]; // your 74.00

String[] strings = "1.2223232323232323,74.00".split(",");

Related

How to return only first n number of words in a sentence Java

Say I have a simple sentence as below.
For example, this is what I have:
A simple sentence consists of only one clause. A compound sentence
consists of two or more independent clauses. A complex sentence has at
least one independent clause plus at least one dependent clause. A set
of words with no independent clause may be an incomplete sentence,
also called a sentence fragment.
I want only first 10 words in the sentence above.
I'm trying to produce the following string:
A simple sentence consists of only one clause. A compound
I tried this:
bigString.split(" " ,10).toString()
But it returns the same bigString wrapped in [] like an array.
Thanks in advance.
Assume bigString is a String holding your text. The first thing to do is split the string into single words.
String[] words = bigString.split(" ");
How many words would you like to extract?
int n = 10;
Put the words together:
String newString = "";
for (int i = 0; i < n && i < words.length; i++) { // stop early if there are fewer than n words
newString += (i == 0 ? "" : " ") + words[i]; // no leading space before the first word
}
System.out.println(newString);
Hope this is what you needed.
If you want to know more about regular expressions (i.e. to tell java where to split), see here: How to split a string in Java
If you use the split method with a limit (yours is 10), it won't just give you the first 10 parts and stop; it gives you the first 9 parts, and the 10th element of the array contains the rest of the input String. Printing the array with Arrays.toString then shows all of those parts, which is the whole input String inside []. (Calling toString() directly on an array only prints its type and hash code.) What you can do to achieve what you initially wanted is:
String[] myArray = bigString.split(" ", 11);
if (myArray.length == 11) myArray[10] = ""; // blank out the remainder, if there is one
String result = String.join(" ", myArray).trim(); // join the words; trim removes the trailing space left by the blank entry
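The effect of the limit argument described above can be sketched as follows. The class name FirstWords and the helper firstWords are hypothetical, and the sketch assumes words are separated by single spaces and that n is not negative.

```java
import java.util.Arrays;

public class FirstWords {
    // Returns the first n space-separated words of text, joined by single spaces.
    static String firstWords(String text, int n) {
        // Limit n + 1: up to n words, plus one final element holding the remainder.
        String[] parts = text.split(" ", n + 1);
        int count = Math.min(n, parts.length);
        return String.join(" ", Arrays.copyOf(parts, count));
    }

    public static void main(String[] args) {
        // With limit 3, the third element keeps everything that is left.
        System.out.println(Arrays.toString("a b c d".split(" ", 3))); // [a, b, c d]

        String s = "A simple sentence consists of only one clause. "
                 + "A compound sentence consists of two or more independent clauses.";
        System.out.println(firstWords(s, 10));
        // A simple sentence consists of only one clause. A compound
    }
}
```

Copying only the first n elements avoids having to blank out or trim the remainder afterwards.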
This will help you
String[] strings = Arrays.stream(bigString.split(" "))
.limit(10)
.toArray(String[]::new);
Here is exactly what you want:
String[] result = new String[10];
// regex \s matches a whitespace character: [ \t\n\x0B\f\r]
String[] raw = bigString.split("\\s", 11);
// the last entry of raw holds the rest of the sentence, so copy only the first 10 entries
System.arraycopy(raw, 0, result , 0, 10);
System.out.println(Arrays.toString(result));

Replacing Only Certain White Spaces In a String

I have a string queryInputNameString that is equal to "fir, spotted owl", and I'm trying to use replaceAll() to remove the whitespace and split() to separate the elements into the inputNameArray array wherever a comma occurs.
String noSpaces = queryInputNameString.replaceAll("\\s+","");
String[] inputNameArray = noSpaces.split("\\,");
So far the above returns:
fir
spottedowl
but I would like it to remove only the whitespace that occurs immediately before or after a comma and return this:
fir
spotted owl
How can I make my code ignore white spaces that are not preceded/followed by a comma?
Thanks.
Since split() accepts a regex as argument, you can directly do this:
String[] inputNameArray = queryInputNameString.split("\\s*\\,\\s*");
Otherwise, if you really want to replace only spaces after a comma, you can use:
String noSpaces = queryInputNameString.replaceAll(",\\s+",",");
You actually do not have to use more sophisticated regex. If you just split by comma first and then trim each array element you will get the desired result.
This approach might prove to be less efficient when dealing with a lot of data.
String[] inputArray = queryInputNameString.split(",");
for (int i = 0; i < inputArray.length; ++i) {
inputArray[i] = inputArray[i].trim();
}
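A side-by-side sketch of the two approaches from the answers above. The class and method names are made up, the sample string adds a space before the comma to exercise both sides, and the trim() call in the regex version is an addition of mine so that whitespace at the very start or end of the input is dropped too.

```java
import java.util.Arrays;

public class CommaSplit {
    // One-step version: the delimiter is a comma plus any whitespace around it.
    static String[] splitWithRegex(String input) {
        return input.trim().split("\\s*,\\s*");
    }

    // Two-step version: split on the bare comma, then trim each piece.
    static String[] splitThenTrim(String input) {
        String[] parts = input.split(",");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String query = "fir , spotted owl"; // hypothetical input with spaces on both sides
        System.out.println(splitWithRegex(query)[1]); // spotted owl
        System.out.println(Arrays.equals(splitWithRegex(query), splitThenTrim(query))); // true
    }
}
```

In both versions the space inside "spotted owl" survives because it is not adjacent to a comma.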

Scanner through a line with whitespace and comma

I am new to Java and looking for some help with Java's Scanner class. Below is the problem.
I have a text file with multiple lines, each containing multiple pairs of numbers. Each pair is written as digit,digit, for example 3,3 6,4 7,9. The pairs are separated from each other by whitespace. Below is an example from the text file.
1 2,3 3,2 4,5
2 1,3 4,2 6,13
3 1,2 4,2 5,5
What I want is to retrieve each number separately so that I can create an array of linked lists out of them. Below is what I have achieved so far.
Scanner sc = new Scanner(new File("a.txt"));
Scanner lineSc;
String line;
Integer vertix = 0;
Integer length = 0;
sc.useDelimiter("\\n"); // For line feeds
while (sc.hasNextLine()) {
line = sc.nextLine();
lineSc = new Scanner(line);
lineSc.useDelimiter("\\s"); // For Whitespace
// What should i do here. How should i scan through considering the whitespace and comma
}
Thanks
Consider using a regular expression, and data that doesn't conform to your expectation will be easily identified and dealt with.
CharSequence inputStr = "2 1,3 4,2 6,13";
String patternStr = "(\\d+),(\\d+)"; // one "number,number" pair
// Compile and use regular expression
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
while (matcher.find()) {
// Group 0 is the whole match; groups 1 and 2 are the two numbers
for (int i=0; i<=matcher.groupCount(); i++) {
String groupStr = matcher.group(i);
}
}
Group one and group two will correspond to the first and second number in each pair, respectively. The single number at the start of the line is not part of a pair, so this pattern skips it.
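A fuller sketch of the regex approach, assuming every pair is written as number,number with no spaces around the comma. The pattern (\d+),(\d+), the class name PairMatcher, and the helper pairs are my own choices for illustration; the number at the start of the line is deliberately left unmatched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PairMatcher {
    // Group 1 is the number before the comma, group 2 the number after it.
    private static final Pattern PAIR = Pattern.compile("(\\d+),(\\d+)");

    // Returns every "a,b" pair found in the line as an int[]{a, b}.
    static List<int[]> pairs(String line) {
        List<int[]> result = new ArrayList<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            result.add(new int[] {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2))
            });
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] p : pairs("2 1,3 4,2 6,13")) {
            System.out.println(p[0] + " -> " + p[1]);
        }
        // prints "1 -> 3", "4 -> 2" and "6 -> 13" on separate lines
    }
}
```

Anything on the line that is not a well-formed pair is simply skipped, which makes malformed input easy to detect by comparing the number of matches with what you expected.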
1. Use the nextLine() method of Scanner to get each entire line of text from the file.
2. Then use the BreakIterator class, via its static method getCharacterInstance(), to get the individual characters; it will automatically handle commas, spaces, etc.
3. BreakIterator also gives you many flexible methods to separate out sentences, words, etc.
For more details see this:
http://docs.oracle.com/javase/6/docs/api/java/text/BreakIterator.html
Use the StringTokenizer class. http://docs.oracle.com/javase/1.4.2/docs/api/java/util/StringTokenizer.html
//this is in the while loop
//read each line
String line=sc.nextLine();
//create StringTokenizer, parsing with space and comma
StringTokenizer st1 = new StringTokenizer(line," ,");
Then each digit is read as a string when you call nextToken() like this, if you wanted all digits in the line
while(st1.hasMoreTokens())
{
String temp=st1.nextToken();
//now if you want it as an integer
int digit=Integer.parseInt(temp);
//now you have the digit! insert it into the linkedlist or wherever you want
}
Hope this helps!
Use split(regex), which is simpler:
while (sc.hasNextLine()) {
final String[] line = sc.nextLine().split(" |,");
// What should i do here. How should i scan through considering the whitespace and comma
for (String token : line) {
int num = Integer.parseInt(token); // split returns Strings, so convert each one
// Do your job
}
}
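Putting the split approach together end to end, here is a sketch that reads the sample lines and stores the numbers of each line in a linked list. It reads from an in-memory string rather than a.txt so that it runs on its own; the class name AdjacencyReader and the helper parseLine are hypothetical, and what each number means (vertex, neighbour, length) is left to the caller.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;

public class AdjacencyReader {
    // Parses "v a,b c,d ..." into the list [v, a, b, c, d, ...].
    static LinkedList<Integer> parseLine(String line) {
        LinkedList<Integer> numbers = new LinkedList<>();
        // Split on any run of whitespace and/or commas.
        for (String token : line.trim().split("[\\s,]+")) {
            numbers.add(Integer.parseInt(token));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String data = "1 2,3 3,2 4,5\n2 1,3 4,2 6,13\n3 1,2 4,2 5,5";
        List<LinkedList<Integer>> rows = new ArrayList<>();
        Scanner sc = new Scanner(data);
        while (sc.hasNextLine()) {
            String line = sc.nextLine();
            if (!line.trim().isEmpty()) { // skip blank lines, which would not parse
                rows.add(parseLine(line));
            }
        }
        System.out.println(rows);
        // [[1, 2, 3, 3, 2, 4, 5], [2, 1, 3, 4, 2, 6, 13], [3, 1, 2, 4, 2, 5, 5]]
    }
}
```

To read the real file, the Scanner could be constructed from new File("a.txt") as in the question.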

How do I fill a new array with split pieces from an existing one? (Java)

I'm trying to split paragraphs of information from an array into a new one which is broken into individual words. I know that I need to use the String[] split(String regex), but I can't get this to output right.
What am I doing wrong?
(assume that sentences[i] is the existing array)
String phrase = sentences[i];
String[] sentencesArray = phrase.split("");
System.out.println(sentencesArray[i]);
Thanks!
It might be just the console output going wrong. Try replacing the last line by
System.out.println(java.util.Arrays.toString(sentencesArray));
The empty-string argument to phrase.split("") is suspect too. Try passing a word boundary:
phrase.split("\\b");
You are using an empty expression for splitting, try phrase.split(" ") and work from there.
This does nothing useful:
String[] sentencesArray = phrase.split("");
you're splitting on the empty string, which returns an array of the individual characters in the string (on Java 7 and earlier, the array also starts with an empty string).
It's hard to tell from your question/code what you're trying to do but if you want to split on words you need something like:
private static final Pattern SPC = Pattern.compile("\\s+");
.
.
String[] words = SPC.split(phrase);
The regex will split on one or more whitespace characters, which is probably what you want.
String[] sentencesArray = phrase.split("");
The regex you are splitting on is empty here. If you wish to split on a space character, use:
String[] sentencesArray = phrase.split(" ");
// ^ Give this space
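To fill a new array with the words from every entry of an existing array, something along these lines should work. The class name WordSplitter, the helper toWords, and the sample sentences are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordSplitter {
    // Splits every sentence on runs of whitespace and collects all words in order.
    static String[] toWords(String[] sentences) {
        List<String> words = new ArrayList<>();
        for (String sentence : sentences) {
            String trimmed = sentence.trim();
            if (trimmed.isEmpty()) {
                continue; // "".split(...) would yield a single empty string
            }
            words.addAll(Arrays.asList(trimmed.split("\\s+")));
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] sentences = { "A simple sentence.", "Two  more words" }; // sample data
        System.out.println(Arrays.toString(toWords(sentences)));
        // [A, simple, sentence., Two, more, words]
    }
}
```

A list is used as the intermediate container because the total number of words is not known until every sentence has been split.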

Splitting and assigning a string with whitespace as the delimiter

I need help splitting this string, but I can't seem to come up with the right way of doing it.
Suppose I have two numbers on a line
12 101
I would like to take the first and assign it to a variable, and then take the second and assign it to another variable. This may sound easy, but I can't come up with the right way to do it.
Split the string on space which will give you an array of strings that can be stored in two variables. If necessary, you can convert them to ints as shown below:
String text = "12 101";
String[] split= text.split("\\s+");
String first = split[0];
String second = split[1];
//if you want them as ints
int firstNum = Integer.parseInt(first);
int secondNum = Integer.parseInt(second);
String[] result = myString.split(" ");
should work. Then you can assign the values to a variable from the array if you want. Though you should note that if there are two spaces, it will create an array with length of three, the middle element being an empty string.
String s = "12 101";
String[] splitted = s.split("\\s"); // \s = any whitespace character (space, tab, newline, ...)
String[] myStringArray = myString.split(" ");
will split the string into an array myStringArray
String text = "12 101";
String[] splitted = text.split("\\s+");
System.out.println(splitted[0]);
System.out.println(splitted[1]);
Will print:
12
101
Using \s+ splits the string at every run of whitespace in your text, so consecutive whitespace characters count as a single delimiter.
Using only split(" ") will produce empty fields when there are consecutive spaces (e.g. "102  12 4", with two spaces after 102, => [102, , 12, 4]).
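The difference between the two delimiters can be checked with a short sketch. The class name SpaceSplit and its helper names are made up for illustration; the sample string has two spaces after 102.

```java
import java.util.Arrays;

public class SpaceSplit {
    // Splits on every single space, so consecutive spaces yield empty strings.
    static String[] bySingleSpace(String s) {
        return s.split(" ");
    }

    // Splits on runs of whitespace, so consecutive spaces count as one delimiter.
    static String[] byWhitespaceRun(String s) {
        return s.split("\\s+");
    }

    public static void main(String[] args) {
        String text = "102  12 4"; // two spaces after 102
        System.out.println(Arrays.toString(bySingleSpace(text)));   // [102, , 12, 4]
        System.out.println(Arrays.toString(byWhitespaceRun(text))); // [102, 12, 4]

        // Assigning the two numbers from the question to variables:
        String[] parts = byWhitespaceRun("12 101");
        int first = Integer.parseInt(parts[0]);
        int second = Integer.parseInt(parts[1]);
        System.out.println(first + second); // 113
    }
}
```

If the input may start with whitespace, calling trim() first avoids a leading empty element from either version.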
