Split numbers from String array in Java

String Equation = input.nextLine();
String[] number = Equation.split("\d+");
I want to extract all the numbers from the string and put them into number. How would I do it?
For example, if Equation is: 2x^4 - 45y^4
then number should contain: {2, 4, 45, 4}

You can split on one or more non-digit characters - \\D+:
String[] number = equation.split("\\D+");
In a Java string literal the backslash itself must be escaped, so \d and \D are written as \\d and \\D, and so on. Also, please follow Java naming conventions: your variable should be named equation, not Equation.
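One caveat worth sketching: when the input starts with a non-digit, split("\\D+") returns an empty string as its first element. The snippet below is only an illustration; the class name DigitSplit and the helper digitRuns are made up for this example and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DigitSplit {
    // Hypothetical helper: split on runs of non-digits and drop empty tokens.
    static List<String> digitRuns(String equation) {
        List<String> runs = new ArrayList<>();
        for (String token : equation.split("\\D+")) {
            if (!token.isEmpty()) {
                runs.add(token);
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Input starts with a digit: no empty token.
        System.out.println(Arrays.toString("2x^4 - 45y^4".split("\\D+"))); // [2, 4, 45, 4]
        // Input starts with a non-digit: the first token is the empty string.
        System.out.println(Arrays.toString("x^2 + 3".split("\\D+")));      // [, 2, 3]
        // Filtering the empty token gives only the numbers.
        System.out.println(digitRuns("x^2 + 3"));                          // [2, 3]
    }
}
```

If your equations always begin with a coefficient, the plain split is enough; the filter only matters for inputs like x^2 + 3.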

What I'd do is String.replaceAll all non-digits with whitespace. Then String.split by whitespace.
package com.sandbox;
import java.util.Arrays;
public class Sandbox {
    public static void main(String[] args) {
        String input = "2x^4 - 45y^4";
        // replace every non-digit with a space
        input = input.replaceAll("\\D", " ");
        // then split on runs of whitespace
        String[] parts = input.split("\\s+");
        System.out.println(Arrays.toString(parts));
    }
}
This will print "[2, 4, 45, 4]"
Now that I understand @RohitJain's answer, it seems I'm including an unnecessary step. I guess I'll leave this here anyway since it does work, but I recommend his solution. His solution splits on all non-digits. Since split excludes the delimiter, this also removes the non-digits.

Related

How can I split a string without knowing the split characters a priori?

For my project I have to read various input graphs. Unfortunately, the input edges do not all have the same format. Some of them are comma-separated, others are tab-separated, etc. For example:
File 1:
123,45
67,89
...
File 2:
123 45
67 89
...
Rather than handling each case separately, I would like to automatically detect the split characters. Currently I have developed the following solution:
String str = "123,45";
String splitChars = "";
for(int i=0; i < str.length(); i++) {
    if(!Character.isDigit(str.charAt(i))) {
        splitChars += str.charAt(i);
    }
}
String[] endpoints = str.split(splitChars);
Basically I pick the first row and select all the non-numeric characters, then I use the generated substring as split characters. Is there a cleaner way to perform this?
split takes a regexp, so your code can fail in several ways. If the separator has a meaning in regexp (say, +), it will fail. If the line contains more than one non-digit character, it will also fail. If the line contains more than exactly 2 numbers, it will also fail. Imagine it contains hello, world: then your splitChars string becomes " , ", and your split would do nothing useful (that pattern would split the string "test , abc" into two, nothing else).
Why not make a regexp to fetch digits, and then find all sequences of digits, instead of focussing on the separators?
You're using regexps whether you want to or not, so let's make it official and use Pattern, while we are at it.
private static final Pattern ALL_DIGITS = Pattern.compile("\\d+");
// then in your split method..
Matcher m = ALL_DIGITS.matcher(str);
List<Integer> numbers = new ArrayList<Integer>();
// don't use arrays, generally. List is better.
while (m.find()) {
    numbers.add(Integer.parseInt(m.group(0)));
}
\\d+ means: one or more digits.
m.find() finds the next match (so, the next block of digits), returning false if there aren't any more.
m.group(0) retrieves the entire matched string.
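For completeness, here is a self-contained sketch of the same idea that can be compiled and run as-is. The class name EdgeParser and the method name parseEndpoints are invented for this example; the sample lines are made up to show that the separator no longer matters, even when it is a regexp metacharacter such as +.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EdgeParser {
    private static final Pattern ALL_DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: collects every run of digits in a line,
    // whatever the separator between them is.
    static List<Integer> parseEndpoints(String line) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = ALL_DIGITS.matcher(line);
        while (m.find()) {
            // Integer.parseInt throws NumberFormatException for values beyond int range.
            numbers.add(Integer.parseInt(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(parseEndpoints("123,45")); // [123, 45]
        System.out.println(parseEndpoints("67\t89")); // [67, 89]
        System.out.println(parseEndpoints("12+34"));  // [12, 34]
    }
}
```

If node ids can exceed the int range, switching to Long.parseLong and List<Long> would be the obvious adjustment.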
Split the string on \\D+ which means one or more non-digit characters.
Demo:
import java.util.Arrays;
public class Main {
    public static void main(String[] args) {
        // Test strings
        String[] arr = { "123,45", "67,89", "125 89", "678 129" };
        for (String s : arr) {
            System.out.println(Arrays.toString(s.split("\\D+")));
        }
    }
}
Output:
[123, 45]
[67, 89]
[125, 89]
[678, 129]
Why not split with [^\d]+ (every run of non-digits):
for (String n : "123,456 789".split("[^\\d]+")) {
    System.out.println(n);
}
Result:
123
456
789

Need to split character numbers with comma and space using Java

Hi, I am relatively new to Java. I have to check whether an amount such as AED 555,439,972 /yr is less than another amount, so I first tried to split it using the code below:
public static void main(String[] args) {
    String value= "AED 555,439,972 /yr";
    String[] tokens = value.split("\b");
    int[] numbers = new int[tokens.length];
    for (int i = 0; i < tokens.length; i++) {
        numbers[i] = Integer.parseInt(tokens[i]);
    }
    System.out.println(numbers);
}
but I am getting Exception in thread "main" java.lang.NumberFormatException: For input string: "AED 555,439,972 /yr".
Appreciate if someone can help me to solve the problem.
I assume you need to get the numeric value from the string.
First, use the following to remove all non-digit characters.
value.replaceAll("\\D", "")
\\D stands for non-digit character. Once every such character is replaced with empty string (which means those are removed), use Integer.parseInt on it. (Use Long.parseLong if the values can be out of Integer's range.)
In your code, you are trying to split the string at word boundaries (which is also not done correctly; you need to escape it as \\b). That would give you an array with the string split at each word boundary (after AED, after the space following AED, after the first 3 digits, after the first comma and so on), and converting each of the resulting elements into an integer would fail at AED.
In short, the following is what you want:
Integer.parseInt(value.replaceAll("\\D", ""));
There are a few things wrong with your code:
String[] tokens = value.split("\b");
The "\" needs to be escaped (as written, "\b" is a backspace character, not a word boundary), like this:
String[] tokens = value.split("\\b");
This will split your input on word boundaries. Only some of the elements in the tokens array will be valid numbers; the others will result in a NumberFormatException. More specifically, at index 2 you'll have "555", at index 4 you'll have "439", and at index 6 you'll have "972". These can be parsed to integers, the others cannot.
I found a solution on Stack Overflow itself:
public static void main(String[] args) {
    String line = "AED 555,439,972 /yr";
    String digits = line.replaceAll("[^0-9]", "");
    System.out.println(digits);
}
The output is 555439972
You are going about it the wrong way. It's a single formatted number, so treat it that way.
Remove all non-digit characters, then parse as an integer:
int amount = Integer.parseInt(value.replaceAll("\\D", ""));
Then you'll have the number of dirhams per year, which you can compare to other values.
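A minimal sketch of the comparison, assuming whole-number amounts (a decimal part such as .50 would be merged into the digits by this approach). The class name AmountCompare, the helper parseAmount and the second amount are made up for illustration; long is used only to leave headroom for larger values.

```java
class AmountCompare {
    // Hypothetical helper: strips every non-digit character and parses the rest.
    // Assumes the amount has no decimal part.
    static long parseAmount(String formatted) {
        return Long.parseLong(formatted.replaceAll("\\D", ""));
    }

    public static void main(String[] args) {
        long first = parseAmount("AED 555,439,972 /yr");
        long second = parseAmount("AED 600,000,000 /yr"); // made-up value to compare against
        System.out.println(first);          // 555439972
        System.out.println(first < second); // true
    }
}
```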

Not getting expected output from java String split method

I have a string, say "1.0+20*30.2-4.0/10.1", which I want to split so that I get a string array like
strArr = {"1.0", "20", "30.2", "4.0", "10.1"}
I wrote the following code for this
public class StringSplitDemo {
    public static void main(String[] args) {
        String str = "1.0+20*30.2-4.0/10.1";
        String[] strArr = (str.split("[\\+-/\\*]"));
        for(int i=0;i<strArr.length;i++)
            System.out.print(strArr[i]+" ");
    }
}
Rather than printing the output I expected, i.e. 1.0 20 30.2 4.0 10.1, it prints
output: 1 0 20 30 2 4 0 10 1
which seems to split the string around "." as well, even though I didn't include it in the regex pattern.
What am I missing here?
What is the right way to do this?
Use
String str = "1.0+20*30.2-4.0/10.1";
String[] strArr = str.split("[-+/*]");
System.out.print(Arrays.toString(strArr));
The [\\+-/\\*] character class matches more than just the 4 chars you defined, because the - creates a range from + to /, which covers +, the comma, -, . and /.
You could fix the regex by escaping the hyphen, but the pattern looks much cleaner when you put the - at the start (or end) of the character class, where you do not have to escape it as there, the hyphen is treated as a literal -.
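To see the range effect in isolation, here is a small sketch. The class name HyphenRange and the helper splitOperands are hypothetical names chosen for this example.

```java
import java.util.Arrays;

class HyphenRange {
    // Hypothetical helper: splits an expression on the four operators,
    // with the hyphen placed first so it is a literal '-'.
    static String[] splitOperands(String expression) {
        return expression.split("[-+/*]");
    }

    public static void main(String[] args) {
        String str = "1.0+20*30.2-4.0/10.1";

        // '-' between '+' and '/' forms a range, so '.' becomes a delimiter too.
        System.out.println(Arrays.toString(str.split("[\\+-/\\*]")));
        // prints [1, 0, 20, 30, 2, 4, 0, 10, 1]

        // With the hyphen first, only the four operators split the string.
        System.out.println(Arrays.toString(splitOperands(str)));
        // prints [1.0, 20, 30.2, 4.0, 10.1]

        System.out.println(".".matches("[+-/]")); // true: '.' lies inside the range
        System.out.println(".".matches("[-+/]")); // false: '-' is a literal here
    }
}
```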
The issue was in the regex.
The unescaped - between \\+ and / is read as a range, so you need to escape the hyphen (or move it to the start or end of the class):
String[] strArr = (str.split("[\\-/\\*\\+]"));
By the way, once the hyphen comes first, no escaping is required at all. It can simply be written as
String[] strArr = (str.split("[-/*+]"));

How to write a regex to split a String in this format?

I want to use [,.!?;~] to split a string, but I want each delimiter to stay in place. For example:
This is the example, but it is not enough
To
[This is the example,, but it is not enough] // length=2
[0]=This is the example,
[1]=but it is not enough
As you can see, the comma is still in place. I did this with the regex (?<=([,.!?;~])+). However, if some special word (e.g. but) comes after the [,.!?;~], I do not want to split at that point. For example:
I want this sentence to be split into this form, but how to do. So if
anyone can help, that will be great
To
[0]=I want this sentence to be split into this form, but how to do.
[1]=So if anyone can help,
[2]=that will be great
As you can see, this part (form, but) is not split in the first sentence.
I've used:
Positive Lookbehind (?<=a)b to keep the delimiter.
Negative Lookahead a(?!b) to rule out stop words.
Notice how I've appended the regex (?!\\s*(but|and|if)) to the one you provided. You can put all the stop words you want to rule out (e.g. but, and, if) inside the parentheses, separated by the pipe symbol.
Also notice that the delimiter is still in its place.
Output
Count of tokens = 3
I want this sentence to be split into this form, but how to do.
So if anyone can help,
that will be great
Code
public class HelloWorld {
    public static void main(String[] args) {
        String str = "I want this sentence to be split into this form, but how to do. So if anyone can help, that will be great";
        // split after each delimiter, unless a stop word follows
        String[] tokensVal = str.split("(?<=([,.!?;~])+)(?!\\s*(but|and|if))");
        // prints the number of tokens
        System.out.println("Count of tokens = " + tokensVal.length);
        for (String token: tokensVal) {
            System.out.println(token);
        }
    }
}
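A possible variation, offered only as a sketch: the pattern below uses a single-character lookbehind, adds \\b so that a word such as iffy is not treated as the stop word if, and consumes the whitespace after the delimiter so the tokens come out without leading spaces. These three changes are my own assumptions about what is wanted, not part of the answer above; the names StopWordSplit and splitKeepingDelimiters are hypothetical.

```java
import java.util.regex.Pattern;

class StopWordSplit {
    // Lookbehind keeps the delimiter with the preceding token;
    // the negative lookahead suppresses the split before a stop word;
    // the trailing \s* swallows the whitespace after the delimiter.
    private static final Pattern SPLITTER =
            Pattern.compile("(?<=[,.!?;~])(?!\\s*(?:but|and|if)\\b)\\s*");

    // Hypothetical helper wrapping the pattern.
    static String[] splitKeepingDelimiters(String text) {
        return SPLITTER.split(text);
    }

    public static void main(String[] args) {
        String str = "I want this sentence to be split into this form, but how to do. "
                + "So if anyone can help, that will be great";
        String[] tokens = splitKeepingDelimiters(str);
        System.out.println("Count of tokens = " + tokens.length); // Count of tokens = 3
        for (String token : tokens) {
            System.out.println(token);
        }
    }
}
```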

Vowel regexp in jflex

So I did an exercise using jflex, which is about counting the number of words in an input text file that contain more than 3 vowels. What I ended up doing was defining a token for word, and then creating a Java function that receives this text as input and checks each character. If it's a vowel I increment a counter, and then I check whether it's greater than 3; if it is, I increment the word count.
What I want to know is whether there is a regexp that could match a word with more than 3 vowels. I think it would be a cleaner solution. Thanks in advance.
tokens
Letra = [a-zA-Z]
Palabra = {Letra}+
Very simple. Use this if you want to check that a word contains at least 3 vowels.
(?i)(?:[a-z]*[aeiou]){3}[a-z]*
You only care that it contains at least 3 vowels, so the rest can be any alphabetical characters. The regex above works both with String.matches and in a Matcher loop, since a valid word (containing at least 3 vowels) cannot be a substring of an invalid word (containing fewer than 3 vowels).
Beyond the question: for consonants, you can use character class intersection, a feature of Java regex: [a-z&&[^aeiou]]. So if you want to check for exactly 3 vowels (for String.matches):
(?i)(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*
If you are using this in a Matcher loop:
(?i)(?<![a-z])(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*(?![a-z])
Note that I have to use look-around to make sure that the string matched (exactly 3 vowels) is not part of an invalid string (possible when it has more than 3 vowels).
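As a rough sketch of how the "at least n vowels" pattern could be used from plain Java (outside jflex): the class VowelWords, its two helpers and the sample sentence are all invented for this example. Since the question asks for more than 3 vowels, the call below passes 4.

```java
import java.util.regex.Pattern;

class VowelWords {
    // Builds the "at least n vowels" pattern, with n in place of the fixed 3.
    static Pattern atLeast(int n) {
        return Pattern.compile("(?i)(?:[a-z]*[aeiou]){" + n + "}[a-z]*");
    }

    // Hypothetical helper: counts the words in text that contain at least n vowels.
    static int countWords(String text, int n) {
        Pattern pattern = atLeast(n);
        int count = 0;
        // Words are taken to be runs of ASCII letters.
        for (String word : text.split("[^a-zA-Z]+")) {
            if (pattern.matcher(word).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "The education committee rewrote a beautiful sequence";
        // education, committee, beautiful and sequence have 4 or more vowels.
        System.out.println(countWords(text, 4)); // 4
    }
}
```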
Since you wrote a Java method yourself, you can do the same thing there as follows:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class VowelChecker {
    private static final Pattern vowelRegex = Pattern.compile("[aeiouAEIOU]");
    public static void main(String[] args) {
        System.out.println(checkVowelCount("aeiou", 3));
        System.out.println(checkVowelCount("AEIWW", 3));
        System.out.println(checkVowelCount("HeLlO", 3));
    }
    // returns true as soon as the vowel count exceeds threshold
    private static boolean checkVowelCount(String str, int threshold) {
        Matcher matcher = vowelRegex.matcher(str);
        int count = 0;
        while (matcher.find()) {
            if (++count > threshold) {
                return true;
            }
        }
        return false;
    }
}
Here threshold is the number of vowels that must be exceeded (you are looking for more than 3, hence 3 in the main method). The output is as follows:
true
false
false
Hope this helps!
Thanks,
EG
I ended up using this regexp that I came up with. If anyone has a better one, feel free to post it.
Cons = [bcdBCDfghFGHjklmnJKLMNpqrstPQRSTvwxyzVWXYZ]
Vocal = [aeiouAEIOU]
Match = {Cons}*{Vocal}{Cons}*{Vocal}{Cons}*{Vocal}{Cons}*{Vocal}({Cons}*{Vocal}*|{Vocal}*{Cons}*) | {Vocal}{Cons}*{Vocal}{Cons}*{Vocal}{Cons}*{Vocal}({Cons}*{Vocal}*|{Vocal}*{Cons}*)
