Java function to parse all doubles from string - java

I know this has been asked before¹ but responses don't seem to cover all corner cases.
I tried implementing the suggestion¹ with the test case
String("Doubles -1.0, 0, 1, 1.12345 and 2.50")
Which should return
[-1, 0, 1, 1.12345, 2.50]:
import java.util.Scanner;
import java.util.ArrayList;
import java.util.Locale;
public class Main
{
public static void main(String[] args) {
String string = new String("Doubles -1.0, 0, 1, 1.12345 and 2.50");
System.out.println(string);
ArrayList<Double> doubles = getDoublesFromString(string);
System.out.println(doubles);
}
public static ArrayList<Double> getDoublesFromString(String string){
Scanner parser = new Scanner(string);
parser.useLocale(Locale.US);
ArrayList<Double> doubles = new ArrayList<Double>();
double currentDouble;
while (parser.hasNext()){
if(parser.hasNextDouble()){
currentDouble = parser.nextDouble();
doubles.add(currentDouble);
}
else {
parser.next();
}
}
parser.close();
return doubles;
}
}
Instead, the code above returns [1.12345, 2.5].
Did I implement it wrong? What's the fix for catching the negative value and the integers such as 0?

I would use a regex find all approach here:
String string = new String("Doubles -1.0, 0, 1, 1.12345 and 2.50");
List<String> nums = new ArrayList<>();
String pattern = "-?\\d+(?:\\.\\d+)?";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(string);
while (m.find()) {
nums.add(m.group());
}
System.out.println(nums); // [-1.0, 0, 1, 1.12345, 2.50]
By the way, your question makes use of the String constructor, which is seldom used, but is interesting to see, especially for those of us who never use it.
Here is an explanation of the regex pattern:
-? match an optional leading negative sign
\\d+ match a whole number
(?:\\.\\d+)? match an optional decimal component
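Since the question asks for a list of doubles rather than strings, each match can be passed through Double.parseDouble. This is only a minimal sketch using the same pattern as above; the class and method names (DoubleExtractor, extractDoubles) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DoubleExtractor {
    // Same pattern as in the answer: optional minus sign, digits, optional fraction
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    // Hypothetical helper: converts every match into a double
    public static List<Double> extractDoubles(String text) {
        List<Double> doubles = new ArrayList<>();
        Matcher m = NUMBER.matcher(text);
        while (m.find()) {
            doubles.add(Double.parseDouble(m.group()));
        }
        return doubles;
    }

    public static void main(String[] args) {
        System.out.println(extractDoubles("Doubles -1.0, 0, 1, 1.12345 and 2.50"));
        // prints [-1.0, 0.0, 1.0, 1.12345, 2.5]
    }
}
```

Note that this pattern does not cover exponents (1e5) or numbers without a leading digit (.5), so it assumes the input only contains plain decimals like the ones in the question.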

For your specific example, adding this right after constructing the scanner is sufficient: parser.useDelimiter("\\s|,");
The problem in your code is that tokens containing a comma, such as "-1.0,", are not recognized as valid doubles. The call above configures the scanner to treat commas as well as whitespace characters as token delimiters, so the comma is no longer part of the token and the token parses as a valid double. The empty tokens that now appear between a comma and the following space are skipped by your else branch.
I believe this is the most appropriate solution because matching all doubles is actually complex. Below is the regex that Scanner uses to do that; see how complicated it really is. Compared to splitting the string and then using Double.parseDouble, this is pretty similar but involves less custom code and, more importantly, no exception throwing, which is slow.
(([-+]?((((([0-9\p{javaDigit}]))++)|(\p{javaDigit}&&[^0]?(([0-9\p{javaDigit}]))?(\x{2c}(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}])))+))|(((([0-9\p{javaDigit}]))++)|(\p{javaDigit}&&[^0]?(([0-9\p{javaDigit}]))?(\x{2c}(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}])))+))\x{2e}(([0-9\p{javaDigit}]))+|\x{2e}(([0-9\p{javaDigit}]))++)([eE][+-]?(([0-9\p{javaDigit}]))+)?)|(((((([0-9\p{javaDigit}]))++)|(\p{javaDigit}&&[^0]?(([0-9\p{javaDigit}]))?(\x{2c}(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}])))+))|(((([0-9\p{javaDigit}]))++)|(\p{javaDigit}&&[^0]?(([0-9\p{javaDigit}]))?(\x{2c}(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}])))+))\x{2e}(([0-9\p{javaDigit}]))+|\x{2e}(([0-9\p{javaDigit}]))++)([eE][+-]?(([0-9\p{javaDigit}]))+)?)|(\Q-\E((((([0-9\p{javaDigit}]))++)|(\p{javaDigit}&&[^0]?(([0-9\p{javaDigit}]))?(\x{2c}(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}])))+))|(((([0-9\p{javaDigit}]))++)|(\p{javaDigit}&&[^0]?(([0-9\p{javaDigit}]))?(\x{2c}(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}]))(([0-9\p{javaDigit}])))+))\x{2e}(([0-9\p{javaDigit}]))+|\x{2e}(([0-9\p{javaDigit}]))++)([eE][+-]?(([0-9\p{javaDigit}]))+)?))|[-+]?0[xX][0-9a-fA-F].[0-9a-fA-F]+([pP][-+]?[0-9]+)?|(([-+]?(NaN|\QNaN\E|Infinity|\Q∞\E))|((NaN|\QNaN\E|Infinity|\Q∞\E))|(\Q-\E(NaN|\QNaN\E|Infinity|\Q∞\E)))
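Returning to the delimiter fix, here is a sketch of the question's method with that change applied. It assumes the delimiter "[\\s,]+" (one or more whitespace characters or commas), a small variation on the "\\s|," suggested above that avoids empty tokens; the class name ScannerDoubles is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

public class ScannerDoubles {
    public static List<Double> getDoublesFromString(String text) {
        List<Double> doubles = new ArrayList<>();
        // try-with-resources closes the scanner automatically
        try (Scanner parser = new Scanner(text)) {
            parser.useLocale(Locale.US);
            // Assumption: runs of whitespace and/or commas separate the tokens
            parser.useDelimiter("[\\s,]+");
            while (parser.hasNext()) {
                if (parser.hasNextDouble()) {
                    doubles.add(parser.nextDouble());
                } else {
                    parser.next(); // skip words such as "Doubles" and "and"
                }
            }
        }
        return doubles;
    }

    public static void main(String[] args) {
        System.out.println(getDoublesFromString("Doubles -1.0, 0, 1, 1.12345 and 2.50"));
        // prints [-1.0, 0.0, 1.0, 1.12345, 2.5]
    }
}
```

One consequence of treating the comma as a delimiter is that a number written with a thousands separator, such as 1,234.5, would be read as two separate values.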

First of all: I would use the regex solution, too… It's better and the following is just an alternative using split and replace/replaceAll while catching Exceptions:
public static void main(String[] args) {
// input
String s = "Doubles -1.0, 0, 1, 1.12345 and 2.50";
// split by whitespace(s) (keep in mind the commas will stay)
String[] parts = s.split("\\s+");
// create a collection to store the Doubles
List<Double> nums = new ArrayList<>();
// stream the result of the split operation and
Arrays.stream(parts).forEach(p -> {
// try to…
try {
// replace all commas and parse the value
nums.add(Double.parseDouble(p.replaceAll(",", "")));
} catch (Exception e) {
// which won't work for words like "Doubles", so print an error on those
System.err.println("Could not parse \"" + p + "\"");
}
});
// finally print all successfully parsed Double values
nums.forEach(System.out::println);
}
Output:
Could not parse "Doubles"
Could not parse "and"
-1.0
0.0
1.0
1.12345
2.5

Related

Extract double from string

I have a string that is formatted like this inputString = "!Circle(1.234)"
How do I extract just the 1.234? I have tried:
double value = Double.parseDouble(inputString.replaceAll("[^0-9]", ""));
but that would also remove the "."
Edit: What if I have something like inputString = "!Rectangle(1.2,1.3)"
or
inputString = "!Triangle(1.1,1.2,1.3)"
What do I need to do to extract the numbers first, before parsing them as doubles?
Exclude the dot from the characters your regex removes. Note that this only works when the string contains a single number.
double value = Double.parseDouble(inputString.replaceAll("[^0-9.]", ""));
Try this.
static Pattern DOUBLE_PATTERN = Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");
public static double[] extractDoubles(String input) {
return DOUBLE_PATTERN.matcher(input).results()
.mapToDouble(m -> Double.parseDouble(m.group()))
.toArray();
}
public static void main(String[] args) {
String input = "!Triangle(1.1,1.2,1.3)";
System.out.println(Arrays.toString(extractDoubles(input)));
}
output:
[1.1, 1.2, 1.3]
If I understand your problem correctly, your input string is like the following:
!Circle(1.234)
or even
{!Rectangle(1.2,1.3)}
You could "gather" all the numbers in your input string. For that you'd typically need a regular expression.
But I guess you're trying to write something that acts like an interpreter. In that case you'd need to write a state machine that interprets the whole input character by character. That is a far more complex thing to do.

How do I read txt file if I have multiple attributes?

I am trying to get a class to read my txt file with a few lines, for example:
Facial Lotion, 1 , 2, 0.1
Moisturiser Lotion, 2, 3, 0.2
Toner Lotion, 3, 4, 0.3
Aloe Vera Lotion, 4, 5, 0.4
I created a class called Lotion with the attributes name (String), productNo (int), productRating (int), and productDiscount (double), and I created another class called ListOfLotion that holds an ArrayList of Lotion.
My problem is how to get my ListOfLotion class to read the values in the txt file and put them in my ArrayList.
I tried to use indexOf to take the name up to the next comma, but I got an error:
java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 17
Also, is there any way I could separate all four values and make sure they are stored properly? For example, Facial Lotion is stored as the name and 1 is stored as productNo.
public void addListOfLotion(){
ArrayList<Lotion> lotion = new ArrayList<Lotion>();
Scanner scanner = new Scanner("Desktop/Lotion.txt");
while(scanner.hasNext()){
String readLine = scanner.nextLine();
int indexOfProductNo = readLine.indexOf(',');
int indexOfProductRating = readLine.indexOf(',');
double indexOfProductDiscount = readLine.indexOf(',');
lotion.add(new Lotion(readLine.substring(0, indexOfProductNo),0,0,0));
}scanner.close();
}
Got this error as result:
java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 17
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3319)
at java.base/java.lang.String.substring(String.java:1874)
at ListOfVenues.addListOfLotion(ListOfLotion.java:42)
Is it because I call readLine.indexOf(',') for every value, so it just stops at the first ','? Is there any way I could tell Java that the name lies between these two indexes and productNo between those two?
Thanks guys, I really appreciate it.
Since the lines are comma-separated lists, you could use split() to split each line into its individual values.
Another thing to consider is that new Scanner("Desktop/Lotion.txt") doesn't read the indicated text file but scans the given String itself. You have to create a File object first.
File input = new File("Desktop/Lotion.txt");
// new Scanner(File) throws FileNotFoundException, which must be caught or declared
Scanner scanner = new Scanner(input);
while(scanner.hasNextLine()){
String readLine = scanner.nextLine();
// split the line at the commas: name, productNo, productRating, productDiscount
String[] strArray = readLine.split(",");
// trim() removes the spaces around each value before it is parsed
int productNo = Integer.parseInt(strArray[1].trim());
int productRating = Integer.parseInt(strArray[2].trim());
double productDiscount = Double.parseDouble(strArray[3].trim());
lotion.add(new Lotion(strArray[0].trim(),productNo,productRating,productDiscount));
}
scanner.close();
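To show the split() approach end to end without depending on a file, here is a sketch that parses lines held in memory. The nested Lotion class is only a stand-in for the asker's class, and parseLine is a hypothetical helper; the field names follow the question.

```java
import java.util.ArrayList;
import java.util.List;

public class LotionParser {
    // Stand-in for the asker's Lotion class (assumed shape)
    static class Lotion {
        final String name;
        final int productNo;
        final int productRating;
        final double productDiscount;

        Lotion(String name, int productNo, int productRating, double productDiscount) {
            this.name = name;
            this.productNo = productNo;
            this.productRating = productRating;
            this.productDiscount = productDiscount;
        }
    }

    // Hypothetical helper: turns one "name, no, rating, discount" line into a Lotion
    static Lotion parseLine(String line) {
        String[] parts = line.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 values: " + line);
        }
        return new Lotion(
                parts[0].trim(),
                Integer.parseInt(parts[1].trim()),
                Integer.parseInt(parts[2].trim()),
                Double.parseDouble(parts[3].trim()));
    }

    public static void main(String[] args) {
        String[] lines = { "Facial Lotion, 1 , 2, 0.1", "Moisturiser Lotion, 2, 3, 0.2" };
        List<Lotion> lotions = new ArrayList<>();
        for (String line : lines) {
            lotions.add(parseLine(line));
        }
        for (Lotion l : lotions) {
            System.out.println(l.name + " | " + l.productNo + " | "
                    + l.productRating + " | " + l.productDiscount);
        }
    }
}
```

Keeping the parsing in its own method means the same code can be fed from scanner.nextLine() once the file is opened through a File object.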
You could use a regex:
([\w\s]+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+(?:\.\d+))
Which you could define as a constant in your class:
private static final Pattern LOTION_ENTRY =
Pattern.compile("([\\w\\s]+)\\s*,\\s*(\\d+)\\s*,\\s*(\\d+)\\s*,\\s*(\\d+(?:\\.\\d+))");
Then you can just create a Matcher for every entry and extract the groups:
Matcher matcher = LOTION_ENTRY.matcher(readLine);
if(matcher.matches()) {
String name = matcher.group(1);
int no = Integer.parseInt(matcher.group(2));
int rating = Integer.parseInt(matcher.group(3));
double discount = Double.parseDouble(matcher.group(4));
// do something
} else {
// line doesn't match pattern, throw error or log
}
A note though: parseInt() and parseDouble() can throw a NumberFormatException if the input is not valid (for example, a number too large for an int). So you'd have to catch it and act accordingly.

How to sort numbers(can be super big) in string in java most efficiently

Given a String like "4, 100, -2147483647, 1" I want to sort it like "-2147483647, 1, 4, 100". So far I have tried splitting the String and running parseInt on the resulting Strings. However, since parseInt cannot handle numbers outside the int range, it throws a NumberFormatException. What would be the most efficient way of handling this (in terms of time, precision, ...)? Thanks!
Converting the numbers to int will limit your input. If you leave them as strings, each number could have upwards of 2 billion digits.
After that, you can write an algorithm that compares the magnitudes of the numbers. The number with fewer digits is smaller. If both have the same number of digits, go through the digits from left to right until they differ; the number with the larger digit at that position is larger.
Also handle negatives, since your example contains one: a negative number is smaller than any non-negative one, and between two negatives the order is reversed.
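The comparison described above could be sketched as the comparator below. It assumes each number is a plain decimal integer with an optional leading '-', no '+' sign, no leading zeros and no "-0"; the names NumericStringSort, NUMERIC and sortNumbers are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

public class NumericStringSort {
    // Compares two integers given as decimal strings without parsing them.
    // Assumptions: optional leading '-', no '+', no leading zeros, no "-0".
    static final Comparator<String> NUMERIC = (a, b) -> {
        boolean negA = a.startsWith("-");
        boolean negB = b.startsWith("-");
        if (negA != negB) {
            return negA ? -1 : 1; // a negative number is always the smaller one
        }
        String digitsA = negA ? a.substring(1) : a;
        String digitsB = negB ? b.substring(1) : b;
        // Fewer digits means a smaller magnitude; equal lengths compare digit by digit
        int cmp = digitsA.length() != digitsB.length()
                ? Integer.compare(digitsA.length(), digitsB.length())
                : digitsA.compareTo(digitsB);
        return negA ? -cmp : cmp; // among negatives, the larger magnitude is smaller
    };

    // Hypothetical helper: sorts a comma-separated list of numbers
    static String sortNumbers(String input) {
        String[] parts = input.split("\\s*,\\s*");
        Arrays.sort(parts, NUMERIC);
        return String.join(", ", parts);
    }

    public static void main(String[] args) {
        System.out.println(sortNumbers("4, 100, -2147483647, 1"));
        // prints -2147483647, 1, 4, 100
    }
}
```

Because nothing is parsed, the length of the numbers is limited only by the maximum String length, at the cost of the input assumptions listed above.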
None of the numbers you provided in your example are out of scope for an integer. The following program would parse your provided string and sort it appropriately...
public static void main(String[] args) {
String s = "4, 100, -2147483647, 1";
String[] strArray = s.split(", ");
ArrayList<Integer> intList = new ArrayList<>();
for(int i = 0; i < strArray.length; i++) {
intList.add(Integer.parseInt(strArray[i]));
}
Collections.sort(intList);
System.out.println(intList);
}
The output from this program would be as follows....
[-2147483647, 1, 4, 100]
If you have a number larger than 2,147,483,647 or smaller than -2,147,483,648 then you can use a long or BigInteger like John Kugelman suggested.
First split the input and store the numbers as Strings in an array.
Here unsorted is the reference to that unsorted array. I have used a lambda expression as the comparator (Java 8); if you need to run this on Java 7, write a Comparator class with the same conditions and pass it along. Note that this comparator assumes non-negative numbers without leading zeros; a negative number such as -2147483647 would need separate handling of the sign.
Arrays.sort(unsorted, (left, right) -> {
if (left.length() != right.length()) {
return left.length() - right.length();
} else {
return left.compareTo(right);
}
});
return unsorted;
If the numbers don't fit in ints, try longs. As in, parseLong.
It's unlikely, but if you need something even bigger than that, use BigInteger. It can handle anything you throw at it.
Try the code below:
public static void main(String[] args) {
String s = "4, 100, -2147483647, 1";
String[] strArray = s.split(", ");
List<BigDecimal> intList = new ArrayList<>();
for (String element : strArray) {
intList.add(new BigDecimal(element));
}
Collections.sort(intList);
System.out.println(intList);
}
Based on @SumitKumarSaha's answer, in Kotlin:
list.sortWith { left, right ->
if (left.length != right.length)
left.length - right.length
else
left.compareTo(right);
}

String.replace isn't working

import java.util.Scanner;
public class CashSplitter {
public static void main(String[] args) {
Scanner S = new Scanner(System.in);
System.out.println("Cash Values");
String i = S.nextLine();
for(int b = 0;b<i.length(); b ++){
System.out.println(b);
System.out.println(i.substring(0,i.indexOf('.')+3));
i.replace(i.substring(0, i.indexOf('.') + 3), "");
System.out.println(i);
System.out.println(i.substring(0, i.indexOf('.') + 3));
}
}
}
The code should be able to take a string with multiple cash values and split it into individual values. For example, 7.32869.32 should be split into 7.32, 869.32, etc.
A String is immutable, so replace returns a new String for you to use. Try:
i = i.replace(i.substring(0, i.indexOf('.') + 3), "");
Alternatively, consider using NumberFormat:
https://docs.oracle.com/javase/7/docs/api/java/text/NumberFormat.html
There are several problems with your code:
Your loop counter runs over the characters of the string rather than over the cash values,
You cannot use replace without assigning the result back to the string,
Your code assumes that there are no identical cash values.
For the last point, if you start with 2.222.222.22, you would get only one cash value instead of three, because replace would drop all three matches.
Java offers a nice way of splitting a String on a regex:
String[] parts = i.split("(?<=[.]..)");
The regex is a look-behind that expects a dot followed by any two characters.
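As a quick check of the look-behind split, here is a small self-contained sketch. The class and method names (CashSplitDemo, splitCash) are hypothetical, and it assumes every cash value has exactly two decimal places.

```java
import java.util.Arrays;

public class CashSplitDemo {
    // Splits after every position that is preceded by a dot and two characters
    static String[] splitCash(String values) {
        return values.split("(?<=[.]..)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitCash("7.32869.32")));
        // prints [7.32, 869.32]
        System.out.println(Arrays.toString(splitCash("2.222.222.22")));
        // prints [2.22, 2.22, 2.22]
    }
}
```

Unlike the replace approach, identical cash values are kept, because split works on positions rather than on matching text.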

Using regex to define masks for numbers in Java

I am trying to define a set of rules that will compute a mask based on the number it is given. For example, I am trying to return a mask of 8472952424 for any number that starts with 12, 13, or 14, or return 847235XXXX for any number that starts with 7 or 8.
The input numbers are 4-digit integers and the return value is a String. Do I need to convert the integers to strings before I apply the regex to them? I am also not sure how to construct the expressions.
Edit
I have too many criteria to handle with a separate if statement for each case. I am matching extension numbers to masks so they can be inserted correctly into the Cisco CallManager database (in case you are curious).
Edit
This is what I have done for one of the cases but this is still not matching correctly:
public String lookupMask(int ext){
//convert to String
StringBuilder sb = new StringBuilder();
sb.append(ext);
String extString = sb.toString();
//compile and match pattern
Pattern p = Pattern.compile("^[12|13|14|15|17|19|42]");
Matcher m = p.matcher(extString);
if(m.matches()){
return "8472952424";
}
return null;
}
An example with Pattern could be this:
package test;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
public class Main {
// working Pattern
private static final Pattern PATTERN = Pattern.compile("^((1[234579])|42)");
// Your pattern won't work: although it anchors at the start of the input, a
// character class matches exactly one character out of the set between the brackets.
// Inside a class, "|" is a literal character and not an OR, so
// [12|13|14|15|17|19|42] matches any single one of 1, 2, 3, 4, 5, 7, 9 or "|".
// Alternatives such as "12" have to be written inside round brackets instead.
private static final Pattern NON_WORKING_PATTERN = Pattern.compile("^[12|13|14|15|17|19|42]");
private static final String STARTS_WITH_1_234 = "8472952424";
private static final String STARTS_WITH_ANYTHING_ELSE = "847295XXXX";
public static void main(String[] args) {
// NON_WORKING_PATTERN "works" on "33333"
System.out.println(NON_WORKING_PATTERN.matcher("33333").find());
int[] testIntegers = new int[]{1200, 1300, 1400, 1500, 1700, 1900, 4200, 0000};
List<String> results = new ArrayList<String>();
for (int test: testIntegers) {
if (PATTERN.matcher(String.valueOf(test)).find()) {
results.add(STARTS_WITH_1_234);
}
else {
results.add(STARTS_WITH_ANYTHING_ELSE);
}
}
System.out.println(results);
}
}
Output:
true
[8472952424, 8472952424, 8472952424, 8472952424, 8472952424, 8472952424, 8472952424, 847295XXXX]
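Applied to the method from the question, the fix could look like the sketch below. It keeps the asker's prefixes and return values; the class name MaskLookup is hypothetical. Besides the character class, the original also called matches(), which requires the pattern to cover the entire input, so a prefix pattern needs find() (or lookingAt()) instead.

```java
import java.util.regex.Pattern;

public class MaskLookup {
    // Alternation belongs in a group, not in a character class
    private static final Pattern PREFIX = Pattern.compile("^(?:12|13|14|15|17|19|42)");

    static String lookupMask(int ext) {
        String extString = String.valueOf(ext);
        // find() only needs the anchored prefix to match; matches() would
        // require the pattern to cover the whole four-digit string
        if (PREFIX.matcher(extString).find()) {
            return "8472952424";
        }
        return null; // no rule for this extension, as in the question
    }

    public static void main(String[] args) {
        System.out.println(lookupMask(1200)); // prints 8472952424
        System.out.println(lookupMask(3333)); // prints null
    }
}
```

With many rules, one option is to keep each prefix pattern and its mask as a pair in a list and return the mask of the first pattern that matches.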
