How to parse string with Java? - java

I am trying to make a simple calculator application that would take a string like this
5 + 4 + 3 - 2 - 10 + 15
I need Java to parse this string into an array
{5, +4, +3, -2, -10, +15}
Assume the user may enter 0 or more spaces between each number and each operator
I'm new to Java so I'm not entirely sure how to accomplish this.

You can use Integer.parseInt to get the values; splitting the string can be done with the methods of the String class. A regex could also work, but I don't know how to write those.

Take a look at String.split():
String str = "1 + 2";
System.out.println(java.util.Arrays.toString(str.split(" ")));
[1, +, 2]
Note that split takes a regular expression, so to split on "." or other characters with special meaning you have to quote (escape) them. Also, multiple spaces in a row produce empty strings in the resulting array, which you would need to skip.
This solves the simple example. For more rigorous parsing of true expressions you would want to create a grammar and use something like Antlr.
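To illustrate the point about repeated spaces, here is a minimal sketch comparing a split on a single space with a split on runs of whitespace. The class and method names (SplitWhitespaceSketch, tokens) are made up for this example and are not part of the answer above.

```java
import java.util.Arrays;

class SplitWhitespaceSketch {
    // Hypothetical helper: trims the input and splits on runs of whitespace,
    // so repeated spaces do not produce empty tokens.
    static String[] tokens(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String str = "1  +   2";
        // Splitting on exactly one space leaves empty strings between adjacent spaces.
        System.out.println(Arrays.toString(str.split(" ")));  // [1, , +, , , 2]
        // Splitting on whitespace runs yields only the real tokens.
        System.out.println(Arrays.toString(tokens(str)));      // [1, +, 2]
    }
}
```

This still requires at least one space between tokens, so an input such as "1+2" would come back as a single token.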

Let str be your line buffer.
Use a regex matcher (in Java, java.util.regex.Pattern and Matcher) with the pattern ([-+]?[ \t]*[0-9]+).
Accumulate all matches into String[] tokens.
Then, for each token in tokens:
String s[] = tokens[i].split(" +");
if (s.length > 1)
tokens[i] = s[0] + s[1];
else
tokens[i] = s[0];

You can use positive lookbehind:
String s = "5 + 4 + 3 - 2 - 10 + 15";
Pattern p = Pattern.compile("(?<=[0-9]) +");
String[] result = p.split(s);
for(String ss : result)
System.out.println(ss.replaceAll(" ", ""));

String cal = "5 + 4 + 3 - 2 - 10 + 15";
//matches combinations of '+' or '-', whitespace, number
Pattern pat = Pattern.compile("[-+]{1}\\s*\\d+");
Matcher mat = pat.matcher(cal);
List<String> ops = new ArrayList<String>();
while(mat.find())
{
ops.add(mat.group());
}
//gets first number and puts in beginning of List
ops.add(0, cal.substring(0, cal.indexOf(" ")));
for(int i = 0; i < ops.size(); i++)
{
//remove whitespace
ops.set(i, ops.get(i).replaceAll("\\s*", ""));
}
System.out.println(Arrays.toString(ops.toArray()));
//[5, +4, +3, -2, -10, +15]

Based off the input of some of the answers here, I found this to be the best solution
// input
String s = "5 + 4 + 3 - 2 - 10 + 15";
ArrayList<Integer> numbers = new ArrayList<Integer>();
// remove whitespace
s = s.replaceAll("\\s+", "");
// parse string
Pattern pattern = Pattern.compile("[-]?\\d+");
Matcher matcher = pattern.matcher(s);
// add numbers to array
while (matcher.find()) {
numbers.add(Integer.parseInt(matcher.group()));
}
// numbers
// {5, 4, 3, -2, -10, 15}
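Since the goal is a calculator, the same idea can be wrapped in a small helper that also adds the terms up. This is only a sketch under the assumption that the input contains nothing but integers, + and -; the names SignedTermParser, parseTerms and sum are invented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignedTermParser {
    // An optional sign followed by digits; Integer.parseInt accepts a leading '+' or '-'.
    private static final Pattern TERM = Pattern.compile("[-+]?\\d+");

    // Removes all whitespace, then collects every signed integer in order.
    static List<Integer> parseTerms(String expression) {
        String compact = expression.replaceAll("\\s+", "");
        List<Integer> terms = new ArrayList<>();
        Matcher m = TERM.matcher(compact);
        while (m.find()) {
            terms.add(Integer.parseInt(m.group()));
        }
        return terms;
    }

    // Adds the signed terms, which evaluates a pure +/- expression.
    static int sum(List<Integer> terms) {
        int total = 0;
        for (int t : terms) {
            total += t;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> terms = parseTerms("5 + 4 + 3 - 2 - 10 + 15");
        System.out.println(terms);       // [5, 4, 3, -2, -10, 15]
        System.out.println(sum(terms));  // 15
    }
}
```

Because the sign is kept with each number, evaluating the expression reduces to a plain sum. Multiplication, division or parentheses would need a real expression parser.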

Related

(hello-> h3o) How to replace in a String the middle letters for the number of letters replaced

I need to build a method which receives a String, e.g. "elephant-rides are really fun!", and returns another similar String; in this example the return value should be "e6t-r3s are r4y fun!" (because e-lephan-t has 6 middle letters, r-ide-s has 3 middle letters, and so on).
To get that result I need to replace the middle letters of each word with the number of letters replaced, leaving unchanged everything that isn't a letter as well as the first and the last letter of every word.
So far I've tried using a regex to split the received string into words and saving these words in an array of strings. I also have an int array in which I save the number of middle letters, but I don't know how to join both arrays and the symbols into a correct String to return.
String string="elephant-rides are really fun!";
String[] parts = string.split("[^a-zA-Z]");
int[] sizes = new int[parts.length];
int index=0;
for(String aux: parts)
{
sizes[index]= aux.length()-2;
System.out.println( sizes[index]);
index++;
}
You may use
String text = "elephant-rides are really fun!";
Pattern r = Pattern.compile("(?U)(\\w)(\\w{2,})(\\w)");
Matcher m = r.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, m.group(1) + m.group(2).length() + m.group(3));
}
m.appendTail(sb); // append the rest of the contents
System.out.println(sb);
// => e6t-r3s are r4y fun!
Here, (?U)(\\w)(\\w{2,})(\\w) matches any Unicode word char capturing it into Group 1, then captures any 2 or more word chars into Group 2 and then captures a single word char into Group 3, and inside the .appendReplacement method, the second group contents are "converted" into its length.
Java 9+:
String text = "elephant-rides are really fun!";
Pattern r = Pattern.compile("(?U)(\\w)(\\w{2,})(\\w)");
Matcher m = r.matcher(text);
String result = m.replaceAll(x -> x.group(1) + x.group(2).length() + x.group(3));
System.out.println( result );
// => e6t-r3s are r4y fun!
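Note that \w also matches digits and underscores. Since the question talks about letters only, a variation with \p{L} (any Unicode letter) may be closer to what is asked. The following is a sketch of that variation, not the answer's original code; the class and method names (LetterAbbreviator, abbreviate) are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterAbbreviator {
    // First letter, two or more middle letters, last letter.
    private static final Pattern WORD = Pattern.compile("(\\p{L})(\\p{L}{2,})(\\p{L})");

    // Replaces the middle letters of each matched run with their count.
    static String abbreviate(String text) {
        Matcher m = WORD.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) + m.group(2).length() + m.group(3);
            // quoteReplacement guards against '$' or '\' being treated specially.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("elephant-rides are really fun!")); // e6t-r3s are r4y fun!
        System.out.println(abbreviate("hello_world"));                    // h3o_w3d
    }
}
```

With the \w version, hello_world would be treated as one eleven-character word; with \p{L} the underscore separates two words.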
For the instructions you gave us, this would be sufficient:
String [] result = string.split("[\\s-]");
for (int i=0; i<result.length; i++){
result[i] = "" + result[i].charAt(0) + ((result[i].length())-2) + result[i].charAt(result[i].length()-1);
}
With your input, it creates the array [ "e6t", "r3s", "a1e", "r4y", "f2!" ]
And it works even with one or two sized words, but it gives result such as:
Input: I am a small; Output: [ "I-1I", "a0m", "a-1a", "s3l" ]
Again, for the instructions you gave us this would be legal.
Hope I helped!

confused how .split() works in Java

I have this string which I am taking in from a text file.
"1 normal 1 [(o, 21) (o, 17) (t, 3)]"
I want to take in 1, normal, 1, o, 21, 17, t, 3 in a string array.
Scanner inFile = new Scanner(new File("input.txt"));
String input = inFile.nextLine();
String[] tokens = input.split(" |\\(|\\)|\\[\\(|\\, |\\]| \\(");
for(int i =0 ; i<tokens.length; ++i)
{
System.out.println(tokens[i]);
}
Output:
1
normal
1
o
21
o
17
t
3
Why are there spaces being stored in the array?
That's not spaces, that's empty strings. Your string is:
"1 normal 1 [(o, 21) (o, 17) (t, 3)]"
It's split in the following way according to your regexp:
Token = "1"
Delimiter = " "
Token = "normal"
Delimiter = " "
Token = "1"
Delimiter = " "
Token = "" <-- empty string
Delimiter = "[("
Token = "o"
... and so on
When two adjacent delimiters appear, it's considered that there's an empty string token between them.
To fix this you may change your regexp, for example, like this:
"[ \\(\\)\\[\\,\\]]+"
Thus any number of " ()[,]" adjacent characters will be considered as a delimiter.
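A minimal sketch of this fix applied to the string from the question follows. The class and method names (DelimiterRunSplit, tokens) are made up for the example. One assumption worth stating: when the input itself starts with a delimiter, split still returns one leading empty string, so the sketch filters empty tokens out.

```java
import java.util.ArrayList;
import java.util.List;

class DelimiterRunSplit {
    // Splits on runs of space, parentheses, square brackets and commas,
    // and skips the empty token that appears if the input starts with a delimiter.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        for (String t : input.split("[ \\(\\)\\[\\,\\]]+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("1 normal 1 [(o, 21) (o, 17) (t, 3)]"));
        // [1, normal, 1, o, 21, o, 17, t, 3]
    }
}
```

Trailing empty strings are dropped by split itself, which is why the closing )] at the end needs no special handling.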
For example here:
1 [(o
At the first step it matches a single space.
At the next step it matches [(
So an empty String "" is returned between these two matches.

Parsing a string with [3:0] substring in it

I want to store two numbers from a string into two distinct variables - for example, var1 = 3 and var2 = 0 from "[3:0]". I have the following code snippet:
String myStr = "[3:0]";
if (myStr.trim().matches("\\[(\\d+)\\]")) {
// Do something.
// If it enter the here, here I want to store 3 and 0 in different variables or an array
}
Is it possible doing this with split and regular expressions?
Don't call trim(). Enhance your regex instead.
Your regex is missing the pattern for : and the second number, and you don't need to escape the ].
To capture the matched numbers, you need the Matcher:
String myStr = " [3:0] ";
Matcher m = Pattern.compile("\\s*\\[(\\d+):(\\d+)]\\s*").matcher(myStr);
if (m.matches())
System.out.println(m.group(1) + ", " + m.group(2));
Output
3, 0
You can use replaceAll and split
String myStr = "[3:0]";
if(myStr.trim().matches("\\[\\d+:\\d+\\]")) {
String[] numbers = myStr.replaceAll("[\\[\\]]","").split(":");
}
Moreover, your regExp to match the String should be \\[\\d+:\\d+\\]. If you want to avoid trim, you can add \\s* at the start and end to match optional spaces. But trim is not bad.
EDIT
As suggested by Andreas in comments,
String myStr = "[3:0]";
String regExp = "\\[(\\d+):(\\d+)\\]";
Pattern pattern = Pattern.compile(regExp);
Matcher matcher = pattern.matcher(myStr.trim());
if(matcher.find()) {
int a = Integer.parseInt(matcher.group(1));
int b = Integer.parseInt(matcher.group(2));
System.out.println(a + " : " + b);
}
OUTPUT
3 : 0
Without any regular expressions you could do this:
// this will remove the brackets [ and ] and just leave "3:0"
String numberString= myStr.trim().replace("[", "").replace("]","");
// this will split the string in everything before the : and everything after the : (so two values as an array)
String[] numbers = numberString.split(":");
// get the first value and parse it as a number "3" will become a simple 3
int firstNumber = Integer.parseInt(numbers[0]) ;
// get the second value and parse it from "0" to a plain 0
int secondNumber = Integer.parseInt(numbers[1]);
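Be careful when parsing numbers, depending on your input string and what other possibilities there might be: Integer.parseInt accepts leading zeros (so "3:02" yields 3 and 2), but it throws a NumberFormatException if a part is empty, contains whitespace or other non-digit characters, or does not fit in an int.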
If you don't need to validate the input and simply want to get the numbers out of it, you can find indexOf(":") and take the substrings you are interested in:
from just after [ (which is at position 0) up to :
and from just after : up to ] (which is at position length of string - 1)
Your code can look like
String text = "[3:0]";
int colonIndex = text.indexOf(':');
String first = text.substring(1, colonIndex);
String second = text.substring(colonIndex + 1, text.length() - 1);
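If some validation is wanted after all, the indexOf/substring idea can be wrapped in a small method. This is a sketch only; the names BracketPairParser and parsePair are hypothetical, and throwing IllegalArgumentException for malformed input is just one possible design choice.

```java
class BracketPairParser {
    // Parses text of the form "[a:b]" (surrounding whitespace allowed) into {a, b}.
    static int[] parsePair(String text) {
        String t = text.trim();
        int colon = t.indexOf(':');
        if (!t.startsWith("[") || !t.endsWith("]") || colon < 0) {
            throw new IllegalArgumentException("Expected something like [3:0] but got: " + text);
        }
        // Integer.parseInt throws NumberFormatException (a subclass of
        // IllegalArgumentException) if either part is not a valid int.
        int first = Integer.parseInt(t.substring(1, colon).trim());
        int second = Integer.parseInt(t.substring(colon + 1, t.length() - 1).trim());
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] pair = parsePair(" [3:0] ");
        System.out.println(pair[0] + ", " + pair[1]); // 3, 0
    }
}
```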

Use closing parenthesis() as delimiters in multiple Regular Expressions

How can I use the parentheses ( and ) in these groups of delimiters?
group one :
(?<=[-+*/%])
group two :
(?=[-+*/%])
My code is :
String str = " (31 + 4) -6+ 1 % 7";
String [] result = str.split("(?<=[-+*/%])|(?=[-+*/%]) " ) ;
System.out.println(Arrays.toString(result));
I want the parentheses ( and ) to act like the other characters in the brackets [] ,
with
look-behind assertion
(?<=...)
and look-ahead assertion
(?=...)
the output is :
[ (31, +, 4), -, 6, +, 1, %, 7]
the output needed :
[ ( , 31, +, 4 , ) , -, 6, +, 1, %, 7]
Adding the round brackets to both of the character classes you have can help you achieve what you need. To remove the empty (space-only) elements, you can post-process the result with .removeAll(Collections.singleton(" ")) run against a list of strings.
Thus, you can use the following code:
String s = " (31 + 4) -6+ 1 % 7";
String[] sp = s.split("(?<=[-+*/%()])|(?=[-+*/%()])\\s*");
List<String> list = new ArrayList<String>(Arrays.asList(sp));
list.removeAll(Collections.singleton(" "));
System.out.println(list);
Another way mentioned by anubhava, with [0-9]+|[-+*/%()] regex that matches sequences of digits or the operators:
String s = " (31 + 4) -6+ 1 % 7";
Pattern pattern = Pattern.compile("[0-9]+|[-+*/%()]");
Matcher matcher = pattern.matcher(s);
List<String> res = new ArrayList<>();
while (matcher.find()){
res.add(matcher.group(0));
}

How to extract numbers from a string and get an array of ints?

I have a String variable (basically an English sentence with an unspecified number of numbers) and I'd like to extract all the numbers into an array of integers. I was wondering whether there was a quick solution with regular expressions?
I used Sean's solution and changed it slightly:
LinkedList<String> numbers = new LinkedList<String>();
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(line);
while (m.find()) {
numbers.add(m.group());
}
Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher("There are more than -2 and less than 12 numbers here");
while (m.find()) {
System.out.println(m.group());
}
... prints -2 and 12.
-? matches a leading negative sign -- optionally. \d matches a digit, and we need to write \ as \\ in a Java String though. So, \d+ matches 1 or more digits.
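Since the question asks for an array of ints, the matches can be collected and converted. The following is a sketch using the same pattern; the names IntExtractor and extractInts are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntExtractor {
    private static final Pattern INT = Pattern.compile("-?\\d+");

    // Collects every optionally negative integer in the text, in order of appearance.
    static int[] extractInts(String text) {
        Matcher m = INT.matcher(text);
        List<Integer> found = new ArrayList<>();
        while (m.find()) {
            // Throws NumberFormatException if the digits do not fit in an int.
            found.add(Integer.parseInt(m.group()));
        }
        int[] result = new int[found.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = found.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ints = extractInts("There are more than -2 and less than 12 numbers here");
        System.out.println(Arrays.toString(ints)); // [-2, 12]
    }
}
```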
What about to use replaceAll java.lang.String method:
String str = "qwerty-1qwerty-2 455 f0gfg 4";
str = str.replaceAll("[^-?0-9]+", " ");
System.out.println(Arrays.asList(str.trim().split(" ")));
Output:
[-1, -2, 455, 0, 4]
Description
[^-?0-9]+
[ and ] delimit a set of characters to be single matched, i.e., only one time in any order
^ Special identifier used in the beginning of the set, used to indicate to match all characters not present in the delimited set, instead of all characters present in the set.
+ Between one and unlimited times, as many times as possible, giving back as needed
-? One of the characters “-” and “?”
0-9 A character in the range between “0” and “9”
Pattern p = Pattern.compile("[0-9]+");
Matcher m = p.matcher(myString);
while (m.find()) {
int n = Integer.parseInt(m.group());
// append n to list
}
// convert list to array, etc
You can actually replace [0-9] with \d, but that involves double backslash escaping, which makes it harder to read.
StringBuffer sBuffer = new StringBuffer();
Pattern p = Pattern.compile("[0-9]+\\.[0-9]*|[0-9]*\\.[0-9]+|[0-9]+"); // the dot is escaped so it matches only a literal decimal point
Matcher m = p.matcher(str);
while (m.find()) {
sBuffer.append(m.group());
}
return sBuffer.toString();
This is for extracting numbers retaining the decimal
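The accepted answer detects digits but does not detect formatted numbers, e.g. 2,000, nor decimals, e.g. 4.8. For such cases use -?\\d+(,\\d+)*?\\.?\\d+? (note that this pattern needs at least two digits to match, so a lone digit such as 7 is skipped):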
Pattern p = Pattern.compile("-?\\d+(,\\d+)*?\\.?\\d+?");
List<String> numbers = new ArrayList<String>();
Matcher m = p.matcher("Government has distributed 4.8 million textbooks to 2,000 schools");
while (m.find()) {
numbers.add(m.group());
}
System.out.println(numbers);
Output:
[4.8, 2,000]
Using Java 8, you can do:
String str = "There 0 are 1 some -2-34 -numbers 567 here 890 .";
int[] ints = Arrays.stream(str.replaceAll("-", " -").split("[^-\\d]+"))
.filter(s -> !s.matches("-?"))
.mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(ints)); // prints [0, 1, -2, -34, 567, 890]
If you don't have negative numbers, you can get rid of the replaceAll (and use !s.isEmpty() in filter), as that's only to properly split something like 2-34 (this can also be handled purely with regex in split, but it's fairly complicated).
Arrays.stream turns our String[] into a Stream<String>.
filter gets rid of the leading and trailing empty strings as well as any - that isn't part of a number.
mapToInt(Integer::parseInt).toArray() calls parseInt on each String to give us an int[].
Alternatively, Java 9 has a Matcher.results method, which should allow for something like:
Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher("There 0 are 1 some -2-34 -numbers 567 here 890 .");
int[] ints = m.results().map(MatchResult::group).mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(ints)); // prints [0, 1, -2, -34, 567, 890]
As it stands, neither of these is a big improvement over just looping over the results with Pattern / Matcher as shown in the other answers, but it should be simpler if you want to follow this up with more complex operations which are significantly simplified with the use of streams.
For decimal numbers use this one (the dot is escaped so that it matches only a literal decimal point): (([0-9]+\.[0-9]*)|([0-9]*\.[0-9]+)|([0-9]+))
Extract all real numbers using this.
public static ArrayList<Double> extractNumbersInOrder(String str){
str+='a'; // sentinel non-number character so that a number at the very end of the string is also flushed
ArrayList<Double> list = new ArrayList<Double>();
String singleNum="";
for(char c:str.toCharArray()){
if(isNumber(c)){
singleNum+=c;
} else {
if(!singleNum.equals("")){ //number ended
list.add(Double.valueOf(singleNum));
System.out.println(singleNum);
singleNum="";
}
}
}
return list;
}
public static boolean isNumber(char c){
if(Character.isDigit(c)||c=='-'||c=='+'||c=='.'){
return true;
} else {
return false;
}
}
Fraction and grouping characters for representing real numbers may differ between languages. The same real number could be written in very different ways depending on the language.
The number two million in English
2,000,000.00
and in German
2.000.000,00
A method to fully extract real numbers from a given string in a language agnostic way:
public List<BigDecimal> extractDecimals(final String s, final char fraction, final char grouping) {
List<BigDecimal> decimals = new ArrayList<BigDecimal>();
//Remove grouping character for easier regexp extraction
StringBuilder noGrouping = new StringBuilder();
int i = 0;
while(i >= 0 && i < s.length()) {
char c = s.charAt(i);
if(c == grouping) {
int prev = i-1, next = i+1;
boolean isValidGroupingChar =
prev >= 0 && Character.isDigit(s.charAt(prev)) &&
next < s.length() && Character.isDigit(s.charAt(next));
if(!isValidGroupingChar)
noGrouping.append(c);
i++;
} else {
noGrouping.append(c);
i++;
}
}
//the '.' character has to be escaped in regular expressions
String fractionRegex = fraction == '.' ? "\\." : String.valueOf(fraction);
Pattern p = Pattern.compile("-?(\\d+" + fractionRegex + "\\d+|\\d+)");
Matcher m = p.matcher(noGrouping);
while (m.find()) {
String match = m.group().replace(fraction, '.'); // BigDecimal expects '.' as the decimal separator
decimals.add(new BigDecimal(match));
}
return decimals;
}
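For a single number that has already been isolated from the surrounding text, the standard library's java.text.NumberFormat can interpret locale-specific grouping and fraction characters. The sketch below shows that alternative and is not part of the method above; the class and method names (LocaleNumberSketch, parse) are assumptions, and the exact symbols depend on the locale data of the Java runtime.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class LocaleNumberSketch {
    // Parses a number written with the grouping and fraction characters of the given locale.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2.000.000,00", Locale.GERMANY)); // 2000000.0
        System.out.println(parse("2,000,000.00", Locale.US));      // 2000000.0
    }
}
```

NumberFormat.parse only reads the longest valid prefix of the string, so it is not a substitute for the extraction step when numbers are embedded in a sentence.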
If you want to exclude numbers that are contained within words, such as bar1 or aa1bb, then add word boundaries \b to any of the regex based answers. For example:
Pattern p = Pattern.compile("\\b-?\\d+\\b");
Matcher m = p.matcher("9There 9are more9 th9an -2 and less than 12 numbers here9");
while (m.find()) {
System.out.println(m.group());
}
displays (the minus sign of -2 is lost, because \b cannot match between the space and the - sign):
2
12
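If the sign should be kept, one option is to replace the word boundaries with lookarounds that only check for adjacent word characters. This is a sketch of that alternative, with invented names (StandaloneNumbers, extract), not the answer's own pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StandaloneNumbers {
    // (?<!\w) : not preceded by a word character
    // (?!\w)  : not followed by a word character
    private static final Pattern NUMBER = Pattern.compile("(?<!\\w)-?\\d+(?!\\w)");

    // Returns the numbers that are not glued to letters, keeping a leading minus sign.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = NUMBER.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("9There 9are more9 th9an -2 and less than 12 numbers here9"));
        // [-2, 12]
    }
}
```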
I would suggest checking the ASCII values to extract numbers from a String.
Suppose you have an input String such as myname12345 and you want to extract just the digits 12345. You can do so by first converting the String to a character array (called a here) and then using the following pseudocode:
for(int i=0; i < a.length; i++)
{
if( a[i] >= 48 && a[i] <= 57) // ASCII codes of '0' through '9'
System.out.print(a[i]);
}
Once the digits are extracted, append them to an array.
Hope this helps
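A runnable sketch of the character check described above, using char literals instead of numeric ASCII codes; the names DigitFilter and digitsOnly are made up for this example.

```java
class DigitFilter {
    // Keeps only the ASCII digits '0'..'9' of the input, in their original order.
    static String digitsOnly(String input) {
        StringBuilder digits = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c >= '0' && c <= '9') {
                digits.append(c);
            }
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("myname12345")); // 12345
    }
}
```

Comparing against '0' and '9' avoids off-by-one mistakes with the numeric codes, since code 58 is the colon character rather than a digit.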
I found this expression simplest
String[] extractednums = msg.split("\\D+");
public static String extractNumberFromString(String number) {
String num = number.replaceAll("[^0-9]+", " ");
return num.replaceAll(" ", "");
}
extracts only numbers from string
