I have a math expression stored as a String:
String math = "12+3=15";
I want to separate the string into the following:
int num1 (The first number, 12)
int num2 (The second number, 3)
String operator (+)
int answer (The answer, 15)
(num1 and num2 can be numbers between 0 and 20, and operator can be any of +, -, *, /)
What is the easiest way to achieve this? I was thinking about regular expressions, but I'm not sure how to do it.
Now, don't scowl at me.. You asked for the simplest solution :P
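import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String math = "12+3=15";
        // Groups: (first number)(operator)(second number)=(answer)
        Pattern p = Pattern.compile("(\\d+)(.)(\\d+)=(\\d+)");
        Matcher m = p.matcher(math);
        while (m.find()) {
            System.out.println(m.group(1)); // first number
            System.out.println(m.group(2)); // operator
            System.out.println(m.group(3)); // second number
            System.out.println(m.group(4)); // answer
        }
    }
}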
Output:
12
+
3
15
EDIT: how (\\d+)(.)(\\d+)=(\\d+) works:
\\d+ matches one or more digits.
. matches any single character (here, the operator).
() captures whatever it encloses.
(\\d+)(.)(\\d+)=(\\d+) captures one or more digits, then any single character (+, -, * etc.), then one or more digits again; it matches the "=" without capturing it, and then captures the final digits.
Captured groups are numbered from 1 to n; group 0 represents the entire match.
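To get the typed values the question asks for, the captured groups can be converted with Integer.parseInt. Here is a minimal sketch; the class name EquationParts and its fields are made up for illustration, and the operator group is narrowed from "." to [-+*/], which is an assumption about what input should be accepted:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EquationParts {
    // Assumption: the operator is restricted to + - * / instead of "." (any character).
    private static final Pattern EQUATION =
            Pattern.compile("(\\d+)([-+*/])(\\d+)=(\\d+)");

    final int num1;
    final String operator;
    final int num2;
    final int answer;

    private EquationParts(int num1, String operator, int num2, int answer) {
        this.num1 = num1;
        this.operator = operator;
        this.num2 = num2;
        this.answer = answer;
    }

    // matches() requires the whole string to fit the pattern, unlike find().
    static EquationParts parse(String math) {
        Matcher m = EQUATION.matcher(math);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not of the form a<op>b=c: " + math);
        }
        return new EquationParts(
                Integer.parseInt(m.group(1)),
                m.group(2),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        EquationParts p = parse("12+3=15");
        System.out.println(p.num1 + " " + p.operator + " " + p.num2 + " = " + p.answer);
    }
}
```

Using matches() instead of find() means stray text around the equation is rejected rather than silently skipped.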
\\b(\\d+)\\b|([+*\/-])
You can simply do this and grab the captures. See demo.
https://regex101.com/r/wU7sQ0/30
Or simply split by \\b. See demo.
https://regex101.com/r/wU7sQ0/31
var re = /\b(\d+)\b|([+*\/=-])/gm;
var str = '12+3=15';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
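Since the question is about Java, the "split by \\b" idea can be sketched there as well. This assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string; the class and method names are made up for illustration:

```java
import java.util.Arrays;

class BoundarySplit {
    // \b is a zero-width match between a word character (digit, letter, _)
    // and a non-word character, so every number and operator ends up in its own slot.
    static String[] tokens(String math) {
        return math.split("\\b");
    }

    public static void main(String[] args) {
        // On Java 8+: [12, +, 3, =, 15]
        System.out.println(Arrays.toString(tokens("12+3=15")));
    }
}
```

Note that two adjacent operators (for example "3--4") would stay together in one element, because there is no word boundary between them.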
Consider the following example. I would like to divide the String into two parts by the char 'T'
// input
String toDivideStr = "RaT15544";
// output
first= "RaT";
second = "15544";
I've tried this:
String[] first = toDivideStr.split("T",0);
Output:
first = "Ra"
second = "15544"
How do I achieve this?
What you need to do is locate the last "T", then split just after it:
int idx = StringToD.lastIndexOf("T") + 1;
String first = StringToD.substring(0, idx);
String second = StringToD.substring(idx);
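As a self-contained sketch of the lastIndexOf approach (the helper name splitAfterLast is hypothetical, and returning the whole string plus an empty second part when the delimiter is missing is just one possible choice):

```java
class SplitAfterLast {
    // Splits s right after the last occurrence of delimiter,
    // keeping the delimiter at the end of the first part.
    static String[] splitAfterLast(String s, char delimiter) {
        int idx = s.lastIndexOf(delimiter);
        if (idx < 0) {
            // Assumption: with no delimiter, everything goes into the first part.
            return new String[] { s, "" };
        }
        return new String[] { s.substring(0, idx + 1), s.substring(idx + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitAfterLast("RaT15544", 'T');
        System.out.println(parts[0]); // RaT
        System.out.println(parts[1]); // 15544
    }
}
```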
You could use a positive lookahead to assert a digit and a positive lookbehind to assert RaT.
(?<=RaT)(?=\\d)
For example:
String str = "RaT15544";
for (String element : str.split("(?<=RaT)(?=\\d)"))
System.out.println(element);
Regex demo | Java demo
You can use a positive look-ahead with the split limit parameter for this: (?=\\d)
With only T as the split pattern, the regex engine consumes the T, so neither of the resulting strings contains it. To avoid consuming any characters, we can use a non-consuming look-ahead.
(?=\\d) - This matches the position just before a digit without consuming the digit; the limit of 2 makes split stop after the first such position
public static void main(String[] args) {
String s = "RaT15544";
String[] ss = s.split("(?=\\d)", 2);
System.out.println(ss[0] + " " + ss[1]);
}
The regex below splits the string at every boundary between a digit and a non-digit, which separates the letters from the numbers.
String StringToD = "RaT15544";
String[] parts = StringToD.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
System.out.println(parts[0]);
System.out.println(parts[1]);
No question on SO addresses my particular problem. I know very little about regular expressions. I am building an expression parser in Java, using the regex classes for that purpose. I want to extract operands, arguments, operators, symbols and function names from an expression and save them to an ArrayList. Currently I am using this logic:
String string = "2!+atan2(3+9,2+3)-2*PI+3/3-9-12%3*sin(9-9)+(2+6/2)"; // For testing only; later on the expression will be provided by the user
List<String> res = new ArrayList<>();
Pattern pattern = Pattern.compile("(\\Q^\\E|\\Q/\\E|\\Q-\\E|\\Q-\\E|\\Q+\\E|\\Q*\\E|\\Q)\\E|\\Q)\\E|\\Q(\\E|\\Q(\\E|\\Q%\\E|\\Q!\\E)"); // This string is built by a function from the operator names, which means the user can add custom operators and custom functions
Matcher m = pattern.matcher(string);
int pos = 0;
while (m.find())
{
if (pos != m.start())
{
res.add(string.substring(pos, m.start()));
}
res.add(m.group());
pos = m.end();
}
if (pos != string.length())
{
addToTokens(res, string.substring(pos));
}
for(String s : res)
{
System.out.println(s);
}
Output:
2
!
+
atan2
(
3
+
9
,
2
+
3
)
-
2
*
PI
+
3
/
3
-
9
-
12
%
3
*
sin
(
9
-
9
)
+
(
2
+
6
/
2
)
The problem is that the expression can now contain a matrix in a user-defined format. I want to treat every matrix as a single operand (or as an argument, in the case of functions).
Input 1:
String input_1 = "2+3-9*[{2+3,2,6},{7,2+3,2+3i}]+9*6";
Output Should be:
2
+
3
-
9
*
[{2+3,2,6},{7,2+3,2+3i}]
+
9
*
6
Input 2:
String input_2 = "{[2,5][9/8,func(2+3)]}+9*8/5";
Output Should be:
{[2,5][9/8,func(2+3)]}
+
9
*
8
/
5
Input 3:
String input_3 = "<[2,9,2.36][2,3,2!]>*<[2,3,9][23+9*8/8,2,3]>";
Output Should be:
<[2,9,2.36][2,3,2!]>
*
<[2,3,9][23+9*8/8,2,3]>
I want the ArrayList to contain every operand, operator, argument, function and symbol at its own index. How can I achieve the desired output using a regular expression? Expression validation is not required.
I think you can try something like:
(?<matrix>(?:\[[^\]]+\])|(?:<[^>]+>)|(?:\{[^\}]+\}))|(?<function>\w+(?=\())|(\d+[eE][-+]\d+)|(?<operand>\w+)|(?<operator>[-+\/*%])|(?<symbol>.)
DEMO
The elements are captured in named capturing groups. If you don't need them, you can use the short version:
\[[^\]]+\]|<[^>]+>|\{[^\}]+\}|\d+[eE][-+]\d+|\w+(?=\()|\w+|[-+\/*%]|.
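The \[[^\]]+\]|<[^>]+>|\{[^\}]+\} part matches an opening bracket ({, [ or <), then characters other than the closing bracket of the same type, then the closing bracket (}, ] or >), so as long as brackets of the same type are not nested, there is no problem.
Implementation in Java:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {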
public static void main(String[] args) {
String[] expressions = {"2!+atan2(3+9,2+3)-2*PI+3/3-9-12%3*sin(9-9)+(2+6/2)", "2+3-9*[{2+3,2,6},{7,2+3,2+3i}]+9*6",
"{[2,5][9/8,func(2+3)]}+9*8/5","<[2,9,2.36][2,3,2!]>*<[2,3,9][23 + 9 * 8 / 8, 2, 3]>"};
Pattern pattern = Pattern.compile("(?<matrix>(?:\\[[^]]+])|(?:<[^>]+>)|(?:\\{[^}]+}))|(?<function>\\w+(?=\\())|(?<operand>\\w+)|(?<operator>[-+/*%])|(?<symbol>.)");
for(String expression : expressions) {
List<String> elements = new ArrayList<String>();
Matcher matcher = pattern.matcher(expression);
while (matcher.find()) {
elements.add(matcher.group());
}
for (String element : elements) {
System.out.println(element);
}
System.out.println("\n\n\n");
}
}
}
Explanation of alternatives:
\[[^\]]+\]|<[^>]+>|\{[^\}]+\} - match an opening bracket of a given
type, then characters which are not the closing bracket of that type
(everything but the closing bracket), and then the closing bracket of that
type,
\d+[eE][-+]\d+ - digits, followed by e or E, followed by the sign +
or -, followed by digits, to capture elements like 2e+3,
\w+(?=\() - match one or more word characters (A-Za-z0-9_) if they are
followed by (, for matching functions like sin,
\w+ - match one or more word characters (A-Za-z0-9_), for matching
operands,
[-+\/*%] - match one character from the character class, to match
operators,
. - match any other character, to match other symbols.
The order of alternatives is quite important: the last alternative . will match any character, so it needs to be the last option. The case of \w+(?=\() and \w+ is similar: the second one matches everything the first one does, so the function alternative must come first. However, if you don't want to distinguish between functions and operands, \w+ alone will be enough for both.
In the longer example, the (?<name> ... ) part in every alternative is a named capturing group, and you can see in the demo how it sorts the matched fragments into groups like operand, operator, function, etc.
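To show how the named groups can be read back in Java, here is a small sketch that labels each token with the group that matched it. It uses only a subset of the alternatives (function, operand, operator, symbol; the matrix and exponent alternatives are left out for brevity), and the class name TokenKinds and the "kind:text" output format are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenKinds {
    // Checked in this order; it mirrors the order of the alternatives in the pattern.
    private static final String[] GROUPS = {"function", "operand", "operator", "symbol"};

    private static final Pattern TOKEN = Pattern.compile(
            "(?<function>\\w+(?=\\())"
            + "|(?<operand>\\w+)"
            + "|(?<operator>[-+/*%])"
            + "|(?<symbol>.)");

    // Returns entries such as "function:sin" or "operator:*".
    static List<String> classify(String expression) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(expression);
        while (m.find()) {
            for (String name : GROUPS) {
                // group(name) is null for alternatives that did not take part in this match.
                if (m.group(name) != null) {
                    out.add(name + ":" + m.group(name));
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String token : classify("2*sin(9-PI)")) {
            System.out.println(token);
        }
    }
}
```

Because exactly one alternative matches per find(), at most one of the named groups is non-null each time.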
With regular expressions you cannot match balanced brackets nested to an arbitrary depth.
For example, in your second example {[2,5][9/8,func(2+3)]} you need to match the opening brace with the closing brace, but to do that you need to keep track of how many inner braces, brackets and parentheses have been opened and closed in between. That cannot be done with regular expressions.
If, on the other hand, you simplify your problem to remove any requirement for balancing, then you can probably handle it with regular expressions.
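If balancing is needed, a small hand-written scanner that counts bracket depth is one way around the limitation. The sketch below is only an illustration under stated assumptions: [, { and < always open a matrix (so < and > are never comparison operators), operands are runs of letters, digits, dots and underscores, and every other character is a one-character token. The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class DepthTokenizer {
    // Assumption: these characters only ever delimit matrices.
    private static final String OPEN = "[{<";
    private static final String CLOSE = "]}>";

    private static boolean isOperandChar(char c) {
        return Character.isLetterOrDigit(c) || c == '.' || c == '_';
    }

    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (OPEN.indexOf(c) >= 0) {
                // Consume up to the bracket that brings the depth back to zero.
                int depth = 0;
                do {
                    char d = s.charAt(i++);
                    if (OPEN.indexOf(d) >= 0) {
                        depth++;
                    } else if (CLOSE.indexOf(d) >= 0) {
                        depth--;
                    }
                } while (depth > 0 && i < s.length());
            } else if (isOperandChar(c)) {
                // Numbers, constants and function names.
                while (i < s.length() && isOperandChar(s.charAt(i))) {
                    i++;
                }
            } else {
                // Operators, parentheses, commas and other symbols.
                i++;
            }
            tokens.add(s.substring(start, i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String t : tokenize("{[2,5][9/8,func(2+3)]}+9*8/5")) {
            System.out.println(t);
        }
    }
}
```

The scanner does no validation: if a matrix is never closed, the rest of the input simply becomes one token.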
I'm experiencing some problems with Java regular expressions.
I have a program that reads through an HTML file and replaces any string enclosed in #VR# markers, i.e. #VR#Test1 2 3 4#VR#
However, my issue is that if a line contains more than one string surrounded by #VR#, it does not match them separately. It matches from the leftmost #VR# to the rightmost #VR# on the line and takes everything in between.
For example:
#VR#Google#VR#
My code would match
URL-GOES-HERE#VR#" target="_blank" style="color:#f4f3f1; text-decoration:none;" title="ContactUs">#VR#Google
Here is my Java code. Would appreciate if you could help me to solve this:
Pattern p = Pattern.compile("#VR#.*#VR#");
Matcher m;
Scanner scanner = new Scanner(htmlContent);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
m = p.matcher(line);
StringBuffer sb = new StringBuffer();
while (m.find()) {
String match_found = m.group().replaceAll("#VR#", "");
System.out.println("group: " + match_found);
}
}
I tried replacing m.group() with m.group(0) and m.group(1) but nothing. Also m.groupCount() always returns zero, even if there are two matches as in my example above.
Thanks, your help will be very much appreciated.
Your problem is that .* is "greedy"; it will try to match as long a substring as possible while still letting the overall expression match. So, for example, in #VR# 1 #VR# 2 #VR# 3 #VR#, it will match 1 #VR# 2 #VR# 3.
The simplest fix is to make it "non-greedy" (matching as little as possible while still letting the expression match), by changing the * to *?:
Pattern p = Pattern.compile("#VR#.*?#VR#");
Also m.groupCount() always returns zero, even if there are two matches as in my example above.
That's because m.groupCount() returns the number of capture groups (parenthesized subexpressions, whose corresponding matched substrings are retrieved using m.group(1), m.group(2) and so on) in the underlying pattern. In your case, your pattern has no capture groups, so m.groupCount() returns 0.
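Putting both points together, here is a short sketch with a non-greedy quantifier and one capture group, so that m.group(1) holds the text between the markers without any extra replaceAll. The class name, method name and sample line are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VrMarkers {
    // .*? stops at the nearest closing #VR#; the parentheses add capture group 1.
    private static final Pattern VR = Pattern.compile("#VR#(.*?)#VR#");

    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = VR.matcher(line);
        while (m.find()) {
            found.add(m.group(1)); // text between the markers, markers excluded
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "<a href=\"#VR#URL-GOES-HERE#VR#\">#VR#Google#VR#</a>";
        System.out.println(extract(line));                // [URL-GOES-HERE, Google]
        System.out.println(VR.matcher(line).groupCount()); // 1
    }
}
```

groupCount() is 1 here regardless of how many matches the line contains, because it counts groups in the pattern, not matches in the input.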
You can try the regular expression:
#VR#(((?!#VR#).)+)#VR#
Demo:
private static final Pattern REGEX_PATTERN =
Pattern.compile("#VR#(((?!#VR#).)+)#VR#");
public static void main(String[] args) {
String input = "#VR#Google#VR# ";
System.out.println(
REGEX_PATTERN.matcher(input).replaceAll("$1")
); // prints "Google "
}
I am working on a personal project and I want to take in user input that looks like this:
1.0+2.5+3--4
and format it to something like this :
1.0 + 2.5 + 3 - -4
So far I am using .replace("+", " + ") and doing the same for all of the operators, but the problem is that it turns the user input into this:
1.0 + 2.5 + 3 - - 4
Is there a way to keep the negative signs attached to their numbers? I want to do this so I can parse the numbers into doubles and add and subtract them later on.
my code for it :
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class StringMan {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
String check = "-a1 +a2 + a3 +-a5";
check = check.replace("--", "+");
System.out.println(check);
Pattern pattern = Pattern.compile("\\s+");
Matcher matcher = pattern.matcher(check);
boolean expr = matcher.find();
String str = matcher.replaceAll(" ");
System.out.println(str);
}
}
output is:
-a1 +a2 - a3 +-a5
-a1 +a2 - a3 +-a5
the problem is I want the output to look like this:
-a1 + a2 - a3 + -a5
In this specific case, you can handle -- by just replacing them with +:
Take input as a string from the user
Remove all white space
Replace all -- with +
Continue parsing as desired
I would recommend using regular expressions and their "group" functionality. I would first remove all whitespace to make things easier: that takes it out of the equation, so there is one less thing to deal with. And obviously I would recommend simplifying the string, replacing "--" with "+", "*+" with "*" and so on.
Now you can use a regex on your cleaned-up string.
Pattern firstPat = Pattern.compile("(((\\+|-)?)\\d+(\\.\\d+)?)"); // for matching the first number; the leading sign is optional
Pattern remainingPat = Pattern.compile("(\\+|-)(\\d+(\\.\\d+)?)"); // for the remaining numbers; the leading sign is mandatory
Pattern remainingPatWithExtOps = Pattern.compile("(\\*|/|\\+|-)(-?\\d+(\\.\\d+)?)"); // for the remaining numbers, accommodating multiply and divide with negative signs (positive signs should have been cleaned out)
Matcher match = firstPat.matcher(inputString);
Now you can iterate through the string using the match.find() method. With the "remaining" patterns, use match.group(1) to get the sign/operation and match.group(2) to get the number...
So...
double firstNum;
boolean firstNumSigned = false;
if (match.find())
{
    // group(1) is the whole first number; parseDouble accepts a leading sign.
    // Check for NumberFormatException here.
    firstNum = Double.parseDouble(match.group(1));
    // group(2) is the optional sign; it is the empty string when no sign is present
    String sign = match.group(2);
    firstNumSigned = sign.equals("+") || sign.equals("-");
}
else
{ // no match means the input was probably invalid....
    throw new IllegalArgumentException("What the heck were you thinking inputting that?!");
}
match = remainingPat.matcher(inputString); // use our other pattern for the remaining numbers
if (firstNumSigned)
{
    match.find(); // a signed first number also matches here; skip it, since we already have the first number
}
double tmpRemainingNum;
String operation;
while (match.find())
{
    operation = match.group(1);
    tmpRemainingNum = Double.parseDouble(match.group(2));
    // Do what you want with these values now, until match.find() returns false and you are done
}
PS: the code is not tested. I'm fairly confident of the regex, but I'm not 100% sure about the group numbering in the first pattern, so you might need to experiment.
Start by replacing -- with +, which is mathematically equivalent. Or start by replacing -- with - -, which would keep - and 4 together.
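If the goal is only the spacing from the question (binary operators surrounded by spaces, a unary minus left attached to its number), one possible sketch uses a lookbehind: an operator counts as binary only when the character before it is a word character, a dot or a closing parenthesis. The class and method names are made up, and the rule is an assumption that covers the examples in the question rather than every possible expression:

```java
class OperatorSpacing {
    // (?<=[\w.)]) : the operator must directly follow an operand, so a leading
    // sign or a sign right after another operator is left untouched.
    static String format(String input) {
        return input.replaceAll("(?<=[\\w.)])\\s*([-+*/])\\s*", " $1 ");
    }

    public static void main(String[] args) {
        System.out.println(format("1.0+2.5+3--4"));       // 1.0 + 2.5 + 3 - -4
        System.out.println(format("-a1 +a2 + a3 +-a5"));  // -a1 + a2 + a3 + -a5
    }
}
```

The formatted string can then be split on single spaces to get numbers and operators in alternating positions. A unary minus that is already separated from its number by a space (as in "3 - - 4") is not reattached by this pattern.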
Check this: it reads the operands (names made of letters and digits) that sit between operators like *, --, - and +.
public static void main(String[] args) {
// TODO Auto-generated method stub
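// A letter, then any further letters, digits or underscores; operators are kept out of the class so that they end each match
final Pattern remainingPatWithExt=Pattern.compile("(\\p{L}\\p{M}*)[\\p{L}\\p{M}0-9_]*");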
String check = "a1+a2+--a7+ a3 +-a5";
Matcher matcher = remainingPatWithExt.matcher(check);
while( matcher.find())
{
System.out.println(matcher.group());
//use matcher.group(0) or matcher.group(1)
}
}
output
a1
a2
a7
a3
a5
What is the regular expression for . and .. ?
if(key.matches(".")) {
do something
}
The matches method accepts a String, which it treats as a regular expression. Now I need to remove all dots inside my Map.
. matches any character so needs escaping i.e. \., or \\. within a Java string (because \ itself has special meaning within Java strings.)
You can then use \.\. or \.{2} to match exactly 2 dots.
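A quick sketch of the difference between the escaped and unescaped dot with String.matches (the class and method names are made up for illustration):

```java
class DotMatching {
    // "\\." in Java source is the regex \. which matches a literal dot only.
    static boolean isSingleDot(String key) {
        return key.matches("\\.");
    }

    // Exactly two literal dots.
    static boolean isDoubleDot(String key) {
        return key.matches("\\.{2}");
    }

    public static void main(String[] args) {
        System.out.println("a".matches("."));   // true: unescaped . matches any character
        System.out.println(isSingleDot("a"));   // false
        System.out.println(isSingleDot("."));   // true
        System.out.println(isDoubleDot(".."));  // true
        System.out.println(isDoubleDot("."));   // false
    }
}
```

matches() always tests the whole string, so isSingleDot("..") is false as well.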
...
[.]{1}
or
[.]{2}
?
[+*?.] Most special characters have no meaning inside the square brackets. This expression matches any of +, *, ? or the dot.
Use String.replace() if you just want to remove the dots from the string. An alternative is to use Pattern and Matcher with a StringBuilder; this gives you more flexibility, as you can find the groups that are between dots. If using the latter, I would recommend that you skip empty entries by matching whole runs of dots with "\\.+".
public static int count(String str, String regex) {
int i = 0;
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
while (m.find()) {
m.group();
i++;
}
return i;
}
public static void main(String[] args) {
int i = 0, j = 0, k = 0;
String str = "-.-..-...-.-.--..-k....k...k..k.k-.-";
// this will just remove dots
System.out.println(str.replaceAll("\\.", ""));
// this will just remove sequences of ".." dots
System.out.println(str.replaceAll("\\.{2}", ""));
// this will just remove sequences of dots, and gets
// multiple of dots as 1
System.out.println(str.replaceAll("\\.+", ""));
/* for this to be more obvious, consider following */
System.out.println(count(str, "\\."));
System.out.println(count(str, "\\.{2}"));
System.out.println(count(str, "\\.+"));
}
The output will be:
--------kkkkk--
-.--.-.-.---kk.kk.k-.-
--------kkkkk--
21
7
11
You should use contains, not matches:
if(nom.contains("."))
System.out.println("OK");
else
System.out.println("Bad");