This question already has answers here:
Recursive expression evaluator using Java
(6 answers)
Closed 7 years ago.
I need to make a Java program that evaluates expressions from an input file and writes the results to an output file. It needs to handle operator precedence, unary and binary operators, and bracket matching, and it has to rely on recursion only (no stacks or queues).
I've been thinking about this all night, and it frustrates me. I'm not asking for an entire Java program written for me; I just need some guidance. I started by writing some pseudo-code, but I don't think it's any good:
Input: the text file to read each expression from.
Output: the text file that repeats each expression, as well as printing the result.
Algorithm SecondCalc()
{
input = “expressions.txt”;
output = “out.txt”;
if (input.currentLine has something)
{
line = input.currentLine;
output.write(line);
line = line.replace(“-space-”, “”);
evaluate(line);
//...to be continued
}
}
Algorithm evaluate(line)
{
for(i = 0 to line.length)
{
if(i == “(” or “)” ) exit loop;
if(i == “!”) exit loop;
if(i == “^”) exit loop;
if(i == “*” or “/” ) exit loop;
if(i == “+” or “-” ) exit loop;
if(i == “>” or “>=” or “<” or “<=” ) exit loop;
if(i == “==” or “!=” ) exit loop;
if(i == “$”) exit loop;
}
temp1 = line from index 0 to i;
temp2 = line from index i + 1 to line.length;
if(i == “!”) then evaluate(temp1!);
//...to be continued
}
Any tips would be appreciated. Thanks.
Well, the first thing I notice is that you say you want operator precedence, but your evaluate ignores it: it works first come, first served, which treats every operator as having the same precedence. If your aim is indeed to simulate operator precedence (I assume the input is expected to look like Java's expressions), then I suggest you either process certain operators before others, or rearrange the input into another notation, such as Polish notation.
In both cases I would follow a similar process: instead of one if statement after another inside a single for loop, as you have now, try one for loop after another, where each loop looks for a specific operator and "does something".
for(i = 0 to line.length)
{
if(i == “(” or “)” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “!” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “^”) doSomething;
}
for(i = 0 to line.length)
{
if(i == “*” or “/” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “+” or “-” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “>” or “>=” or “<” or “<=” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “==” or “!=” )doSomething;
}
for(i = 0 to line.length)
{
if(i == “$”)doSomething;
}
There's much to improve on, but hopefully this points you in the right direction.
I would suggest reading up on Polish notation. It's a good way to store mathematical expressions. For instance, + cos a b means cos(a) + b, whereas cos + a b means cos(a+b). There is no ambiguity. Additionally, terms to the right have precedence over terms to the left.
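To make the "no ambiguity" point concrete, here is a minimal sketch of a recursive evaluator for Polish (prefix) notation. The class name PrefixEval, the space-separated token format, and the choice of supported operators (+, -, *, / and cos) are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Sketch: evaluates space-separated prefix expressions using recursion only. */
final class PrefixEval {

    static double evaluate(String expression) {
        Iterator<String> tokens = Arrays.asList(expression.trim().split("\\s+")).iterator();
        double value = next(tokens);
        if (tokens.hasNext()) {
            throw new IllegalArgumentException("trailing tokens");
        }
        return value;
    }

    /** Consumes exactly one complete sub-expression and returns its value. */
    private static double next(Iterator<String> tokens) {
        if (!tokens.hasNext()) {
            throw new IllegalArgumentException("missing operand");
        }
        String t = tokens.next();
        switch (t) {
            // Java evaluates operands left to right, so the first recursive
            // call always consumes the first argument.
            case "+": return next(tokens) + next(tokens);
            case "-": return next(tokens) - next(tokens);
            case "*": return next(tokens) * next(tokens);
            case "/": return next(tokens) / next(tokens);
            case "cos": return Math.cos(next(tokens));
            default: return Double.parseDouble(t);
        }
    }
}
```

With this sketch, PrefixEval.evaluate("+ cos 0 2") computes cos(0) + 2, while PrefixEval.evaluate("cos + 0 0") computes cos(0 + 0). No parentheses or precedence table are needed, because each operator knows how many arguments follow it.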
I wrote a symbolic logic manipulator long ago, and reading the strings is definitely hard. Here is what I would suggest for flow:
1. Look for binary operators that are outside of any parentheses. Start at the highest order of operations and work down.
2. When you find a binary operator, recursively call the stringtofunction on the arguments to either side of the binary operator. Anything between the binary operator you are looking at and any other binary operator outside parentheses, or between the binary operator and the ends of the string, counts as 1 object.
3. The return from step two goes into something like operator return1 return2 in Polish notation.
4. When the outermost sides of the string are parentheses, peel them off.
5. If you did not find any top-level binary operators, search for unary operators. Recursively call the argument of the unary operator and store it as operator return.
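A recursion-only evaluator along these lines could be sketched as follows. Note that this is a variation, not the exact flow above: it splits the string at the lowest-precedence top-level operator first, so that higher-precedence operators end up deeper in the recursion, and it computes the value directly instead of building Polish notation. The class name RecursiveEval is hypothetical, only +, -, *, /, unary minus and parentheses are handled, and the input is assumed to contain no whitespace.

```java
/** Sketch: evaluates +, -, *, /, unary minus and parentheses by recursion only. */
final class RecursiveEval {

    /** Evaluates an expression that contains no whitespace. */
    static double evaluate(String s) {
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty operand");
        }
        // Try the lowest precedence level first. Splitting at the LAST
        // top-level occurrence keeps the operators left-associative.
        for (String ops : new String[] {"+-", "*/"}) {
            int i = lastTopLevelOperator(s, ops);
            if (i > 0) {
                double left = evaluate(s.substring(0, i));
                double right = evaluate(s.substring(i + 1));
                switch (s.charAt(i)) {
                    case '+': return left + right;
                    case '-': return left - right;
                    case '*': return left * right;
                    default:  return left / right;
                }
            }
        }
        // No binary operator outside parentheses is left:
        // the string is a unary minus, a parenthesized group, or a number.
        if (s.charAt(0) == '-') {
            return -evaluate(s.substring(1));
        }
        if (s.charAt(0) == '(' && s.charAt(s.length() - 1) == ')') {
            return evaluate(s.substring(1, s.length() - 1));
        }
        return Double.parseDouble(s);
    }

    /** Index of the last binary operator from ops at nesting depth 0, or -1. */
    private static int lastTopLevelOperator(String s, String ops) {
        int depth = 0;
        for (int i = s.length() - 1; i > 0; i--) {
            char c = s.charAt(i);
            if (c == ')') {
                depth++;
            } else if (c == '(') {
                depth--;
            } else if (depth == 0 && ops.indexOf(c) >= 0 && endsOperand(s.charAt(i - 1))) {
                return i;
            }
        }
        return -1;
    }

    /** A + or - is binary only if the character before it ends an operand. */
    private static boolean endsOperand(char c) {
        return Character.isLetterOrDigit(c) || c == ')' || c == '.';
    }
}
```

For example, RecursiveEval.evaluate("(1+2)*3") finds no + or - outside the parentheses, splits at *, and then peels the parentheses off the left side. Malformed input such as unbalanced brackets is not diagnosed properly here; it typically surfaces as a NumberFormatException from Double.parseDouble.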
This question already has answers here:
Semicolon at end of 'if' statement
(18 answers)
Closed 5 years ago.
boolean r = false ; int s = 0 ;
while (r == false) ;
{
s = getInt() ;
if (!(s>=0 && s<=2)) System.out.println ("try again not a valid response") ;
else r = true ;
}
The text never displays itself even when a 3 or a 123 is entered, and the loop never terminates. What's wrong here?
You have a semicolon after the condition. When you use braces to specify a block for your while you don't use a semicolon.
Remove the ';' after while.
Others have pointed out the bug, but your code is scary in other ways that will eventually trip you up:
if (!(s>=0 && s<=2)) System.out.println ("try again not a valid response") ;
else r = true ;
That's bad because you can easily intend more than one statement to run in the case of the if or else clause. Use curly braces and avoid placing conditional statements on a single line:
if (!(s>=0 && s<=2))
{
System.out.println ("try again not a valid response");
}
else
{
r = true;
}
It's easier to read and far less likely to introduce hard-to-see bugs.
while(r == false)
should be
while(!r)
Despite what everyone else said about the semicolon, that is what I think is wrong with it :)
+1 to Daniel DiPaolo. I thought I'd post a separate answer to provide clarification of why this is the case.
While loops in Java can be written in one of two ways. If the body of the loop is a single statement, you can write it in a short-hand fashion:
while (true)
System.out.println("While loop");
This will print out "While loop" on the console until the program ends. The other option is to specify a loop body between braces, as you have done above:
int i = 0;
while (i < 10) {
System.out.println("i = " + i);
i++;
}
This will print out "i = 0", "i = 1", ..., "i = 9" each on a separate line.
What the code you posted does is confuse the two. In the short-hand form, the loop body is the single statement that follows the condition, and a lone semicolon is itself a statement: the empty statement. So the while loop runs, but does nothing; its body is empty. Furthermore, because the body is empty, there is no opportunity for your variable r to assume a new value; the condition always evaluates to true and the loop never exits.
If you were to negate the condition in the while loop in your example, i.e.,
boolean r = false ; int s = 0 ;
while (r != false) ;
{
s = getInt() ;
if (!(s>=0 && s<=2)) System.out.println ("try again not a valid response") ;
else r = true ;
}
(note I left the erroneous semicolon in there), you would find that your intended loop body would execute precisely once, as the loop would never run.
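A small sketch that makes this observable, assuming a hypothetical class EmptyWhileDemo that counts how often the braces after the loop header are entered:

```java
/** Sketch: shows that the braces after "while (...) ;" are not the loop body. */
final class EmptyWhileDemo {

    /** Returns how many times the block following the loop header runs. */
    static int runsOfBlock() {
        boolean r = false;
        int runs = 0;
        while (r != false) ;  // the lone ';' is the entire loop body
        {                     // an ordinary block, executed once after the loop
            runs++;
        }
        return runs;
    }
}
```

EmptyWhileDemo.runsOfBlock() returns 1: the loop condition is false immediately, the empty loop body never runs, and the block is then executed exactly once like any other statement.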
In addition to other comments, you should also change the if to
if (s < 0 || s > 2)
It's much more understandable this way.
This is an unrelated answer, but I really recommend that you follow Sun's style guidelines.
boolean r = false ;
int s = 0 ;
while (r == false) {
s = getInt() ;
if (!(s>=0 && s<=2)) {
System.out.println ("try again not a valid response") ;
} else {
r = true ;
}
}
You could get rid of the r variable and the if/else condition if you evaluate the result in the loop condition itself.
int s = 0;
while( ( s = getInt() ) < 0 || s > 2 ) {
System.out.println( "Try again, not a valid response");
}
As per the Java Language Specification:
Evaluation Respects Parentheses and Precedence
Aside from mathematical operations like:
int i = 3;
int j = 3 * (9 + 3);
System.out.println(j); //results to 36
are there any other examples where this rule applies? I tried using
int i = 0;
int z = 0;
if(i++ < 5 || (++z <0 && 5 > z++) || 6 < ++i){
System.out.println("Routed here");
}
System.out.println("i: " + i);
System.out.println("z: " + z);
but it results in i: 1 and z: 0. It seems that evaluation in that if example still goes from left to right.
With ||, Java uses short-circuiting while evaluating expressions. Therefore, in this:
if(i++ < 5 || (++z <0 && 5 > z++) || 6 < ++i){
since the very first expression, i++ < 5, returns true, the rest of the expression is not evaluated at all. Only the value of i changes, and nothing else.
Quote from Java Docs:
The Conditional Operators
The && and || operators perform Conditional-AND and Conditional-OR operations on two boolean expressions. These operators exhibit "short-circuiting" behavior, which means that the second operand is evaluated only if needed.
&& Conditional-AND
|| Conditional-OR
The following program, ConditionalDemo1, tests these operators:
class ConditionalDemo1 {
public static void main(String[] args){
int value1 = 1;
int value2 = 2;
if((value1 == 1) && (value2 == 2))
System.out.println("value1 is 1 AND value2 is 2");
if((value1 == 1) || (value2 == 1))
System.out.println("value1 is 1 OR value2 is 1");
}
}
Parentheses and precedence don't have anything to do with the order in which expressions are evaluated at run time. That is, if you think putting parentheses around an expression means that it will get evaluated earlier, or that an operator with higher precedence is evaluated earlier, you're misunderstanding the concept.
Operator precedence answers questions like this: In the expression
a * b + c
Which operator does the compiler "bind" to arguments first? If + had a higher precedence, it would grab the arguments near it before * could, so to speak; so that the result would be that b and c are added, and the result is multiplied by a. That is, it would be equivalent to
a * (b + c)
But in most programming languages (with some exceptions such as APL), the * has higher precedence. That means the arguments are bound to * before they're bound to +, which means that a and b are multiplied, and the result is added to c, i.e.
(a * b) + c
Similarly, in the expression
a + b * c
the result is that b and c are multiplied, and the result is added to a.
a + (b * c)
You can put parentheses around parts of the expressions to change how the arguments are bound; thus, if you want to add a and b and multiply the sum by c, you can say
(a + b) * c
But it's very important to note: All this controls how the expression is interpreted at compile time. But at runtime, the arguments are always evaluated left-to-right. When the program is run, ALL of the above expressions will cause the program to evaluate a, then b, then c. This doesn't matter if a, b, and c are variables, but if they were method calls, it could possibly matter. In Java, when the program is run, things are always evaluated from left to right. (This is not true of other languages; most languages that I know of let the compiler choose the order.)
And when it comes to || and && (or similar operators in some other languages), once again at run time, the left argument is always evaluated first. The right argument may or may not be evaluated. Parentheses and operator precedence control how the expression is interpreted if you have an expression like some-expression-1 || some-expression-2 && some-expression-3, but they do not change the order of evaluation at run time.
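To see the difference between binding and run-time order, one can replace the operands with method calls that record when they run. The sketch below is only an illustration; the class EvalOrderDemo and its helper methods are hypothetical.

```java
/** Sketch: records the order in which operands are evaluated at run time. */
final class EvalOrderDemo {
    private static final StringBuilder log = new StringBuilder();

    private static int operand(String name, int value) {
        log.append(name);   // note that this operand was evaluated
        return value;
    }

    private static boolean flag(String name, boolean value) {
        log.append(name);
        return value;
    }

    /** a + b * c: * binds tighter, yet a is still evaluated first. */
    static String arithmeticOrder() {
        log.setLength(0);
        int result = operand("a", 1) + operand("b", 2) * operand("c", 3);
        return log + "=" + result;
    }

    /** x || y && z: && binds tighter, yet x alone decides the result here. */
    static String shortCircuitOrder() {
        log.setLength(0);
        boolean result = flag("x", true) || flag("y", false) && flag("z", true);
        return log + "=" + result;
    }
}
```

arithmeticOrder() returns "abc=7": the multiplication is applied first, but the operands were evaluated in the order a, b, c. shortCircuitOrder() returns "x=true": y and z are never evaluated, even though && has higher precedence than ||.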
As soon as the outcome of the test is known to be true or false, evaluation stops and execution either enters or skips the if block. In your case, i equals 0, which is less than 5, so the test evaluates to true.
(true OR a OR b) is true independently of the values of a and b, which are not evaluated (and their increments are not applied).
It would be the same with (false AND a AND b) except it would always skip the block.
Check the order of your expressions, split them into separate tests, or move your increments out of the test. The following code is equivalent to your example but with the output you expected:
int i = 0;
int z = 1;
if(i < 5 || (z < 0 && 5 > z+2) || 6 < i+2){
System.out.println("Routed here");
}
i += 2;
z += 2;
System.out.println("i: " + i);
System.out.println("z: " + z);
I'm working on a project in Java that requires me to convert an infix expression to a postfix expression. I am currently able to convert infix expressions to postfix with this method as long as they don't contain parentheses, but I can't figure out how to handle parentheses.
Basically, I have two stacks that hold objects called 'Token'. A Token is a wrapper class that holds a string that is either a number, a variable (which gets evaluated as a number, depending on user input), an operator (the operator has a priority level associated with it so that my method can determine how to handle order of operations between '+', '-', '*' and '/'), or a parenthesis (the parenthesis has a way to determine if it is an open parenthesis or a closed parenthesis).
How should I handle parentheses? What about multiple layers of parentheses?
public String toPostFix() {
StringBuilder postfixstr = new StringBuilder();
Stack<Token> in_fix = new Stack<>();
Stack<Token> post_fix = new Stack<>();
for (int i = tokens.length - 1; i >= 0; i--) {
t = new Token(tokens[i]);
in_fix.push(t);
}
//there are still tokens to process
while (!in_fix.empty()) {
//is a number
if (in_fix.peek().type == 1) {
postfixstr.append(in_fix.pop().toString());
}
//is an operator and the stack is empty
else if (in_fix.peek().type == 3 && post_fix.empty()) {
post_fix.push(in_fix.pop());
}
// is an operator that has higher priority than the operator on the stack
else if (in_fix.peek().type == 3 && in_fix.peek().isOperator() > post_fix.peek().isOperator()) {
post_fix.push(in_fix.pop());
}
// is an operator that has lower priority than the operator on the stack
else if (in_fix.peek().type == 3 && in_fix.peek().isOperator() <= post_fix.peek().isOperator()) {
postfixstr.append(post_fix.pop());
post_fix.push(in_fix.pop());
}
//puts the rest of the stack onto the output string
if (in_fix.empty()) {
while (!post_fix.empty()) {
postfixstr.append(post_fix.pop());
}
}
}
return postfixstr.toString();
}
You need to push the left parenthesis onto the stack, and process the stack like so when you encounter a right parenthesis:
// opening (
if (in_fix.peek().type == 4) {
post_fix.push(in_fix.pop());
}
//closing )
if(in_fix.peek().type == 5){
while(!(post_fix.isEmpty() || post_fix.peek().type == 4)){
postfixstr.append(post_fix.pop());
}
if (post_fix.isEmpty())
; // ERROR - unmatched )
else
post_fix.pop(); // pop the (
in_fix.pop(); // pop the )
}
Try this way:
// opening parenthesis
if (in_fix.peek().type == 4) {
    post_fix.push(in_fix.pop());
}
// closing parenthesis
if (in_fix.peek().type == 5) {
    // Until an opening parenthesis is found on the stack, append operators
    // to the output string. The parentheses themselves are never appended.
    while (post_fix.peek().type != 4) {
        postfixstr.append(post_fix.pop());
    }
    // finally pop the left parenthesis from the post_fix stack
    post_fix.pop();
    // and discard the right parenthesis, otherwise it is processed again
    in_fix.pop();
}
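For reference, here is a self-contained sketch of the whole conversion with parentheses handled. It does not use the Token class from the question, since its definition is not shown: tokens are plain strings, the priorities are an assumption (1 for + and -, 2 for * and /), and the class name InfixToPostfix is made up for this example. It also uses ArrayDeque rather than Stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.StringJoiner;

/** Sketch: infix to postfix conversion for string tokens, including parentheses. */
final class InfixToPostfix {

    /** Assumed priorities; anything that is not an operator gets -1. */
    static int priority(String token) {
        switch (token) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            default: return -1;
        }
    }

    static String toPostfix(String... tokens) {
        StringJoiner out = new StringJoiner(" ");
        Deque<String> ops = new ArrayDeque<>();
        for (String t : tokens) {
            if (t.equals("(")) {
                ops.push(t);
            } else if (t.equals(")")) {
                // Flush operators back to the matching "(".
                while (!ops.isEmpty() && !ops.peek().equals("(")) {
                    out.add(ops.pop());
                }
                if (ops.isEmpty()) {
                    throw new IllegalArgumentException("unmatched )");
                }
                ops.pop(); // discard the "(" itself
            } else if (priority(t) > 0) {
                // "(" has priority -1, so this loop never pops past it.
                while (!ops.isEmpty() && priority(ops.peek()) >= priority(t)) {
                    out.add(ops.pop());
                }
                ops.push(t);
            } else {
                out.add(t); // number or variable
            }
        }
        while (!ops.isEmpty()) {
            String op = ops.pop();
            if (op.equals("(")) {
                throw new IllegalArgumentException("unmatched (");
            }
            out.add(op);
        }
        return out.toString();
    }
}
```

Nested parentheses need no special treatment: each ")" only unwinds the stack as far as the nearest "(". For example, toPostfix("(", "1", "+", "2", ")", "*", "3") yields "1 2 + 3 *".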
I'm trying to take a string that represents a full algebraic expression, such as x = 15 * 6 / 3, and tokenize it into its individual components. So the first would be x, then =, then 15, then *, 6, / and finally 3.
The problem I am having is actually parsing through the string and looking at the individual characters. I can't think of a way to do this without a massive number of if statements. Surely there has to be a better way than specifically defining each individual case and testing for it.
For each type of token, you'll want to figure out how to identify:
when you're starting to read a particular token
if you're continuing to read the same token, or if you've started a different one
Let's take your example: x=15*6/3. Let's assume that you cannot rely on there being spaces between tokens. (If you could, it would be trivial: a new token starts whenever you reach a space.)
You can break down the character types into letters, digits, and symbols. Let's call the token types Variable, Operator, and Number.
A letter indicates a Variable token has started. It continues until you read a non-letter.
A symbol indicates the start of an Operator token. I only see single symbols, but you can have groups of symbols correspond to different Operator tokens.
A digit indicates the start of a Number token. (Let's assume integers for now.) The Number token continues until you read a non-digit.
Basically, that's how a simple symbolic parser works. Now, if you add in negative numbers (where the '-' symbol can have multiple meanings), or parentheses, or function names (like sin(x)) then things get more complicated, but it amounts to the same set of rules, now just with more choices.
1. Create a regular expression for each possible element: integer, variable, operator, parentheses.
2. Combine them using the | regular expression operator into one big regular expression, with capture groups to identify which one matched.
3. In a loop, match the head of the remaining string and break off the matched part as a token. The type of the token depends on which sub-expression matched, as described in step 2.
or
Use a lexer library, such as the one in ANTLR or JavaCC.
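The regular-expression route might look roughly like this. It is a sketch under assumptions: the class name RegexTokenizer, the four group names, and the token syntax (unsigned integers, letter-only variables, the operators + - * / =) are all chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: tokenizer built from one combined regular expression with named groups. */
final class RegexTokenizer {
    private static final String[] TYPES = {"number", "variable", "operator", "paren"};

    // One alternative per token type; optional whitespace may precede each token.
    private static final Pattern TOKEN = Pattern.compile(
            "\\s*(?:(?<number>\\d+)|(?<variable>[A-Za-z]+)"
            + "|(?<operator>[-+*/=])|(?<paren>[()]))");

    /** Returns tokens formatted as "type:text". */
    static List<String> tokenize(String input) {
        String s = input.trim();
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            if (!m.lookingAt()) {   // must match at the head of the remaining input
                throw new IllegalArgumentException("unexpected character at index " + pos);
            }
            for (String type : TYPES) {
                String text = m.group(type);
                if (text != null) { // the group that matched decides the token type
                    tokens.add(type + ":" + text);
                    break;
                }
            }
            pos = m.end();
        }
        return tokens;
    }
}
```

Calling RegexTokenizer.tokenize("x = 15 * 6 / 3") gives the tokens variable:x, operator:=, number:15, operator:*, number:6, operator:/ and number:3, and the same input without spaces gives the same list. Supporting decimals or multi-character operators would mean extending the corresponding alternative in the pattern.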
This is from my early expression evaluator that takes an infix expression like yours and turns it into postfix to evaluate. There are methods that help the parser but I think they're pretty self documenting. Mine uses symbol tables to check tokens against. It also allows for user defined symbols and nested assignments and other things you may not need/want. But it shows how I handled your issue without using niceties like regex which would simplify this task tremendously. In addition everything shown is of my own implementation - stack and queue as well - everything. So if anything looks abnormal (unlike Java imps) that's because it is.
This section of code is important not to answer your immediate question but to show the necessary work to determine the type of token you're dealing with. In my case I had three different types of operators and two different types of operands. Based on either the known rules or rules I chose to enforce (when appropriate) it was easy to know when something was a number (starts with a number), variable/user symbol/math function (starts with a letter), or math operator (is: /,*,-,+) . Note that it only takes seeing the first char to know the correct extraction rules. From your example, if all your cases are as simple, you'd only have to handle two types, operator or operand. Nonetheless the same logic will apply.
protected Queue<Token> inToPostParse(String exp) {
// local vars
inputExp = exp;
offset = 0;
strLength = exp.length();
String tempHolder = "";
char c;
// the program runs in a loop so make sure you're dealing
// with an empty queue
q1.reset();
for (int i = offset; tempHolder != null && i < strLength; ++i) {
c = exp.charAt(i);
// Spaces are useless so skip them
if (c == ' ') { continue; }
// If c is a letter
if ((c >= 'A' && c <= 'Z')
|| (c >= 'a' && c <= 'z')) {
// Here we know it must be a user symbol possibly undefined
// at this point or an function like SIN, ABS, etc
// We extract, based on obvious rules, the op
tempHolder = extractPhrase(i); // Used to be append sequence
if (ut.isTrigOp(tempHolder) || ut.isAdditionalOp(tempHolder)) {
s1.push(new Operator(tempHolder, "Function"));
} else {
// If not some math function it is a user defined symbol
q1.insert(new Token(tempHolder, "User"));
}
i += tempHolder.length() - 1;
tempHolder = "";
// if c begins with a number
} else if (c >= '0' && c <= '9') {
try {
// Here we know that it must be a number
// so we extract until we reach a non number
tempHolder = extractNumber(i);
q1.insert(new Token(tempHolder, "Number"));
i += tempHolder.length() - 1;
tempHolder = "";
}
catch (NumberFormatException nfe) {
return null;
}
// if c is in the math symbol table
} else if (ut.isMathOp(String.valueOf(c))) {
String C = String.valueOf(c);
try {
// This is where the magic happens
// Here we determine the "intersection" of the
// current C and the top of the stack
// Based on the intersection we take action
// i.e., in math do you want to * or + first?
// Depending on the state you may have to move
// some tokens to the queue before pushing onto the stack
takeParseAction(C, ut.findIntersection
(C, s1.showTop().getSymbol()));
}
catch (NullPointerException npe) {
s1(C);
}
// it must be an invalid expression
} else {
return null;
}
}
u2();
s1.reset();
return q1;
}
Basically I have a stack (s1) and a queue (q1). All variables or numbers go into the queue. Any operators trig, math, parens, etc.. go on the stack. If the current token is to be put on the stack you have to check the state (top) to determine what parsing action to take (i.e., what to do based on math precedence). Sorry if this seems like useless information. I imagine if you're parsing a math expression it's because at some point you plan to evaluate it. IMHO, postfix is the easiest so I, regardless of input format, change it to post and evaluate with one method. If your O is different - do what you like.
Edit: Implementations
The extract phrase and number methods, which you may be most interested in, are as follows:
protected String extractPhrase(int it) {
String phrase = new String();
char c;
for ( ; it < inputExp.length(); ++it) {
c = inputExp.charAt(it);
if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
|| (c >= '0' && c <= '9')) {
phrase += String.valueOf(c);
} else {
break;
}
}
return phrase;
}
protected String extractNumber(int it) throws NumberFormatException {
String number = new String();
int decimals = 0;
char c;
for ( ; it < strLength; ++it) {
c = inputExp.charAt(it);
if (c >= '0' && c <= '9') {
number += String.valueOf(c);
} else if (c == '.') {
++decimals;
if (decimals < 2) {
number += ".";
} else {
throw new NumberFormatException();
}
} else {
break;
}
}
return number;
}
Remember - By the time they enter these methods I've already been able to deduce what type it is. This allows you to avoid the seemingly endless while-if-else chain.
Are components always separated by a space character, like in your question? If so, use algebricExpression.split(" ") to get a String[] of components.
If no such restriction can be assumed, a possible solution is to iterate over the input and switch on the Character.getType() of the character at the current index, something like this:
ArrayList<String> getExpressionComponents(String exp) {
ArrayList<String> components = new ArrayList<String>();
String current = "";
int currentSequenceType = Character.UNASSIGNED;
for (int i = 0 ; i < exp.length() ; i++) {
if (currentSequenceType != Character.getType(exp.charAt(i))) {
if (current.length() > 0) components.add(current);
current = "";
currentSequenceType = Character.getType(exp.charAt(i));
}
switch (Character.getType(exp.charAt(i))) {
case Character.DECIMAL_DIGIT_NUMBER:
case Character.MATH_SYMBOL:
case Character.START_PUNCTUATION:
case Character.END_PUNCTUATION:
case Character.LOWERCASE_LETTER:
case Character.UPPERCASE_LETTER:
// add other required types
current = current.concat(new String(new char[] {exp.charAt(i)}));
currentSequenceType = Character.getType(exp.charAt(i));
break;
default:
current = "";
currentSequenceType = Character.UNASSIGNED;
break;
}
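}
// flush the last component, which the loop never adds on its own
if (current.length() > 0) components.add(current);
return components;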
}
You can easily change the cases to meet with other requirements, such as split non-digit chars to separate components etc.
This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Can regular expressions be used to match nested patterns?
I am writing a regexp to check if the input string is a correct arithmetic expression. The problem is checking if there are enough opening and closing parentheses.
Expressions:
(1)
(((1)
((1))))
I think lookahead and lookbehind are useful here but for now I could check only one kind. I'm using Java, if it matters.
You shouldn't use a regular expression to do this. Instead you can iterate over the string character by character and keep track of the nesting level.
Initially the nesting is 0. When you see a ( increase the nesting by 1, and when you see ) decrease the nesting. The expression is correctly balanced if the final nesting is 0 and the nesting never goes below 0.
public static boolean checkParentheses(String s) {
int nesting = 0;
for (int i = 0; i < s.length(); ++i)
{
char c = s.charAt(i);
switch (c) {
case '(':
nesting++;
break;
case ')':
nesting--;
if (nesting < 0) {
return false;
}
break;
}
}
return nesting == 0;
}
You need to be using a parser to do this, not a regex. See this question.
Why not just count the opening and closing parens like so?
String expression = "((1+x) - 3 * 4(6*9(12+1)(4+(2*3+(4-4)))))";
int open = 0;
// walk the whole string, not just the first "open" characters
for (int x = 0; x < expression.length(); x++) {
    if (expression.charAt(x) == '(')
        open++;
    else if (expression.charAt(x) == ')')
        open--;
}
if (open != 0) {
    // Not a valid expression
}
Of course this only checks that you have the right amount - someone could write '))3*4((' and it would be validated using this method.