I'm working on a Java project that requires converting an infix expression to a postfix expression. The method below converts infix expressions to postfix as long as they don't contain parentheses, but I can't figure out how to handle parentheses.
Basically, I have two stacks that hold objects called 'Token'. A Token is a wrapper class that holds a string that is either a number, a variable (which gets evaluated as a number, depending on user input), an operator (each operator has a priority level so that my method can handle the order of operations between '+', '-', '*' and '/'), or a parenthesis (which knows whether it is an opening or a closing parenthesis).
How should I handle parentheses? What about multiple layers of parentheses?
public String toPostFix() {
StringBuilder postfixstr = new StringBuilder();
Stack<Token> in_fix = new Stack<>();
Stack<Token> post_fix = new Stack<>();
for (int i = tokens.length - 1; i >= 0; i--) {
Token t = new Token(tokens[i]);
in_fix.push(t);
}
//there are still tokens to process
while (!in_fix.empty()) {
//is a number
if (in_fix.peek().type == 1) {
postfixstr.append(in_fix.pop().toString());
}
//is an operator and the stack is empty
else if (in_fix.peek().type == 3 && post_fix.empty()) {
post_fix.push(in_fix.pop());
}
// is an operator that has higher priority than the operator on the stack
else if (in_fix.peek().type == 3 && in_fix.peek().isOperator() > post_fix.peek().isOperator()) {
post_fix.push(in_fix.pop());
}
// is an operator that has lower or equal priority than the operator on the stack
else if (in_fix.peek().type == 3 && in_fix.peek().isOperator() <= post_fix.peek().isOperator()) {
postfixstr.append(post_fix.pop());
post_fix.push(in_fix.pop());
}
//puts the rest of the stack onto the output string
if (in_fix.empty()) {
while (!post_fix.empty()) {
postfixstr.append(post_fix.pop());
}
}
}
return postfixstr.toString();
}
You need to push the left parenthesis onto the stack, and process the stack like so when you encounter a right parenthesis:
// opening (
if (in_fix.peek().type == 4) {
post_fix.push(in_fix.pop());
}
//closing ) ("else" so that peek() is not called again after the pop above)
else if (in_fix.peek().type == 5) {
while(!(post_fix.isEmpty() || post_fix.peek().type == 4)){
postfixstr.append(post_fix.pop());
}
if (post_fix.isEmpty())
; // ERROR - unmatched )
else
post_fix.pop(); // pop the (
in_fix.pop(); // pop the )
}
Try it this way:
//opening parenthesis
if (in_fix.peek().type == 4) {
post_fix.push(in_fix.pop());
}
//closing parenthesis
else if (in_fix.peek().type == 5) {
//Until an opening parenthesis is encountered on the stack, append operators to the postfix output.
while (post_fix.peek().type != 4) {
postfixstr.append(post_fix.pop());
}
//Finally pop the left parenthesis from the post_fix stack; it is not appended to the output.
post_fix.pop();
//Discard the closing parenthesis from the input as well, otherwise the loop never advances.
in_fix.pop();
}
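To show how the parenthesis branches fit together with the operator branches, here is a minimal self-contained sketch. It does not use the Token class from the question; it assumes the input is already split into plain strings, and the names InfixToPostfixSketch, priority and isOperator are made up for illustration. The point to notice is that a "(" on the operator stack acts as a barrier: the priority comparison must stop when it reaches one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class InfixToPostfixSketch {
    // Priority of a binary operator; a higher value binds tighter.
    static int priority(String op) {
        return (op.equals("*") || op.equals("/")) ? 2 : 1;
    }

    static boolean isOperator(String s) {
        return s.equals("+") || s.equals("-") || s.equals("*") || s.equals("/");
    }

    // tokens are assumed to be pre-split, e.g. {"(", "a", "+", "b", ")", "*", "c"}
    static String toPostfix(String[] tokens) {
        StringBuilder out = new StringBuilder();
        Deque<String> ops = new ArrayDeque<>();
        for (String t : tokens) {
            if (t.equals("(")) {
                ops.push(t);
            } else if (t.equals(")")) {
                // Flush operators back to the matching "(".
                while (!ops.isEmpty() && !ops.peek().equals("(")) {
                    out.append(ops.pop()).append(' ');
                }
                if (ops.isEmpty()) {
                    throw new IllegalArgumentException("unmatched )");
                }
                ops.pop(); // discard the "(" itself
            } else if (isOperator(t)) {
                // Pop operators of higher or equal priority, but never past a "(".
                while (!ops.isEmpty() && !ops.peek().equals("(")
                        && priority(ops.peek()) >= priority(t)) {
                    out.append(ops.pop()).append(' ');
                }
                ops.push(t);
            } else {
                out.append(t).append(' '); // number or variable
            }
        }
        while (!ops.isEmpty()) {
            String op = ops.pop();
            if (op.equals("(")) {
                throw new IllegalArgumentException("unmatched (");
            }
            out.append(op).append(' ');
        }
        return out.toString().trim();
    }
}
```

Nested parentheses need no extra handling: each ")" only unwinds to the nearest "(" on the stack.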
I was doing hackerrank and I am trying to understand the solution written by RodneyShag. (Credit: He wrote the solution, not me) I am trying to understand the last part.
import java.util.Scanner;
import java.util.HashMap;
import java.util.ArrayDeque;
class Solution {
public static void main(String[] args) {
/* Create HashMap to match opening brackets with closing brackets */
HashMap<Character, Character> map = new HashMap<>();
map.put('(', ')');
map.put('[', ']');
map.put('{', '}');
/* Test each expression for validity */
Scanner scan = new Scanner(System.in);
while (scan.hasNext()) {
String expression = scan.next();
System.out.println(isBalanced(expression, map) ? "true" : "false" );
}
scan.close();
}
private static boolean isBalanced(String expression, HashMap<Character, Character> map) {
if ((expression.length() % 2) != 0) {
return false; // odd length Strings are not balanced
}
ArrayDeque<Character> deque = new ArrayDeque<>(); // use deque as a stack
for (int i = 0; i < expression.length(); i++) {
Character ch = expression.charAt(i);
if (map.containsKey(ch)) {
deque.push(ch);
} else if (deque.isEmpty() || ch != map.get(deque.pop())) {
return false;
}
}
return deque.isEmpty();
}
}
The explanation (provided by him) is
Our map only has 3 keys: (, [, {. The line map.containsKey(ch) checks if it's one of the above keys, and if so, pushes it to the deque. The next part of
deque.isEmpty() || ch != map.get(deque.pop())
checks if we have a valid expression. At this point we know the character is not (, [, or {, so it must be a closing brace. If
1) our deque is empty, and we just read a closing brace, then we have an invalid expression (and return false)
2) the closing brace does not match the opening brace that we popped off the deque, then we have an invalid expression (and return false)
I understand that
Character ch = expression.charAt(i);
is supposed to check whether each character in the expression is equal to a Character in the map.
Why is it only ([{ in map? Isn't there ( ) [ ] { } in map?
The map is used to specify which character is the closing bracket for a given opening bracket. So when you write
map.get('(')
you get the character ) as defined with
map.put('(', ')');
at the initialization.
What the ch != map.get(deque.pop()) line is checking is if the character ch is an expected closing bracket based on the top value of the stack. To make it more clear/verbose the else if() part can be rewritten as this:
if (map.containsKey(ch)) {
deque.push(ch);
} else {
// 'ch' must be one of ), ] or } (or something completely different) at this point
if (deque.isEmpty()) {
// stack is empty, but shouldn't, so return false
return false;
}
Character lastOpeningBracket = deque.pop(); // is one of (, [ or {
Character expectedClosingBracket = map.get(lastOpeningBracket); // is one of ), ] or }
Character actualReadCharacter = ch; // only for verbosity
if (!expectedClosingBracket.equals(actualReadCharacter)) {
System.out.println("The character "+actualReadCharacter+" was read from\n"+
"the string, but it should be\n"+
"a "+expectedClosingBracket+" character\n"+
"because the last opening bracket was\n"+
"a "+lastOpeningBracket); // only for verbosity
return false;
}
}
(Be careful when comparing char and Character; see #matt's comment and What is the difference between == and equals() in Java?. Also check the Character cache, which might be used here.)
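One way to sidestep the == versus equals() question entirely is to compare primitive char values, so that no object identity is involved. The following is only a sketch of that idea, not the original author's solution; the class name BalancedBracketsSketch and the PAIRS constant are made up here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class BalancedBracketsSketch {
    // Maps each opening bracket (key) to its closing bracket (value).
    private static final Map<Character, Character> PAIRS = new HashMap<>();
    static {
        PAIRS.put('(', ')');
        PAIRS.put('[', ']');
        PAIRS.put('{', '}');
    }

    static boolean isBalanced(String expression) {
        Deque<Character> stack = new ArrayDeque<>(); // used as a stack
        for (int i = 0; i < expression.length(); i++) {
            char ch = expression.charAt(i); // primitive char, not Character
            if (PAIRS.containsKey(ch)) {
                stack.push(ch);
            } else {
                if (stack.isEmpty()) {
                    return false; // closing bracket with nothing open
                }
                // Unboxing to char makes != a value comparison.
                char expected = PAIRS.get(stack.pop());
                if (ch != expected) {
                    return false;
                }
            }
        }
        return stack.isEmpty();
    }
}
```

Because only keys of PAIRS are ever pushed, PAIRS.get(...) cannot return null at the unboxing step.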
I need to make a java program that evaluates an expression from an input file and returns the result in an output file. It needs to consider operator precedence, unary and binary operators, bracket matching, and has to rely on recursion only (no stacks or queues).
I've been thinking about this all night, and it frustrates me. I'm not asking for an entire java program written for me. I just need some guidance. I started by writing some pseudo-code, but I don't think it's any good:
Input: the text file to read each expression from.
Output: the text file that repeats each expression, as well as printing the result.
Algorithm SecondCalc()
{
input = “expressions.txt”;
output = “out.txt”;
if (input.currentLine has something)
{
line = input.currentLine;
output.write(line);
line = line.replace(“-space-”, “”);
evaluate(line);
//...to be continued
}
}
Algorithm evaluate(line)
{
for(i = 0 to line.length)
{
if(i == “(” or “)” ) exit loop;
if(i == “!”) exit loop;
if(i == “^”) exit loop;
if(i == “*” or “/” ) exit loop;
if(i == “+” or “-” ) exit loop;
if(i == “>” or “>=” or “<” or “<=” ) exit loop;
if(i == “==” or “!=” ) exit loop;
if(i == “$”) exit loop;
}
temp1 = line from index 0 to i;
temp2 = line from index i + 1 to line.length;
if(i == “!”) then evaluate(temp1!);
//...to be continued
}
Any tips would be appreciated. Thanks.
Well, the first thing I notice is that you say you want operator precedence, but your evaluate ignores it: it stops at whichever operator comes first, which treats all operators as if they had the same precedence. If your aim is indeed to simulate operator precedence (I assume the input is expected to look like Java's expressions), then I suggest you either process certain operators before others, or rearrange the input into another style such as Polish notation.
In both cases I would follow a similar process: instead of one if statement after another inside a single for loop, as you have now, try one for loop after another, where each loop looks for a specific operator and "does something".
for(i = 0 to line.length)
{
if(i == “(” or “)” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “!” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “^”) doSomething;
}
for(i = 0 to line.length)
{
if(i == “*” or “/” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “+” or “-” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “>” or “>=” or “<” or “<=” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “==” or “!=” )doSomething;
}
for(i = 0 to line.length)
{
if(i == “$”)doSomething;
}
}
There's much to improve on, but hopefully this points you in the right direction.
I would suggest reading up on Polish notation. It's a good way to store mathematical functions. For instance, + cos a b means cos(a) + b, whereas cos + a b means cos(a+b). There is no ambiguity. Additionally, terms to the right have precedence over terms to the left.
I wrote a symbolic logic manipulator long ago, and reading the strings is definitely hard. Here is what I would suggest for flow:
1. Look for binary operators that are outside of any parentheses. Start at the highest order of operations and work down.
2. When you find a binary operator, recursively call the stringtofunction on the arguments to either side of the binary operator. Anything between the binary operator you are looking at and any other binary operator outside parentheses, or between the binary operator and the ends of the string, counts as one object.
3. The return from step 2 goes into something like operator return1 return2 in Polish notation.
4. When the outermost characters of the string are parentheses, peel them off.
5. If you did not find any top-level binary operators, search for unary operators. Recursively call on the argument of the unary operator and store it as operator return.
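As a rough illustration of this flow, here is a small recursive evaluator that uses no stack or queue. It is a sketch under assumptions: it evaluates directly instead of building Polish notation, it only knows +, -, *, / and unary minus, it expects plain decimal numbers and no spaces, and the name RecursiveEvalSketch is invented. It splits at the lowest-precedence top-level operator first, so that operator is applied last.

```java
class RecursiveEvalSketch {
    static double eval(String s) {
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty (sub)expression");
        }
        // Lowest precedence first; scanning right to left keeps - and / left-associative.
        String[] levels = {"+-", "*/"};
        for (String ops : levels) {
            int depth = 0; // parenthesis nesting, counted from the right
            for (int i = s.length() - 1; i > 0; i--) {
                char c = s.charAt(i);
                if (c == ')') {
                    depth++;
                } else if (c == '(') {
                    depth--;
                } else if (depth == 0 && ops.indexOf(c) >= 0 && endsOperand(s.charAt(i - 1))) {
                    double left = eval(s.substring(0, i));
                    double right = eval(s.substring(i + 1));
                    switch (c) {
                        case '+': return left + right;
                        case '-': return left - right;
                        case '*': return left * right;
                        default:  return left / right;
                    }
                }
            }
        }
        // No top-level binary operator was found: try unary minus, then peel parentheses.
        if (s.charAt(0) == '-') {
            return -eval(s.substring(1));
        }
        if (s.charAt(0) == '(' && s.charAt(s.length() - 1) == ')') {
            return eval(s.substring(1, s.length() - 1));
        }
        return Double.parseDouble(s);
    }

    // A '-' is binary only if the character before it can end an operand.
    static boolean endsOperand(char c) {
        return Character.isDigit(c) || c == '.' || c == ')';
    }
}
```

Further precedence levels (comparison, equality, and so on) could be added as extra entries at the front of levels, as long as every operator in a level is a single character.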
I am writing a program that evaluates a LISP expression through iteration. LISP expressions are as follows:
Adding two and two would be written in LISP as: (+ 2 2). The LISP expression (* 5 4 3 2 1) would be evaluated to five factorial.
To do this, I am using a Stack of Queues of Doubles. A string is input to the evaluator; I take each item in the string and determine whether it is an operator or an operand. When I reach a '(' I need to push the current-level queue onto the stack and instantiate a new queue to continue evaluating. When I reach a ')' I need to take the operator from the current-level queue and then evaluate each operand within that queue until it is empty, at which point I offer the newly evaluated operand to the next queue in the stack (by popping it, offering the operand, and pushing it back on).
My issue seems to be arising when I reach ')' and try to evaluate a current level operand with the current operator. I have been trying:
operand = operator + opQueue.poll();
but this just adds the double value of the operator to the operand... :( I know I am missing something relatively basic here, but any advice or suggestions would be much appreciated. The full code is below. I believe the problem is towards the end just before main. I included all the code for attempted clarity.
import java.util.Queue;
import java.util.LinkedList;
import java.util.Stack;
public class IterativeEvaluator
{
private ExpressionScanner expression;
public IterativeEvaluator (String expression)
{
this.expression = new ExpressionScanner(expression);
}
public double evaluate(Queue<Double> operandQueue)
{
Stack<Queue<Double>> myStack = new Stack<Queue<Double>>();
char operator = ' ';
double operand = 0.0;
Queue<Double> opQueue = operandQueue;
// write your code here to evaluate the LISP expression iteratively
// you will need to use an explicit stack to push and pop context objects
while(expression.hasNextOperand() || expression.hasNextOperator())
{
if(expression.hasNextOperand())
{
operand = expression.nextOperand();
opQueue.offer((double)operand);
}
if(expression.hasNextOperator())
{
operator = expression.nextOperator();
if(operator == '(')
{
myStack.push(opQueue);
opQueue = new LinkedList<Double>();
}
if(operator != '(' && operator != ')')
opQueue.offer((double)operator);
if(operator == ')')
{
operator = ((char)(opQueue.remove().intValue()));
while(opQueue.peek() != null)
{
operand = operator + opQueue.poll();
}
opQueue = myStack.pop();
if(opQueue != null)
opQueue.offer(operand);
}
}
}
return operand;
}
public static void main(String [] args)
{
String s =
"(+\t(- 6)\n\t(/\t(+ 3)\n\t\t(- \t(+ 1 1)\n\t\t\t3\n\t\t\t1)\n\t\t(*))\n\t(* 2 3 4))"; // = 16.5
IterativeEvaluator myEvaluator = new IterativeEvaluator(s);
System.out.println("Evaluating LISP Expression:\n" + s);
System.out.println("Value is: " + myEvaluator.evaluate(null));
}
} /* 201340 */
Here is an improved version of your code with some remarks. I hope it helps. You only have to extend the switch if you want to support more operators.
public double evaluate(Queue<Double> operandQueue)
{
// from http://docs.oracle.com/javase/7/docs/api/java/util/ArrayDeque.html
// "This class is likely to be faster than Stack when used as a stack, ..."
// note: unlike Stack, ArrayDeque rejects null elements, so operandQueue must not be null
ArrayDeque<Queue<Double>> myStack = new ArrayDeque<>();
char operator; // don't pre-initialize with nonsense value
double operand = Double.NaN; // not used, NaN indicates if we have an error
Queue<Double> opQueue = operandQueue;
if(!expression.hasNextOperand() && !expression.hasNextOperator())
// carefully decide what to do if the entire expression is empty
throw new IllegalArgumentException("empty expression");
// write your code here to evaluate the LISP expression iteratively
// you will need to use an explicit stack to push and pop context objects
while(expression.hasNextOperand() || expression.hasNextOperator())
{
if(expression.hasNextOperand())
{
operand = expression.nextOperand();
opQueue.offer(operand); // cast unnecessary
}
else // expression.hasNextOperator() is implied here
{
operator = expression.nextOperator();
if(operator == '(')
{
myStack.push(opQueue);
opQueue = new LinkedList<Double>();
}
else if(operator != ')') // using <else> we know operator!='('
opQueue.offer((double)operator);
else // using <else> we know operator==')'
{
operator = ((char)(opQueue.remove().intValue()));
// get the first operand, using 0.0 here would be disastrous
// e.g. for multiplications
operand = opQueue.poll();
while(opQueue.peek() != null)
{
switch(operator)
{
case '+': operand += opQueue.poll(); break;
case '*': operand *= opQueue.poll(); break;
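// you get the idea ...
default: throw new IllegalArgumentException("unsupported operator " + operator); // without this the loop never ends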
}
}
opQueue = myStack.pop();
if(opQueue != null)
opQueue.offer(operand);
}
}
}
return operand;
}
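The switch above leaves '-' and '/' and the short argument lists to the reader. A possible helper that covers them is sketched below. The class and method names are invented, and the conventions are assumptions: (- x) negates and (*) is 1, which is what the expected result of 16.5 in the question implies, while (+) being 0 and (/ x) being 1/x follow the usual Lisp behaviour.

```java
import java.util.LinkedList;
import java.util.Queue;

class LispOperatorSketch {
    // Applies one operator to all operands of the current level, left to right.
    static double apply(char operator, Queue<Double> operands) {
        if (operands.isEmpty()) {
            // No operands: return the identity element where one exists.
            switch (operator) {
                case '+': return 0.0;
                case '*': return 1.0;
                default:
                    throw new IllegalArgumentException(operator + " needs at least one operand");
            }
        }
        double result = operands.poll();
        if (operands.isEmpty()) {
            // Exactly one operand: (- x) negates, (/ x) inverts.
            switch (operator) {
                case '+':
                case '*': return result;
                case '-': return -result;
                case '/': return 1.0 / result;
                default:
                    throw new IllegalArgumentException("unknown operator " + operator);
            }
        }
        while (!operands.isEmpty()) {
            double next = operands.poll();
            switch (operator) {
                case '+': result += next; break;
                case '-': result -= next; break;
                case '*': result *= next; break;
                case '/': result /= next; break;
                default:
                    throw new IllegalArgumentException("unknown operator " + operator);
            }
        }
        return result;
    }

    // Small convenience for building an operand queue in examples.
    static Queue<Double> queueOf(double... values) {
        Queue<Double> q = new LinkedList<>();
        for (double v : values) {
            q.offer(v);
        }
        return q;
    }
}
```

In the ')' branch, the body of the while loop could then be replaced by a single call such as operand = LispOperatorSketch.apply(operator, opQueue), made after the operator has been removed from the queue.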
I'm trying to take a string that represents a full algebraic expression, such as x = 15 * 6 / 3, and tokenize it into its individual components. So the first would be x, then =, then 15, then *, 6, / and finally 3.
The problem I am having is actually parsing through the string and looking at the individual characters. I can't think of a way to do this without a massive number of if statements. Surely there has to be a better way than specifically defining each individual case and testing for it.
For each type of token, you'll want to figure out how to identify:
when you're starting to read a particular token
if you're continuing to read the same token, or if you've started a different one
Let's take your example: x=15*6/3. If there were always spaces between tokens, it would be trivial: a new token starts whenever you reach a space. So let's assume that you cannot rely on spaces being there.
You can break down the character types into letters, digits, and symbols. Let's call the token types Variable, Operator, and Number.
A letter indicates a Variable token has started. It continues until you read a non-letter.
A symbol indicates the start of an Operator token. I only see single symbols, but you can have groups of symbols correspond to different Operator tokens.
A digit indicates the start of a Number token. (Let's assume integers for now.) The Number token continues until you read a non-digit.
Basically, that's how a simple symbolic parser works. Now, if you add in negative numbers (where the '-' symbol can have multiple meanings), or parentheses, or function names (like sin(x)) then things get more complicated, but it amounts to the same set of rules, now just with more choices.
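These rules can be sketched in a few lines. This is only an illustration under the assumptions stated above (letters-only variables, integer numbers, single-character operators); the class name CharClassTokenizerSketch is made up.

```java
import java.util.ArrayList;
import java.util.List;

class CharClassTokenizerSketch {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
                continue;
            }
            int start = i;
            if (Character.isLetter(c)) {
                // Variable: continues until a non-letter is read.
                while (i < input.length() && Character.isLetter(input.charAt(i))) {
                    i++;
                }
            } else if (Character.isDigit(c)) {
                // Number: continues until a non-digit is read.
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
            } else {
                i++; // Operator: a single symbol
            }
            tokens.add(input.substring(start, i));
        }
        return tokens;
    }
}
```

The first character of each token decides which rule applies, so there is one branch per token type rather than one per possible character.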
1. Create a regular expression for each possible element: integer, variable, operator, parentheses.
2. Combine them using the | regular expression operator into one big regular expression, with capture groups to identify which one matched.
3. In a loop, match the head of the remaining string and break off the matched part as a token. The type of the token depends on which sub-expression matched, as described in 2.
or
use a lexer library, such as the one in ANTLR or JavaCC.
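A minimal sketch of the regular expression variant, using java.util.regex with named capture groups. The group names, the class name RegexTokenizerSketch and the "type:text" output format are choices made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTokenizerSketch {
    // One alternative per token type; leading whitespace is skipped.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(?<number>\\d+)|(?<variable>[A-Za-z]+)|(?<operator>[-+*/=])|(?<paren>[()]))");

    static List<String> tokenize(String input) {
        String s = input.trim(); // so that trailing whitespace is not an error
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            if (!m.lookingAt()) { // match only at the head of the remaining input
                throw new IllegalArgumentException("unexpected input at index " + pos);
            }
            // The group that is non-null tells us which alternative matched.
            String type = m.group("number") != null ? "number"
                        : m.group("variable") != null ? "variable"
                        : m.group("operator") != null ? "operator"
                        : "paren";
            tokens.add(type + ":" + m.group(type));
            pos = m.end();
        }
        return tokens;
    }
}
```

Adding a token type means adding one more named alternative to the pattern and one more case to the type lookup.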
This is from my early expression evaluator that takes an infix expression like yours and turns it into postfix to evaluate. There are methods that help the parser but I think they're pretty self documenting. Mine uses symbol tables to check tokens against. It also allows for user defined symbols and nested assignments and other things you may not need/want. But it shows how I handled your issue without using niceties like regex which would simplify this task tremendously. In addition everything shown is of my own implementation - stack and queue as well - everything. So if anything looks abnormal (unlike Java imps) that's because it is.
This section of code is important not to answer your immediate question but to show the necessary work to determine the type of token you're dealing with. In my case I had three different types of operators and two different types of operands. Based on either the known rules or rules I chose to enforce (when appropriate) it was easy to know when something was a number (starts with a number), variable/user symbol/math function (starts with a letter), or math operator (is: /,*,-,+) . Note that it only takes seeing the first char to know the correct extraction rules. From your example, if all your cases are as simple, you'd only have to handle two types, operator or operand. Nonetheless the same logic will apply.
protected Queue<Token> inToPostParse(String exp) {
// local vars
inputExp = exp;
offset = 0;
strLength = exp.length();
String tempHolder = "";
char c;
// the program runs in a loop so make sure you're dealing
// with an empty queue
q1.reset();
for (int i = offset; tempHolder != null && i < strLength; ++i) {
c = exp.charAt(i);
// Spaces are useless so skip them
if (c == ' ') { continue; }
// If c is a letter
if ((c >= 'A' && c <= 'Z')
|| (c >= 'a' && c <= 'z')) {
// Here we know it must be a user symbol possibly undefined
// at this point or an function like SIN, ABS, etc
// We extract, based on obvious rules, the op
tempHolder = extractPhrase(i); // Used to be append sequence
if (ut.isTrigOp(tempHolder) || ut.isAdditionalOp(tempHolder)) {
s1.push(new Operator(tempHolder, "Function"));
} else {
// If not some math function it is a user defined symbol
q1.insert(new Token(tempHolder, "User"));
}
i += tempHolder.length() - 1;
tempHolder = "";
// if c begins with a number
} else if (c >= '0' && c <= '9') {
try {
// Here we know that it must be a number
// so we extract until we reach a non number
tempHolder = extractNumber(i);
q1.insert(new Token(tempHolder, "Number"));
i += tempHolder.length() - 1;
tempHolder = "";
}
catch (NumberFormatException nfe) {
return null;
}
// if c is in the math symbol table
} else if (ut.isMathOp(String.valueOf(c))) {
String C = String.valueOf(c);
try {
// This is where the magic happens
// Here we determine the "intersection" of the
// current C and the top of the stack
// Based on the intersection we take action
// i.e., in math do you want to * or + first?
// Depending on the state you may have to move
// some tokens to the queue before pushing onto the stack
takeParseAction(C, ut.findIntersection
(C, s1.showTop().getSymbol()));
}
catch (NullPointerException npe) {
s1(C);
}
// it must be an invalid expression
} else {
return null;
}
}
u2();
s1.reset();
return q1;
}
Basically I have a stack (s1) and a queue (q1). All variables or numbers go into the queue. Any operators trig, math, parens, etc.. go on the stack. If the current token is to be put on the stack you have to check the state (top) to determine what parsing action to take (i.e., what to do based on math precedence). Sorry if this seems like useless information. I imagine if you're parsing a math expression it's because at some point you plan to evaluate it. IMHO, postfix is the easiest so I, regardless of input format, change it to post and evaluate with one method. If your O is different - do what you like.
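To illustrate why postfix is easy to evaluate with one method, here is a small stand-alone sketch. It is not taken from the evaluator described above; it uses java.util.ArrayDeque rather than a custom stack, handles only the four basic operators, and the name PostfixEvalSketch is invented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixEvalSketch {
    // postfix holds tokens in postfix order, e.g. {"15", "6", "*", "3", "/"}
    static double evaluate(String[] postfix) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String t : postfix) {
            if (t.length() == 1 && "+-*/".indexOf(t.charAt(0)) >= 0) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("missing operand for " + t);
                }
                double right = stack.pop(); // pushed last, so it is the right operand
                double left = stack.pop();
                switch (t.charAt(0)) {
                    case '+': stack.push(left + right); break;
                    case '-': stack.push(left - right); break;
                    case '*': stack.push(left * right); break;
                    default:  stack.push(left / right); break;
                }
            } else {
                stack.push(Double.parseDouble(t)); // operand
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed postfix expression");
        }
        return stack.pop();
    }
}
```

No precedence rules or parentheses are needed at this stage, because the conversion to postfix has already encoded the evaluation order.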
Edit: Implementations
The extract phrase and number methods, which you may be most interested in, are as follows:
protected String extractPhrase(int it) {
String phrase = new String();
char c;
for ( ; it < inputExp.length(); ++it) {
c = inputExp.charAt(it);
if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
|| (c >= '0' && c <= '9')) {
phrase += String.valueOf(c);
} else {
break;
}
}
return phrase;
}
protected String extractNumber(int it) throws NumberFormatException {
String number = new String();
int decimals = 0;
char c;
for ( ; it < strLength; ++it) {
c = inputExp.charAt(it);
if (c >= '0' && c <= '9') {
number += String.valueOf(c);
} else if (c == '.') {
++decimals;
if (decimals < 2) {
number += ".";
} else {
throw new NumberFormatException();
}
} else {
break;
}
}
return number;
}
Remember - By the time they enter these methods I've already been able to deduce what type it is. This allows you to avoid the seemingly endless while-if-else chain.
Are components always separated by a space character, like in your question? If so, use algebricExpression.split(" ") to get a String[] of components.
If no such restriction can be assumed, a possible solution is to iterate over the input and switch on the Character.getType() of the character at the current index, something like this:
ArrayList<String> getExpressionComponents(String exp) {
ArrayList<String> components = new ArrayList<String>();
String current = "";
int currentSequenceType = Character.UNASSIGNED;
for (int i = 0 ; i < exp.length() ; i++) {
if (currentSequenceType != Character.getType(exp.charAt(i))) {
if (current.length() > 0) components.add(current);
current = "";
currentSequenceType = Character.getType(exp.charAt(i));
}
switch (Character.getType(exp.charAt(i))) {
case Character.DECIMAL_DIGIT_NUMBER:
case Character.MATH_SYMBOL:
case Character.START_PUNCTUATION:
case Character.END_PUNCTUATION:
case Character.LOWERCASE_LETTER:
case Character.UPPERCASE_LETTER:
case Character.DASH_PUNCTUATION: // '-'
case Character.OTHER_PUNCTUATION: // '*' and '/'
// add other required types
current = current.concat(new String(new char[] {exp.charAt(i)}));
currentSequenceType = Character.getType(exp.charAt(i));
break;
default:
current = "";
currentSequenceType = Character.UNASSIGNED;
break;
}
}
if (current.length() > 0) components.add(current); // flush the last component
return components;
}
You can easily change the cases to meet other requirements, such as splitting non-digit characters into separate components.
Possible Duplicate:
Can regular expressions be used to match nested patterns?
I am writing a regexp to check if the input string is a correct arithmetic expression. The problem is checking if there are enough opening and closing parentheses.
Expressions:
(1)
(((1)
((1))))
I think lookahead and lookbehind are useful here but for now I could check only one kind. I'm using Java, if it matters.
You shouldn't use a regular expression to do this. Instead you can iterate over the string character by character and keep track of the nesting level.
Initially the nesting is 0. When you see a ( increase the nesting by 1, and when you see ) decrease the nesting. The expression is correctly balanced if the final nesting is 0 and the nesting never goes below 0.
public static boolean checkParentheses(String s) {
int nesting = 0;
for (int i = 0; i < s.length(); ++i)
{
char c = s.charAt(i);
switch (c) {
case '(':
nesting++;
break;
case ')':
nesting--;
if (nesting < 0) {
return false;
}
break;
}
}
return nesting == 0;
}
You need to be using a parser to do this, not a regex. See this question.
Why not just count the opening and closing parens like so?
String expression = "((1+x) - 3 * 4(6*9(12+1)(4+(2*3+(4-4)))))";
int open = 0;
for (int x = 0; x < expression.length(); x++) {
if (expression.charAt(x) == '(')
open++;
else if (expression.charAt(x) == ')')
open--;
}
if (open != 0) {
// Not a valid expression
}
Of course this only checks that you have the right amount - someone could write '))3*4((' and it would be validated using this method.