This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Can regular expressions be used to match nested patterns?
I am writing a regexp to check if the input string is a correct arithmetic expression. The problem is checking if there are enough opening and closing parentheses.
Expressions:
(1)
(((1)
((1))))
I think lookahead and lookbehind are useful here but for now I could check only one kind. I'm using Java, if it matters.
You shouldn't use a regular expression to do this. Instead, you can iterate over the string character by character and keep track of the nesting level.
Initially the nesting is 0. When you see a ( increase the nesting by 1, and when you see ) decrease the nesting. The expression is correctly balanced if the final nesting is 0 and the nesting never goes below 0.
public static boolean checkParentheses(String s) {
int nesting = 0;
for (int i = 0; i < s.length(); ++i)
{
char c = s.charAt(i);
switch (c) {
case '(':
nesting++;
break;
case ')':
nesting--;
if (nesting < 0) {
return false;
}
break;
}
}
return nesting == 0;
}
You need to be using a parser to do this, not a regex. See this question.
Why not just count the opening and closing parens like so?
String expression = "((1+x) - 3 * 4(6*9(12+1)(4+(2*3+(4-4)))))";
int open = 0;
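// Walk the whole string, not just the first `open` characters
for (int x = 0; x < expression.length(); x++) {
    // Strings are not arrays in Java, so use charAt instead of []
    if (expression.charAt(x) == '(')
        open++;
    else if (expression.charAt(x) == ')')
        open--;
}
if (open != 0) {
    // Not a valid expression
}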
Of course, this only checks that the number of opening and closing parentheses is the same: someone could write '))3*4((' and it would still be accepted by this method.
I am getting the "Must be an array type but it resolved to string" error in my code. It also says that i (in the code below) cannot be resolved to a variable which I don't get.
public class DNAcgcount{
public double ratio(String dna){
int count=0;
for (int i=0;i<dna.length();i++);
if (dna[i]== "c"){
count+= 1;
if (dna[i]=="g"){
count+=1;
double answer = count/dna.length();
return answer;
}
}
}
}
Could you guys please help me figure out where the problem lies? I'm new to coding in Java so I am not entirely comfortable with the format yet.
Thanks a lot,
Junaid
You cannot access a String's character using subscript (dna[i]). Use charAt instead:
dna.charAt(i) == 'c'
Also, "c" is a String, 'c' is a char.
One more thing: integer division (e.g. int_a / int_b) results in an int, so you lose accuracy. Instead, cast one of the ints to double:
double answer = count/(double)dna.length();
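To make the difference concrete, here is a small sketch comparing the two divisions. The class and method names (DivisionDemo, truncatedRatio, exactRatio) are made up for this illustration and are not part of the original code.

```java
class DivisionDemo {
    // Integer division: the fractional part is discarded before the
    // result is widened to double, so 3 / 4 becomes 0 and then 0.0.
    static double truncatedRatio(int count, int length) {
        return count / length;
    }

    // Casting one operand to double makes the division happen in floating point.
    static double exactRatio(int count, int length) {
        return count / (double) length;
    }

    public static void main(String[] args) {
        System.out.println(truncatedRatio(3, 4)); // prints 0.0
        System.out.println(exactRatio(3, 4));     // prints 0.75
    }
}
```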
Use {} to define the scope of the loop. Also, as others already pointed out, use charAt instead of [] and use ' for characters, and use floating point division for the ratio.
for (int i = 0; i < dna.length(); i++) {
if (dna.charAt(i) == 'c') {
count += 1;
}
if (dna.charAt(i) == 'g') {
count += 1;
}
}
Or a bit shorter, use || to or the two clauses together
if (dna.charAt(i) == 'c' || dna.charAt(i) == 'g') {
count += 1;
}
I think you are currently having some trouble with braces. This is what I understood from your code, with corrections:
public class DNAcgcount{
public double ratio(String dna){
int count=0;
for (int i=0;i<dna.length();i++){
if (dna.charAt(i)== 'c')
count+= 1;
if (dna.charAt(i)=='g')
count+=1;
}
double answer = count/(double)dna.length();
return answer;
}
}
After an if, you have to close the braces once the body of the if is finished. I think you wanted count to be the number of times c or g is present in the dna.
You also made some other mistakes: you have to use 'c' and 'g' instead of "c" and "g" when comparing against .charAt(i), because charAt returns a char, and a char can only be compared with another char.
You may find these links useful:
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/if.html
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/for.html
You may also want to look at the other operations available on String, such as charAt.
It seems that you have a few problems with basic Java syntax, such as loops and if-else statements. Click here for a good tutorial on these.
You must correct your for-loop and your if-statement:
for(int i=0;i<dna.length();i++){
if(...){
...;
}
if(...){
...;
}
}
Now you won't get the "cannot be resolved to a variable" compile error.
The second thing is how you use your string. You have to access it like this:
for(int i=0;i<dna.length();i++){
if(dna.charAt(i) == 'c'){
count += 1;
}
if(dna.charAt(i) == 'g'){
count += 1;
}
}
Now all your errors should be eliminated.
Your problem is the syntax dna[i]: dna is a String, and you are accessing it with [] as if it were an array. Use dna.charAt(i) instead.
You are using String incorrectly. Instead of accessing it via [], use dna.charAt(i).
Although logically a string is an array of characters, in Java String is a class (which means it has attributes and methods) and not a typical array.
And if you want to compare a single character to another, enclose it in '' instead of "":
if (dna.charAt(i) == 'c')
.
.
There are two errors:
count should be a double, or should be cast to double: double answer = (double)count / dna.length();
and, as mentioned above, you should replace dna[i] with dna.charAt(i)
This question already has answers here:
Recursive expression evaluator using Java
(6 answers)
Closed 7 years ago.
I need to make a java program that evaluates an expression from an input file and returns the result in an output file. It needs to consider operator precedence, unary and binary operators, bracket matching, and has to rely on recursion only (no stacks or queues).
I've been thinking about this all night, and it frustrates me. I'm not asking for an entire java program written for me. I just need some guidance. I started by writing some pseudo-code, but I don't think it's any good:
Input: the text file to read each expression from.
Output: the text file that repeats each expression, as well as printing the result.
Algorithm SecondCalc()
{
input = “expressions.txt”;
output = “out.txt”;
if (input.currentLine has something)
{
line = input.currentLine;
output.write(line);
line = line.replace(“-space-”, “”);
evaluate(line);
//...to be continued
}
}
Algorithm evaluate(line)
{
for(i = 0 to line.length)
{
if(i == “(” or “)” ) exit loop;
if(i == “!”) exit loop;
if(i == “^”) exit loop;
if(i == “*” or “/” ) exit loop;
if(i == “+” or “-” ) exit loop;
if(i == “>” or “>=” or “<” or “<=” ) exit loop;
if(i == “==” or “!=” ) exit loop;
if(i == “$”) exit loop;
}
temp1 = line from index 0 to i;
temp2 = line from index i + 1 to line.length;
if(i == “!”) then evaulate(temp1!);
//...to be continued
}
Any tips would be appreciated. Thanks.
Well, the first thing I notice is that you say you want operator precedence, but your evaluate ignores it: it works first come, first served, which treats all operators as having the same precedence. If your aim is indeed to simulate operator precedence (I assume the input is expected to look like Java's expressions), then I suggest you either process certain operators before others, or rearrange the input into another style such as Polish notation.
In both cases, I would follow a similar process: instead of one if statement after another inside a single for loop, as you have now, try one for loop after another, where each for loop looks for a specific operator and "does something".
for(i = 0 to line.length)
{
if(i == “(” or “)” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “!” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “^”) doSomething;
}
for(i = 0 to line.length)
{
if(i == “*” or “/” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “+” or “-” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “>” or “>=” or “<” or “<=” ) doSomething;
}
for(i = 0 to line.length)
{
if(i == “==” or “!=” )doSomething;
}
for(i = 0 to line.length)
{
if(i == “$”)doSomething;
}
There's much to improve on, but hopefully this points you in the right direction.
I would suggest reading up on Polish notation. It's a good way to store mathematical expressions. For instance, + cos a b means cos(a) + b, whereas cos + a b means cos(a+b). There is no ambiguity. Additionally, terms to the right have precedence over terms to the left.
I wrote a symbolic logic manipulator long ago, and reading the strings is definitely hard. Here is what I would suggest for flow:
1. Look for binary operators that are outside of any parentheses. Start at the highest order of operations and work down.
2. When you find a binary operator, recursively call the stringtofunction on the arguments on either side of it. Anything between the binary operator you are looking at and any other binary operator outside parentheses, or between the binary operator and the ends of the string, counts as one object.
3. The return from step 2 goes into something like operator return1 return2 in Polish notation.
4. When the outermost characters of the string are a matching pair of parentheses, peel them off.
5. If you did not find any top-level binary operators, search for unary operators. Recursively call on the argument of the unary operator and store it as operator return.
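The flow above can be sketched roughly as follows. This is only an illustration under some assumptions of mine: it evaluates directly instead of building a Polish-notation string, it supports only +, -, *, / and unary minus on plain decimal numbers, and it splits at the lowest-precedence operator first so that the tighter-binding operators end up deeper in the recursion. The names SplitEvaluator, findTopLevelBinary, hasLeftOperand and isWrappedInParens are invented for the example. No stack or queue is used, only recursion.

```java
class SplitEvaluator {
    // Operator groups ordered from lowest to highest precedence (assumed set).
    private static final String[] LEVELS = { "+-", "*/" };

    static double evaluate(String expr) {
        String s = expr.trim();
        // Peel off one pair of parentheses that wraps the whole string.
        if (isWrappedInParens(s)) {
            return evaluate(s.substring(1, s.length() - 1));
        }
        // Split at a binary operator that lies outside all parentheses.
        for (String ops : LEVELS) {
            int i = findTopLevelBinary(s, ops);
            if (i >= 0) {
                double left = evaluate(s.substring(0, i));
                double right = evaluate(s.substring(i + 1));
                switch (s.charAt(i)) {
                    case '+': return left + right;
                    case '-': return left - right;
                    case '*': return left * right;
                    default:  return left / right;
                }
            }
        }
        // No binary operator found: try unary minus, then a plain number.
        if (s.startsWith("-")) {
            return -evaluate(s.substring(1));
        }
        return Double.parseDouble(s);
    }

    // Scans right to left so that operators of equal precedence associate to the left.
    private static int findTopLevelBinary(String s, String ops) {
        int depth = 0;
        for (int i = s.length() - 1; i >= 0; i--) {
            char c = s.charAt(i);
            if (c == ')') depth++;
            else if (c == '(') depth--;
            else if (depth == 0 && ops.indexOf(c) >= 0 && hasLeftOperand(s, i)) return i;
        }
        return -1;
    }

    // An operator with nothing (or another operator) before it is unary, not binary.
    private static boolean hasLeftOperand(String s, int i) {
        int j = i - 1;
        while (j >= 0 && s.charAt(j) == ' ') j--;
        return j >= 0 && "+-*/(".indexOf(s.charAt(j)) < 0;
    }

    // True when the first '(' is closed by the very last character.
    private static boolean isWrappedInParens(String s) {
        if (s.length() < 2 || s.charAt(0) != '(') return false;
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '(') depth++;
            else if (s.charAt(i) == ')') depth--;
            if (depth == 0) return i == s.length() - 1;
        }
        return false;
    }
}
```

For example, SplitEvaluator.evaluate("1 + 2 * 3") gives 7.0 and SplitEvaluator.evaluate("2-3-4") gives -5.0. Malformed input is not validated here beyond whatever Double.parseDouble rejects.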
I'm trying to take a string that represents a full algebraic expression, such as x = 15 * 6 / 3, and tokenize it into its individual components. So the first would be x, then =, then 15, then *, 6, / and finally 3.
The problem I am having is actually parsing through the string and looking at the individual characters. I can't think of a way to do this without a massive amount of if statements. Surely there has to be a better way than specifically defining each individual case and testing for it.
For each type of token, you'll want to figure out how to identify:
when you're starting to read a particular token
if you're continuing to read the same token, or if you've started a different one
Let's take your example: x=15*6/3. Let's assume that you cannot rely on there being spaces between tokens. (If you could, it would be trivial: a new token starts whenever you reach a space.)
You can break down the character types into letters, digits, and symbols. Let's call the token types Variable, Operator, and Number.
A letter indicates a Variable token has started. It continues until you read a non-letter.
A symbol indicates the start of an Operator token. I only see single symbols, but you can have groups of symbols correspond to different Operator tokens.
A digit indicates the start of a Number token. (Let's assume integers for now.) The Number token continues until you read a non-digit.
Basically, that's how a simple symbolic parser works. Now, if you add in negative numbers (where the '-' symbol can have multiple meanings), or parentheses, or function names (like sin(x)) then things get more complicated, but it amounts to the same set of rules, now just with more choices.
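As a rough sketch of those rules, the following tokenizer labels each token as Variable, Number, or Operator. The class name SimpleTokenizer and the "Type:text" string format are assumptions made for this example; a real implementation would more likely use a small Token class. It handles only letters, unsigned integers, and single-character symbols.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleTokenizer {
    // Returns tokens as "Type:text" strings (a format chosen only for this sketch).
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // spaces only separate tokens
                continue;
            }
            int start = i;
            if (Character.isLetter(c)) {
                // A letter starts a Variable, which continues until a non-letter.
                while (i < input.length() && Character.isLetter(input.charAt(i))) i++;
                tokens.add("Variable:" + input.substring(start, i));
            } else if (Character.isDigit(c)) {
                // A digit starts a Number, which continues until a non-digit.
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                tokens.add("Number:" + input.substring(start, i));
            } else {
                // Anything else is treated as a single-character Operator.
                i++;
                tokens.add("Operator:" + c);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x=15*6/3"));
    }
}
```

Because the first character of each token decides which branch is taken, there is one if per token type rather than one per possible character.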
1. Create a regular expression for each possible element: integer, variable, operator, parentheses.
2. Combine them using the | regular expression operator into one big regular expression with capture groups to identify which one matched.
3. In a loop, match the head of the remaining string and break off the matched part as a token. The type of the token depends on which sub-expression matched, as described in step 2.
or
Use a lexer library, such as the one in ANTLR or JavaCC.
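A minimal sketch of the regular-expression approach might look like this. The group names (number, variable, operator, paren), the operator set, and the class name RegexTokenizer are my own choices for the example. It relies on named capturing groups, which java.util.regex has supported since Java 7, and on Matcher.lookingAt() to match only at the head of the remaining input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTokenizer {
    // One alternative per token type; leading whitespace is skipped.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(?<number>\\d+)|(?<variable>[A-Za-z]+)|(?<operator>[-+*/=])|(?<paren>[()]))");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length()); // look only at the unread remainder
            if (!m.lookingAt()) {
                if (input.substring(pos).trim().isEmpty()) break; // trailing whitespace
                throw new IllegalArgumentException("Unexpected input near index " + pos);
            }
            // Exactly one named group is non-null; it tells us the token type.
            if (m.group("number") != null) tokens.add("number:" + m.group("number"));
            else if (m.group("variable") != null) tokens.add("variable:" + m.group("variable"));
            else if (m.group("operator") != null) tokens.add("operator:" + m.group("operator"));
            else tokens.add("paren:" + m.group("paren"));
            pos = m.end(); // break off the matched part
        }
        return tokens;
    }
}
```

Adding a token type then means adding one alternative to the pattern and one branch to the loop.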
This is from my early expression evaluator that takes an infix expression like yours and turns it into postfix to evaluate. There are methods that help the parser but I think they're pretty self documenting. Mine uses symbol tables to check tokens against. It also allows for user defined symbols and nested assignments and other things you may not need/want. But it shows how I handled your issue without using niceties like regex which would simplify this task tremendously. In addition everything shown is of my own implementation - stack and queue as well - everything. So if anything looks abnormal (unlike Java imps) that's because it is.
This section of code is important not to answer your immediate question but to show the necessary work to determine the type of token you're dealing with. In my case I had three different types of operators and two different types of operands. Based on either the known rules or rules I chose to enforce (when appropriate) it was easy to know when something was a number (starts with a number), variable/user symbol/math function (starts with a letter), or math operator (is: /,*,-,+) . Note that it only takes seeing the first char to know the correct extraction rules. From your example, if all your cases are as simple, you'd only have to handle two types, operator or operand. Nonetheless the same logic will apply.
protected Queue<Token> inToPostParse(String exp) {
// local vars
inputExp = exp;
offset = 0;
strLength = exp.length();
String tempHolder = "";
char c;
// the program runs in a loop so make sure you're dealing
// with an empty queue
q1.reset();
for (int i = offset; tempHolder != null && i < strLength; ++i) {
c = exp.charAt(i);
// Spaces are useless so skip them
if (c == ' ') { continue; }
// If c is a letter
if ((c >= 'A' && c <= 'Z')
|| (c >= 'a' && c <= 'z')) {
// Here we know it must be a user symbol possibly undefined
// at this point or an function like SIN, ABS, etc
// We extract, based on obvious rules, the op
tempHolder = extractPhrase(i); // Used to be append sequence
if (ut.isTrigOp(tempHolder) || ut.isAdditionalOp(tempHolder)) {
s1.push(new Operator(tempHolder, "Function"));
} else {
// If not some math function it is a user defined symbol
q1.insert(new Token(tempHolder, "User"));
}
i += tempHolder.length() - 1;
tempHolder = "";
// if c begins with a number
} else if (c >= '0' && c <= '9') {
try {
// Here we know that it must be a number
// so we extract until we reach a non number
tempHolder = extractNumber(i);
q1.insert(new Token(tempHolder, "Number"));
i += tempHolder.length() - 1;
tempHolder = "";
}
catch (NumberFormatException nfe) {
return null;
}
// if c is in the math symbol table
} else if (ut.isMathOp(String.valueOf(c))) {
String C = String.valueOf(c);
try {
// This is where the magic happens
// Here we determine the "intersection" of the
// current C and the top of the stack
// Based on the intersection we take action
// i.e., in math do you want to * or + first?
// Depending on the state you may have to move
// some tokens to the queue before pushing onto the stack
takeParseAction(C, ut.findIntersection
(C, s1.showTop().getSymbol()));
}
catch (NullPointerException npe) {
s1(C);
}
// it must be an invalid expression
} else {
return null;
}
}
u2();
s1.reset();
return q1;
}
Basically I have a stack (s1) and a queue (q1). All variables or numbers go into the queue. Any operators trig, math, parens, etc.. go on the stack. If the current token is to be put on the stack you have to check the state (top) to determine what parsing action to take (i.e., what to do based on math precedence). Sorry if this seems like useless information. I imagine if you're parsing a math expression it's because at some point you plan to evaluate it. IMHO, postfix is the easiest so I, regardless of input format, change it to post and evaluate with one method. If your O is different - do what you like.
Edit: Implementations
The extract phrase and number methods, which you may be most interested in, are as follows:
protected String extractPhrase(int it) {
String phrase = new String();
char c;
for ( ; it < inputExp.length(); ++it) {
c = inputExp.charAt(it);
if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
|| (c >= '0' && c <= '9')) {
phrase += String.valueOf(c);
} else {
break;
}
}
return phrase;
}
protected String extractNumber(int it) throws NumberFormatException {
String number = new String();
int decimals = 0;
char c;
for ( ; it < strLength; ++it) {
c = inputExp.charAt(it);
if (c >= '0' && c <= '9') {
number += String.valueOf(c);
} else if (c == '.') {
++decimals;
if (decimals < 2) {
number += ".";
} else {
throw new NumberFormatException();
}
} else {
break;
}
}
return number;
}
Remember - By the time they enter these methods I've already been able to deduce what type it is. This allows you to avoid the seemingly endless while-if-else chain.
Are components always separated by a space character, like in your question? If so, use algebricExpression.split(" ") to get a String[] of components.
If no such restriction can be assumed, a possible solution is to iterate over the input and switch on the Character.getType() of the current character, something like this:
ArrayList<String> getExpressionComponents(String exp) {
ArrayList<String> components = new ArrayList<String>();
String current = "";
int currentSequenceType = Character.UNASSIGNED;
for (int i = 0 ; i < exp.length() ; i++) {
if (currentSequenceType != Character.getType(exp.charAt(i))) {
if (current.length() > 0) components.add(current);
current = "";
currentSequenceType = Character.getType(exp.charAt(i));
}
switch (Character.getType(exp.charAt(i))) {
case Character.DECIMAL_DIGIT_NUMBER:
case Character.MATH_SYMBOL:
case Character.START_PUNCTUATION:
case Character.END_PUNCTUATION:
case Character.LOWERCASE_LETTER:
case Character.UPPERCASE_LETTER:
// add other required types
current = current.concat(new String(new char[] {exp.charAt(i)}));
currentSequenceType = Character.getType(exp.charAt(i));
break;
default:
current = "";
currentSequenceType = Character.UNASSIGNED;
break;
}
}
if (current.length() > 0) components.add(current); // flush the last component
return components;
}
You can easily change the cases to meet other requirements, such as splitting non-digit characters into separate components.
What I'm trying to do is read a line (string) and use it as a mathematical function to get (double) values or answers to it at different points (like a calculator basically)
I included a very simplistic code of what I'm trying to do just for the sake of being direct and straight forward:
double x, y, z;
String function;
x = 5;
y = 4;
function = "(x*y)+y";
z = Double.parseDouble(function);
/*
I want z to equal this
z = (x*y)+y;
*/
System.out.print("z= " + z);
Again, this is only a sample code to be clearer about my question. My question again is: how can I set z = function when z is a double and function is a string?
NOTE: I tried parse as you can see, but it didn't work. I also tried to read the string character by character, but it didn't work either because it added the value of the characters together.
I guess you are looking for a lexer and a parser.
These are basic components of every compiler or interpreter:
the lexer splits the input (your string) into tokens
the parser builds a tree that represents the syntactic structure of your tokens, which can then be interpreted semantically
This field is quite broad, and I suggest you start with something like ANTLR for Java: it is a parser generator that generates both the lexer and the parser from rules you specify in a grammar. There are many such tools; this is just the first that came to mind.
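For a grammar as small as the one in the question, a hand-written recursive-descent parser is also feasible. The sketch below is not ANTLR output and not the only way to do it; the class name ExpressionParser, the grammar in the comments, and the idea of passing variable values in a Map are all assumptions for illustration. It evaluates while parsing instead of building a tree, and it supports +, -, *, /, unary minus, parentheses, decimal numbers, and named variables.

```java
import java.util.HashMap;
import java.util.Map;

class ExpressionParser {
    private final String src;
    private final Map<String, Double> vars;
    private int pos = 0;

    private ExpressionParser(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    static double evaluate(String src, Map<String, Double> vars) {
        ExpressionParser p = new ExpressionParser(src, vars);
        double value = p.parseExpression();
        p.skipSpaces();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + p.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private double parseExpression() {
        double value = parseTerm();
        while (true) {
            if (accept('+')) value += parseTerm();
            else if (accept('-')) value -= parseTerm();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double parseTerm() {
        double value = parseFactor();
        while (true) {
            if (accept('*')) value *= parseFactor();
            else if (accept('/')) value /= parseFactor();
            else return value;
        }
    }

    // factor := '-' factor | '(' expression ')' | variable | number
    private double parseFactor() {
        if (accept('-')) return -parseFactor();
        if (accept('(')) {
            double value = parseExpression();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at index " + pos);
            return value;
        }
        int start = pos; // accept() has already skipped leading spaces
        if (pos < src.length() && Character.isLetter(src.charAt(pos))) {
            while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
            String name = src.substring(start, pos);
            Double value = vars.get(name);
            if (value == null) throw new IllegalArgumentException("Unknown variable: " + name);
            return value;
        }
        while (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected a number at index " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    // Consumes c (after skipping spaces) if it is the next character.
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        Map<String, Double> vars = new HashMap<>();
        vars.put("x", 5.0);
        vars.put("y", 4.0);
        System.out.println("z= " + evaluate("(x*y)+y", vars)); // prints z= 24.0
    }
}
```

Each grammar rule becomes one method, and operator precedence falls out of which method calls which: parseExpression calls parseTerm, which calls parseFactor.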
If you want to forget about all this theory, just embed something like JavaScript or Groovy in your Java program; they can interpret code supplied at runtime, so you can simply go that way.
Java does not have something like eval builtin. But you can use an expression language like spEL, mvel or Jexl for this.
Maybe this SO question can help you.
I suggest you have a look at Parboiled. Unlike nearly all other parser solutions for Java, you write your grammars... In Java.
What is more, among the Java examples, there are working calculators.
// Evaluates a postfix expression with single-digit operands, e.g. "23*4+"
// (postfix input is assumed here; the original snippet did not say so explicitly).
float eval(String exp)
{
    char[] a = exp.toCharArray();
    float[] buffer = new float[a.length]; // array used as a stack
    int k = 0; // index of the next free slot
    for (char c : a)
    {
        if (c >= '0' && c <= '9') // checking for digits
        {
            buffer[k++] = c - '0';
        }
        else if (c == '+' || c == '-' || c == '*' || c == '/') // checking for operators
        {
            float right = buffer[--k]; // pop the right operand
            float left = buffer[--k];  // pop the left operand
            float result;
            switch (c)
            {
                case '+': result = left + right; break;
                case '-': result = left - right; break;
                case '*': result = left * right; break;
                default:  result = left / right; break;
            }
            buffer[k++] = result; // push the result back
        }
    }
    return buffer[k - 1]; // the top of the stack is the final value
}
Use a method like this; it will help a lot. It is implemented using a stack data structure.
How can I replace:
((90+1)%(100-4)) + ((90+1)%(100-4/(6-4))) - (var1%(var2%var3(var4-var5)))
with
XYZ((90+1),(100-4)) + XYZ((90+1),100-4/(6-4)) - XYZ(var1,XYZ(var2,var3(var4-var5)))
with regex?
Thanks,
J
This doesn't really look like a very good job for a regex. It looks like you might want to write a quick recursive descent parser instead. If I understand you correctly, you want to replace the infix operator % with a function name XYZ?
So (expression % expression) becomes XYZ(expression, expression)
This looks like a good resource to study: http://www.cs.uky.edu/~lewis/essays/compilers/rec-des.html
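As a rough illustration of the recursive idea (not a full recursive descent parser), the sketch below rewrites each parenthesised group that contains a % at its own top level. It assumes every % sits directly inside a pair of parentheses and is the only operator at that level, which holds for the example in the question but not in general; a % outside all parentheses, or a second % in the same group, is left untouched. The class and helper names are invented for the example.

```java
class ModuloRewriter {
    // Rewrites "(A%B)" into "XYZ(A,B)", recursing into A, B and any nested groups.
    static String rewrite(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c != '(') {
                out.append(c); // ordinary characters are copied unchanged
                i++;
                continue;
            }
            int close = matchingParen(s, i);
            String inner = s.substring(i + 1, close);
            int pct = topLevelPercent(inner);
            if (pct >= 0) {
                // The group's own parentheses become the call's parentheses.
                out.append("XYZ(")
                   .append(rewrite(inner.substring(0, pct)))
                   .append(',')
                   .append(rewrite(inner.substring(pct + 1)))
                   .append(')');
            } else {
                out.append('(').append(rewrite(inner)).append(')');
            }
            i = close + 1;
        }
        return out.toString();
    }

    // Index of the ')' that closes the '(' at position open.
    private static int matchingParen(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            if (s.charAt(i) == '(') depth++;
            else if (s.charAt(i) == ')' && --depth == 0) return i;
        }
        throw new IllegalArgumentException("Unbalanced parentheses");
    }

    // Index of the first '%' that is not nested inside parentheses, or -1.
    private static int topLevelPercent(String s) {
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') depth++;
            else if (c == ')') depth--;
            else if (c == '%' && depth == 0) return i;
        }
        return -1;
    }
}
```

Applied to the input from the question, ModuloRewriter.rewrite returns XYZ((90+1),(100-4)) + XYZ((90+1),(100-4/(6-4))) - XYZ(var1,XYZ(var2,var3(var4-var5))), with the second call keeping the parentheses around its second argument.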
I don't know much about regex, but try looking at this, especially 9 and 10:
http://www.mkyong.com/regular-expressions/10-java-regular-expression-examples-you-should-know/
And of course:
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html
You could at least check them out until an in depth answer comes along.
See this code:
String input = "((90+1)%(100-4)) + ((90+1)%(100-4/(6-4))) - (var1%(var2%var3(var4-var5)))";
input = input.replaceAll("%", ",");
int level = 0;
List<Integer> targetStack = new ArrayList<Integer>();
List<Integer> splitIndices = new ArrayList<Integer>();
// add the index of last character as default checkpoint
splitIndices.add(input.length());
for (int i = input.length() - 1; i >= 0; i--) {
if (input.charAt(i) == ',') {
targetStack.add(level - 1);
} else if (input.charAt(i) == ')') {
level++;
}
else if (input.charAt(i) == '(') {
level--;
if (!targetStack.isEmpty() && level == targetStack.get(targetStack.size() - 1)) {
splitIndices.add(i);
}
}
}
Collections.reverse(splitIndices); // reversing the indices so that they are in increasing order
StringBuilder result = new StringBuilder();
for (int i = 1; i < splitIndices.size(); i++) {
result.append("XYZ");
result.append(input.substring(splitIndices.get(i - 1), splitIndices.get(i)));
}
System.out.println(result);
The output is as you expect it:
XYZ((90+1),(100-4)) + XYZ((90+1),(100-4/(6-4))) - XYZ(var1,XYZ(var2,var3(var4-var5)))
However, keep in mind that it is a bit hacky and might not work exactly as you expect. By the way, I had to change the output slightly by adding a couple of brackets: XYZ((90+1), ( 100-4/(6-4 ) )), because otherwise you were not following your own conventions. Hopefully this code helps you. For me it was a good exercise at least.
Would it satisfy your requirements to do the following:
Find ( at first position or preceded by space and replace it with XYZ(
Find % and replace it with ,
If those two instructions are sufficient and satisfactory, then you could transform the original string in three "moves":
Replace ^\( with XYZ(
Replace a space followed by \( with a space followed by XYZ(
Replace % with ,