I'm rather new to the community but I've seen some helpful posts on here so I thought I'd ask.
I've got a homework question that asks us to recursively check whether a given string is a valid prefix expression given by the two following rules (standard):
Variables (a-z) are prefix expressions
If O is a binary operator and F and E are prefix expressions, OFE
Now, I kind of get the evaluation and have looked at the prefix-to-infix algorithms, but I can't for the life of me figure out how to implement just the evaluation methods (as I only need to check if it's valid, so not +a-b for example).
I know most of the implementation for these problems is done using stacks but I don't see how I would do it recursively here... some help would be tremendously appreciated.
Think of it this way. (I'm not going to write the code, since that's what you need to learn).
You want to check if a certain string is a prefix expression, so you have a function:
boolean isPrefix(string)
Now, there's two way that string could be a prefix:
It's a character from a-z
It's in the format O(prefix)(prefix)
So first, you check if the string has a length of one and is between a-z, and if so, the answer is yes.
Next you can check if the string starts with an O. If it does, you need to test the rest of the string to see if it is composed of two prefix expressions (FE).
So you start iterating from 1 to length, and passing each substring (0->i, i->length) into isPrefix(). If both substrings are also valid prefix expressions, the answer is yes.
Otherwise, the answer is no.
That's pretty much it, but the implementation, however, is up to you.
I'm not sure I entirely understand the point of this, but I imagine you should have some method like checkPrefixIn(String s) that looks at only part of the given String, returns true if it is only a prefix, false if it is only an operator (or invalid character), or the return value of checkPrefixIn(partOfS), where partOfS is a substring of the input s
Related
The following piece of code checks for same variable portion /en(^$|.*) which is empty or any characters. So the expression should match /en AND /en/bla, /en/blue etc.
But the expression doesn't work when checking for just /en.
"/en".matches("/en(^$|.*)")
Is there a way to make this empty regex check (^$) perform with java?
edit
I mean: Is there a way to make this piece of code return true?
What you're currently doing is checking whether en is followed by the start of string then the end of string (which doesn't make sense, since the start of string needs to be first) or anything else. This should work:
"/en".matches("/en(|.*)")
Or just using ? (optional):
"/en".matches("/en(.*)?")
But it's rather pointless, since * is zero or more (so a blank string will match for .*), just this should do it:
"/en".matches("/en.*")
EDIT:
Your code was already returning true, but it was not matching the ^$ part, but rather .* (similar to the above).
I should point out that you may as well use startsWith, unless your real data is more complex:
"/en".startsWith("/en")
Is there a way to make this piece of code return true?
"/en".matches("/en(^$|.*)")
That code does return true. Just try it!
However, your pattern is unnecessarily complex. Try:
"/en".matches("/en.*")
This will match /en followed by anything (including nothing).
I am trying to solve a problem in which I have to solve a given expression consisting of one or more initialization in a same string with no operator precedence (although with bracketed sub-expressions). All the operators have right precedence so I have to evaluate it from right to left. I am confused how to proceed for the given problem. Detailed problem is given here : http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&problem=108
I'll give you some ideas to try:
First off, you need to recursively evaluate inside brackets. You want to do brackets from most nested to least nested, so use a regex that matches brackets with no ) inside of them. Substring the result of the computation into the part of the string the bracketed expression took up.
If there are no brackets, then now you need to evaluate operators. The reason why the question requires right precedence is to force you to think about how to answer it - you can't just read the string and do calculations. You have to consider the whole string THEN start doing calculations, which means storing some structure describing it. There's a number of strategies you could use to do this, for example:
-You could tokenize the string, either using a scanner or regexes - continually try to see if the next item in the string is a number or which of the operators it is, and push what kind of token it is and its value onto a list. Then, you can evaluate the list from right to left using some kind of case/switch structure to determine what to do for each operator (either that, or each operator is associated with what it does to numbers). = itself would address a map of variable name keys to values, and insert the value under that variable's key, and then return (to be placed into the list) the value it produced, so it can be used for another assignment. It also seems like - can be determined as to whether it's subtraction or a negative number by whether there's a space on its right or not.
-Instead of tokenization, you could use regexes on the string as a whole. But tokenization is more robust. I tried to build a calculator based on applying regexes to the whole string over and over but it's so difficult to get all the rules right and I don't recommend it.
I've written an expression evaluating calculator like this before, so you can ask me questions if you run into specific problems.
Pretty simple question and my brain is frozen today so I can't think of an elegant solution where I know one exists.
I have a formula which is passed to me in the form "A+B"
I also have a mapping of the formula variables to their "readable names".
Finally, I have a formula parser which will calculate the value of the formula, but only if its passed with the readable names for the variables.
For example, as an input I get
String formula = "A+B"
String readableA = "foovar1"
String readableB = "foovar2"
and I want my output to be "foovar1+foovar2"
The problem with a simple find and replace is that it can be easily be broken because we have no guarantees on what the 'readable' names are. Lets say I take my example again with different parameters
String formula = "A+B"
String readableA = "foovarBad1"
String readableB = "foovarAngry2"
If I do a simple find and replace in a loop, I'll end up replacing the capital A's and B's in the readable names I have already replaced.
This looks like an approximate solution but I don't have brackets around my variables
How to replace a set of tokens in a Java String?
That link you provided is an excellent source since matching using patterns is the way to go. The basic idea here is first get the tokens using a matcher. After this you will have Operators and Operands
Then, do the replacement individually on each Operand.
Finally, put them back together using the Operators.
A somewhat tedious solution would be to scan for all occurences of A and B and note their indexes in the string, and then use StringBuilder.replace(int start, int end, String str) method. (in naive form this would not be very efficient though, approaching smth like square complexity, or more precisely "number of variables" * "number of possible replacements")
If you know all of your operators, you could do split on them (like on "+") and then replace individual "A" and "B" (you'd have to do trimming whitespace chars first of course) in an array or ArrayList.
A simple way to do it is
String foumula = "A+B".replaceAll("\\bA\\b", readableA)
.replaceAll("\\bB\\b", readableB);
Your approach does not work fine that way
Formulas (mathematic Expressions) should be parsed into an expression structure (eg. expression tree).
Such that you have later Operand Nodes and Operator nodes.
Later this expression will be evaluated traversing the tree and considering the mathematical priority rules.
I recommend reading more on Expression parsing.
Matching Only
If you don't have to evaluate the expression after doing the substitution, you might be able to use a regex. Something like (\b\p{Alpha}\p{Alnum}*\b)
or the java string "(\\b\\p{Alpha}\\p{Alnum}*\\b)"
Then use find() over and over to find all the variables and store their locations.
Finally, go through the locations and build up a new string from the old one with the variable bits replaced.
Not that It will not do much checking that the supplied expression is reasonable. For example, it wouldn't mind at all if you gave it )A 2 B( and would just replace the A and B (like )XXX 2 XXX(). I don't know if that matters.
This is similar to the link you supplied in your question except you need a different regular expression than they used. You can go to http://www.regexplanet.com/advanced/java/index.html to play with regular expressions and figure out one that will work. I used it with the one I suggested and it finds what it needs in A+B and A + (C* D ) just fine.
Parsing
You parse the expression using one of the available parser generators (Antlr or Sable or ...) or find an algebraic expression parser available as open source and use it. (You would have to search the web to find those, I haven't used one but suspect they exist.)
Then you use the parser to generate a parsed form of the expression, replace the variables and reconstitute the string form with the new variables.
This one might work better but the amount of effort depends on whether you can find existing code to use.
It also depends on whether you need to validate the expression is valid according to the normal rules. This method will not accept invalid expressions, most likely.
I am using a regular expression for image file names.
The main reason why I'm using RegEx's is to prevent multiple files for the exact same purpose.
The syntax for the filenames can either be:
1) img_0F_16_-32_0.png
2) img_65_32_x.png
As you might have noticed, "img_" is the general prefix.
What follows is a two-digit hexadecimal number.
After another underscore comes an integer that has to be a power of two, somewhere between 1 through 512. Yet another underscore is next.
Okay so this far, my regular expression is working flawlessly.
The rest is what I'm having problems with:
Because what can follow is either a pair of integer coordinates (can be 0), separated by an underscore, or an x. After this comes the final ".png". Done.
Now the main problem I am having is that both variants have to be possible,
and also it is highly important that there may not be any duplicate coordinates.
Most importantly, integers, both positive and negative, may never start with one or more zeros!
This would produce duplications like:
401 = 00401
-10 = -0010
This is my first attempt:
img_[0-9a-fA-F]{2}_(1|2|4|8|16|32|64|128|256|512)_([-]?[1-9])?[0-9]*_([-]?[1-9])?[0-9]*[.]png
Thanks for your help in advance,
Tom S.
Why use regular expressions? Why not create a class that decomposes either variant of String to a canonical String, give the class a hashCode() and equals() method that uses this canonical String and then create a HashSet of these objects to make sure that only one of these types of files exist?
I am doing string manipulations and I need more advanced functions than the original ones provided in Java.
For example, I'd like to return a substring between the (n-1)th and nth occurrence of a character in a string.
My question is, are there classes already written by users which perform this function, and many others for string manipulations? Or should I dig on stackoverflow for each particular function I need?
Check out the Apache Commons class StringUtils, it has plenty of interesting ways to work with Strings.
http://commons.apache.org/lang/api-2.3/index.html?org/apache/commons/lang/StringUtils.html
Have you looked at the regular expression API? That's usually your best bet for doing complex things with strings:
http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
Along the lines of what you're looking to do, you can traverse the string against a pattern (in your case a single character) and match everything in the string up to but not including the next instance of the character as what is called a capture group.
It's been a while since I've written a regex, but if you were looking for the character A for instance, then I think you could use the regex A([^A]*) and keep matching that string. The stuff in the parenthesis is a capturing group, which I reference below. To match it, you'd use the matcher method on pattern:
http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#matcher%28java.lang.CharSequence%29
On the Matcher instance, you'd make sure that matches is true, and then keep calling find() and group(1) as needed, where group(1) would get you what is in between the parentheses. You could use a counter in your looping to make sure you get the n-1 instance of the letter.
Lastly, Pattern provides flags you can pass in to indicate things like case insensitivity, which you may need.
If I've made some mistakes here, then someone please correct me. Like I said, I don't write regexes every day, so I'm sure I'm a little bit off.