How to get the expression tree of a regex in Java? - java

I'm working on the conversion algorithm that obtains a DFA from a regular expression. This algorithm is limited to only three operators (*, |, .).
Those who do not know the meaning of each operator can check this.
The algorithm analyzes the nodes of a tree that is created from a regex.
I have attached an image showing a table with the functions nullable, first position and last position, which are applied to each node of the tree.
For example, take the regex (a|b)*a(a|b) and analyze it with the table.
The first step of the algorithm is to add the symbol # at the end, giving (a|b)*a(a|b)#, and to number each symbol:
Then the tree is constructed (my problem) and each node is analyzed with the above table, giving the following result. To the right of each node, in {}, is PmraPos (first position), and to the left, in {}, is UtmaPos (last position).
Problem: In Java I tried to build this tree with a Stack but did not get good results because, as you can see in the picture, not all nodes have two children. I would like help building the tree.
Note: To try to build the tree, I first converted the regular expression to its postfix form.
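One way the stack-based construction could handle nodes with a single child is to treat * as a unary operator that pops one operand, while | and . pop two. The following is only a minimal sketch under some assumptions: the postfix string uses an explicit . for concatenation, every other non-whitespace character is a leaf symbol, and the names RegexTree, Node and fromPostfix are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RegexTree {
    /** Syntax-tree node: a leaf has no children, '*' has one child, '|' and '.' have two. */
    static final class Node {
        final char label;
        final Node left;   // the only child of '*', or the left child of '|' and '.'
        final Node right;  // null for leaves and for '*'

        Node(char label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            if (left == null) return String.valueOf(label);  // leaf
            if (right == null) return left + "*";             // unary star
            return "(" + left + label + right + ")";          // binary operator
        }
    }

    /** Builds the tree from a postfix regex with explicit '.' concatenation. */
    static Node fromPostfix(String postfix) {
        Deque<Node> stack = new ArrayDeque<>();
        for (char c : postfix.toCharArray()) {
            if (Character.isWhitespace(c)) continue;
            if (c == '*') {
                // Unary operator: consumes exactly one operand.
                if (stack.isEmpty()) throw new IllegalArgumentException("'*' without operand");
                stack.push(new Node(c, stack.pop(), null));
            } else if (c == '|' || c == '.') {
                // Binary operator: the right operand is on top of the stack.
                if (stack.size() < 2) throw new IllegalArgumentException("'" + c + "' needs two operands");
                Node right = stack.pop();
                Node left = stack.pop();
                stack.push(new Node(c, left, right));
            } else {
                stack.push(new Node(c, null, null));  // leaf symbol, including '#'
            }
        }
        if (stack.size() != 1) throw new IllegalArgumentException("Malformed postfix expression");
        return stack.pop();
    }

    public static void main(String[] args) {
        // Assumed postfix form of (a|b)*.a.(a|b).#
        System.out.println(fromPostfix("ab|*a.ab|.#."));
    }
}
```

With the tree in this shape, the nullable, first-position and last-position functions can be computed in a post-order walk, with the * case looking only at the left child.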

Related

Creating a String expression when given an expression tree

I am having trouble creating a String expression when given an expression tree. If my expression tree looks like this (in the output console):
(*(+(5)(-(2)(3)))(6))
How do I create a method that goes through this to create an expression that is in normal format? For example, like this:
(2 - 3 + 5) * 6
Should I be working with the actual expression tree or with the String representation of the expression tree (shown above as (*(+(5)(-(2)(3)))(6)))?
You should use a prefix-to-infix conversion algorithm.
That is because your expression tree string is in prefix form and you want it in infix form.
You can remove all the parentheses in the input string. That way it will be easier.
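As a rough illustration of a prefix-to-infix conversion, the sketch below keeps the parentheses and uses them to find node boundaries, which is a different route from stripping them first. It assumes every operator node has exactly two children and every leaf is written as (value); the class name PrefixToInfix and the method convert are hypothetical. The output is fully parenthesized rather than minimally parenthesized.

```java
final class PrefixToInfix {
    private final String s;
    private int pos = 0;

    private PrefixToInfix(String s) {
        this.s = s;
    }

    /** Converts a string such as (*(+(5)(-(2)(3)))(6)) to fully parenthesized infix. */
    static String convert(String tree) {
        PrefixToInfix parser = new PrefixToInfix(tree.trim());
        String result = parser.node();
        if (parser.pos != parser.s.length()) {
            throw new IllegalArgumentException("Trailing input at index " + parser.pos);
        }
        return result;
    }

    // node := '(' label ')'            (leaf)
    //       | '(' label node node ')'  (binary operator)
    private String node() {
        expect('(');
        int start = pos;
        while (pos < s.length() && s.charAt(pos) != '(' && s.charAt(pos) != ')') {
            pos++;
        }
        String label = s.substring(start, pos).trim();
        if (label.isEmpty()) {
            throw new IllegalArgumentException("Missing label at index " + start);
        }
        if (pos < s.length() && s.charAt(pos) == ')') {
            pos++;            // leaf: nothing but the label inside the parentheses
            return label;
        }
        String left = node();
        String right = node();
        expect(')');
        return "(" + left + " " + label + " " + right + ")";
    }

    private void expect(char c) {
        if (pos >= s.length() || s.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at index " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(convert("(*(+(5)(-(2)(3)))(6))"));  // ((5 + (2 - 3)) * 6)
    }
}
```

Dropping redundant parentheses would need an extra step that compares the precedence of each operator with that of its parent.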
On that topic, I advise you to read these documents.
Shunting-yard algorithm: https://en.wikipedia.org/wiki/Shunting-yard_algorithm
This algorithm stacks 'tokens' according to their precedence; for example, an expression between parentheses is evaluated first. On precedence, read these:
https://en.wikipedia.org/wiki/Order_of_operations
http://introcs.cs.princeton.edu/java/11precedence/ (This one is specific for programming)
I hope I have helped.
Have a nice day. :)

building a tree from left parenthetic string representation

I need to build a right parenthetic representation of a string from its left parenthetic representation. Basically this means parsing a String input and later rebuilding a right parenthetic representation. I need to implement 2 methods: one that parses the input and one that creates the needed representation from that parsed input. This is part of a homework assignment I need to do in Java.
This is how I would test it:
String s = "A(B1,C)";
Node t = Node.parse (s);
String v = t.rightParentheticRepresentation();
System.out.println (s + " ==> " + v); // A(B1,C) ==> (B1,C)A
So I need to implement 2 methods: Node parse(String s) and String rightParentheticRepresentation()
In theory I have some idea of how to go about it, but I am struggling to implement the parsing method.
Are there any existing implementations I could use? Any hint on an implementation approach is very welcome, as is any good tutorial on building trees from a string representation.
First you should get an idea of the data structure that you want to build. Basically here you want a tree where each node corresponds to the content inside some pair of parentheses (the outermost parentheses are implicit - '(' A(B1,C) ')' - in your sample).
For the parsing method: read the input String char by char. Whenever you meet an opening parenthesis '(' you create a child of the current node and make the new node the current one, then start filling it. When you meet a closing parenthesis ')' you finalize the current node and go back to its parent.
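The char-by-char idea above could be sketched as follows. This is not a reference solution: it uses recursion, so the call stack plays the role of "going back to the parent", and it assumes node names contain no parentheses, commas or whitespace. Only the class name Node and the two method names come from the question; the fields and the helper parseNode are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class Node {
    private final String name;
    private final List<Node> children = new ArrayList<>();

    private Node(String name) {
        this.name = name;
    }

    /** Parses a left parenthetic representation such as A(B1,C). */
    static Node parse(String s) {
        int[] pos = {0};  // shared cursor into the input
        Node root = parseNode(s, pos);
        if (pos[0] != s.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + pos[0]);
        }
        return root;
    }

    // node := name [ '(' node { ',' node } ')' ]
    private static Node parseNode(String s, int[] pos) {
        int start = pos[0];
        while (pos[0] < s.length() && "(),".indexOf(s.charAt(pos[0])) < 0) {
            pos[0]++;
        }
        if (pos[0] == start) {
            throw new IllegalArgumentException("Missing node name at index " + start);
        }
        Node node = new Node(s.substring(start, pos[0]));
        if (pos[0] < s.length() && s.charAt(pos[0]) == '(') {
            do {
                pos[0]++;  // skip the '(' or the ',' before the next child
                node.children.add(parseNode(s, pos));
            } while (pos[0] < s.length() && s.charAt(pos[0]) == ',');
            if (pos[0] >= s.length() || s.charAt(pos[0]) != ')') {
                throw new IllegalArgumentException("Expected ')' at index " + pos[0]);
            }
            pos[0]++;  // skip ')': this node is finished, control returns to the parent
        }
        return node;
    }

    /** Children first, in parentheses, then the node's own name. */
    String rightParentheticRepresentation() {
        if (children.isEmpty()) return name;
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(children.get(i).rightParentheticRepresentation());
        }
        return sb.append(')').append(name).toString();
    }

    public static void main(String[] args) {
        String s = "A(B1,C)";
        System.out.println(s + " ==> " + Node.parse(s).rightParentheticRepresentation());
    }
}
```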

expression evaluation with right-associativity in java

I am trying to solve a problem in which I have to evaluate a given expression consisting of one or more initializations in the same string, with no operator precedence (although with bracketed sub-expressions). All the operators are right-associative, so I have to evaluate the expression from right to left. I am confused about how to proceed. The detailed problem is given here: http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&problem=108
I'll give you some ideas to try:
First off, you need to recursively evaluate inside brackets. You want to do brackets from most nested to least nested, so use a regex that matches brackets with no ) inside of them. Substring the result of the computation into the part of the string the bracketed expression took up.
If there are no brackets, then now you need to evaluate operators. The reason why the question requires right precedence is to force you to think about how to answer it - you can't just read the string and do calculations. You have to consider the whole string THEN start doing calculations, which means storing some structure describing it. There's a number of strategies you could use to do this, for example:
-You could tokenize the string, either using a scanner or regexes - continually try to see if the next item in the string is a number or which of the operators it is, and push what kind of token it is and its value onto a list. Then, you can evaluate the list from right to left using some kind of case/switch structure to determine what to do for each operator (either that, or each operator is associated with what it does to numbers). = itself would address a map of variable name keys to values, and insert the value under that variable's key, and then return (to be placed into the list) the value it produced, so it can be used for another assignment. It also seems like - can be determined as to whether it's subtraction or a negative number by whether there's a space on its right or not.
-Instead of tokenization, you could use regexes on the string as a whole. But tokenization is more robust. I tried to build a calculator based on applying regexes to the whole string over and over but it's so difficult to get all the rules right and I don't recommend it.
I've written an expression evaluating calculator like this before, so you can ask me questions if you run into specific problems.
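The tokenize-then-evaluate strategy might look roughly like the sketch below. It is a simplification, not a solution to the linked problem: it assumes tokens are separated by whitespace, that brackets have already been reduced to values as described above, and that the operators are +, -, *, / and =. The class name RightToLeftEvaluator and its methods are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class RightToLeftEvaluator {
    // Variable names mapped to their current values; filled in by '='.
    private final Map<String, Integer> variables = new HashMap<>();

    /** Evaluates "operand (operator operand)*" strictly from right to left. */
    int evaluate(String expression) {
        String[] tokens = expression.trim().split("\\s+");
        if (tokens.length % 2 == 0) {
            throw new IllegalArgumentException("Expected: operand (operator operand)*");
        }
        // Start with the rightmost operand, then fold leftwards two tokens at a time.
        int result = valueOf(tokens[tokens.length - 1]);
        for (int i = tokens.length - 2; i > 0; i -= 2) {
            String operator = tokens[i];
            String left = tokens[i - 1];
            if (operator.equals("=")) {
                // Assignment stores the value and yields it, so it can be chained.
                variables.put(left, result);
            } else {
                result = apply(operator, valueOf(left), result);
            }
        }
        return result;
    }

    private int valueOf(String token) {
        if (token.matches("-?\\d+")) return Integer.parseInt(token);
        Integer value = variables.get(token);
        if (value == null) throw new IllegalArgumentException("Unknown variable: " + token);
        return value;
    }

    private static int apply(String operator, int left, int right) {
        switch (operator) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right;
            default: throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    public static void main(String[] args) {
        RightToLeftEvaluator evaluator = new RightToLeftEvaluator();
        System.out.println(evaluator.evaluate("10 - 3 - 2"));  // 10 - (3 - 2) = 9
        System.out.println(evaluator.evaluate("A = B = 4"));   // 4, and both A and B are set
    }
}
```

Because tokens are split on whitespace here, -5 is read as a negative number and a lone - as subtraction, which matches the spacing rule mentioned above.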

Auto Suggest : Substring matching

I am trying to implement auto suggest using a ternary search tree (TST), but a TST is only useful for prefix searches. How can we implement auto suggest for substring matches as well?
Is there any other data structure which can be used?
Example of a substring match:
When I search for UML using auto suggest, even the string "Beginners Guide for UML" should match.
This is off the top of my head, not a proper, proven data structure or algorithm:
Select a mapping of all legal characters to N symbols (for simplicity: 26 symbols for Latin letters and similar non-Latin letters, ignoring case, plus 1 for non-letters, for a total of 27 symbols).
From your dictionary, create a shallow tree with a maximum branching factor of N (i.e. quite high). Leaf nodes would contain references to all words which contain the symbol combination leading from the root to that leaf (intermediate nodes might contain references to words which are shorter than the depth of a leaf node, or you could just ignore words which are that short).
The depth of the tree would be variable, probably in the range 1..4, so that each leaf node would contain about the same number of words (the same word is of course listed under many leaves, like MATCH under leaves MAT, ATC, TCH if the tree depth happened to be 3).
As the user enters letters, follow the tree as far as it goes, until you're left with a relatively small set of words. Once you're at a leaf of the tree and the user enters more text to match, do linear filtering on the remaining words. Optionally filter out symbol matches which aren't actually character matches, though it might be nice to also match äöO when the user enters ao0, etc.
Optimize the number of symbols you map your characters to, to get a good branching factor for the tree. Optimize the number of words per leaf to get decent memory usage without having too many words to filter linearly after reaching a leaf of the tree.
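A simplified variant of this scheme can be sketched with a hash map in place of the tree. The assumptions are significant: the depth is fixed at 3 instead of variable, characters are only lower-cased rather than mapped to 27 symbols, and queries shorter than 3 characters fall back to a linear scan. The names SubstringSuggester, add and suggest are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class SubstringSuggester {
    private static final int GRAM = 3;  // plays the role of the fixed tree depth

    // Each 3-character sequence maps to every phrase that contains it.
    private final Map<String, Set<String>> index = new HashMap<>();
    private final List<String> phrases = new ArrayList<>();

    void add(String phrase) {
        phrases.add(phrase);
        String key = phrase.toLowerCase(Locale.ROOT);
        for (int i = 0; i + GRAM <= key.length(); i++) {
            index.computeIfAbsent(key.substring(i, i + GRAM), k -> new LinkedHashSet<>())
                 .add(phrase);
        }
    }

    List<String> suggest(String typed) {
        String query = typed.toLowerCase(Locale.ROOT);
        Collection<String> candidates;
        if (query.length() < GRAM) {
            candidates = phrases;  // too short for the index: scan everything
        } else {
            // Narrow down with the first 3 characters of the query.
            candidates = index.getOrDefault(query.substring(0, GRAM), Collections.emptySet());
        }
        // Linear filtering on the remaining candidates.
        List<String> matches = new ArrayList<>();
        for (String phrase : candidates) {
            if (phrase.toLowerCase(Locale.ROOT).contains(query)) {
                matches.add(phrase);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        SubstringSuggester suggester = new SubstringSuggester();
        suggester.add("Beginners Guide for UML");
        suggester.add("Java Basics");
        System.out.println(suggester.suggest("UML"));  // [Beginners Guide for UML]
    }
}
```

Memory use grows with the number of distinct 3-character sequences, so the trade-off between index size and the length of the linear scan still has to be tuned as described above.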
Of course there are actual researched algorithms for finding a string (what user entered) in a large piece of text (all the phrases/words you want to match), like Aho–Corasick and Rabin–Karp, which are probably worth investigating.

Java Binary Search Tree Prefixes

I am writing a program in Java that takes words and definitions, places both the word and definition in a node object, and places that node in a binary search tree dictionary sorted lexicographically by word.
I am trying to create an option for the user to find all tree words that begin with a certain prefix of letters. For instance, given the input "ap", the program might return the words "appease", "apple", "apply", "apron", etc. However, I have no idea how to implement this. My binary search tree class has a find method and a traversal method (using iterators), but I don't know how to use those to search through the node objects, as the dictionary class (that stores the nodes in the tree) cannot handle anything like that. Does anyone have any ideas on how to tackle this?
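One possible direction, sketched under the assumption of a plain unbalanced binary search tree keyed by word: all words sharing a prefix are contiguous in lexicographic order, so an in-order traversal can skip any subtree that cannot contain a match. The classes below (PrefixBst, Entry) are hypothetical stand-ins, since the asker's node and dictionary classes are not shown.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixBst {
    private static final class Entry {
        final String word;
        final String definition;
        Entry left;
        Entry right;

        Entry(String word, String definition) {
            this.word = word;
            this.definition = definition;
        }
    }

    private Entry root;

    void put(String word, String definition) {
        root = insert(root, word, definition);
    }

    private static Entry insert(Entry node, String word, String definition) {
        if (node == null) return new Entry(word, definition);
        int cmp = word.compareTo(node.word);
        if (cmp < 0) node.left = insert(node.left, word, definition);
        else if (cmp > 0) node.right = insert(node.right, word, definition);
        return node;  // duplicate words are ignored in this sketch
    }

    /** Returns all stored words starting with the prefix, in sorted order. */
    List<String> wordsWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        collect(root, prefix, out);
        return out;
    }

    private static void collect(Entry node, String prefix, List<String> out) {
        if (node == null) return;
        if (node.word.startsWith(prefix)) {
            // Matches may lie on both sides; visit in order to keep results sorted.
            collect(node.left, prefix, out);
            out.add(node.word);
            collect(node.right, prefix, out);
        } else if (node.word.compareTo(prefix) < 0) {
            // This word and its whole left subtree sort before every match.
            collect(node.right, prefix, out);
        } else {
            // This word and its whole right subtree sort after every match.
            collect(node.left, prefix, out);
        }
    }

    public static void main(String[] args) {
        PrefixBst dictionary = new PrefixBst();
        for (String w : new String[] {"apple", "apron", "banana", "apply", "appease", "ant"}) {
            dictionary.put(w, "definition of " + w);
        }
        System.out.println(dictionary.wordsWithPrefix("ap"));  // [appease, apple, apply, apron]
    }
}
```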
