Searching approach to evaluate domain specific expression - java

I'm working on a tool, as part of a Java project, that evaluates a custom domain-specific, rule-like expression such as
min-5 avg datalist > Number
with the individual tokens meaning the following:
min-5 : optional minimum (or maximum, in that case max-5) occurrences of the following term
avg : an optional aggregation function which operates on the following token datalist (can also be sum or anything similar)
datalist : A list of data (type: integer/ double) which will be available before the evaluation of the entire expression starts, can be reduced to a single value by the preceding aggregation function
operator: conditional operator < or > or =
Number: value for the conditional operator
Note(s):
The optional number of occurrences and the aggregation cannot both appear in the same expression; that would make no sense.
There can be multiple of the above expressions, chained with and/or
These expressions are external input, not pre-defined
The evaluation of this expression should output a boolean
As I am rather new to expression evaluation / parsing I am searching for an elegant way to solve this, possibly with a java framework/tool.
What I've tried so far:
Parsing by hand which turned out not so nicely
Trying to use Janino Expression Evaluator, but I don't know how to handle this programmatically
I am looking for an elegant way to solve this and would be thankful for any suggestions.

What you are trying to build is a DSL (domain-specific language), and the elegant way to solve your problem is to write a grammar for your language and let a parser generator produce the parsing code from it.
Take a look at JavaCC or ANTLR.
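To make the grammar idea concrete, here is a minimal hand-written sketch of the kind of parser a generator such as JavaCC or ANTLR would produce from a grammar. It is not generated code and not the API of either tool. The class name RuleEvaluator, the grammar in the comments, the space-separated token format, and the left-to-right handling of and/or are all assumptions made for illustration. It also assumes every clause starts with min-N, max-N, avg or sum.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library: evaluates rules such as
// "min-2 datalist > 5 or avg datalist < 3".
final class RuleEvaluator {
    private final String[] tokens;
    private final Map<String, List<Double>> data;
    private int pos = 0;

    private RuleEvaluator(String rule, Map<String, List<Double>> data) {
        this.tokens = rule.trim().split("\\s+"); // assumes space-separated tokens
        this.data = data;
    }

    // rule := clause (("and" | "or") clause)*   (assumed: evaluated left to right)
    static boolean evaluate(String rule, Map<String, List<Double>> data) {
        RuleEvaluator parser = new RuleEvaluator(rule, data);
        boolean result = parser.clause();
        while (parser.pos < parser.tokens.length) {
            String connective = parser.next();
            boolean right = parser.clause();
            switch (connective) {
                case "and": result = result && right; break;
                case "or":  result = result || right; break;
                default: throw new IllegalArgumentException("Expected and/or, got: " + connective);
            }
        }
        return result;
    }

    // clause := ("min-N" | "max-N" | "avg" | "sum") listName ("<" | ">" | "=") number
    private boolean clause() {
        String head = next();
        List<Double> values = data.get(next());
        if (values == null) throw new IllegalArgumentException("Unknown data list before token " + pos);
        String op = next();
        double limit = Double.parseDouble(next());
        if (head.startsWith("min-") || head.startsWith("max-")) {
            // Count how many list elements satisfy the comparison.
            int bound = Integer.parseInt(head.substring(4));
            long hits = values.stream().filter(v -> compare(v, op, limit)).count();
            return head.startsWith("min-") ? hits >= bound : hits <= bound;
        }
        double sum = values.stream().mapToDouble(Double::doubleValue).sum();
        switch (head) {
            case "sum": return compare(sum, op, limit);
            case "avg": return compare(sum / values.size(), op, limit);
            default: throw new IllegalArgumentException("Expected min-N, max-N, avg or sum, got: " + head);
        }
    }

    private String next() {
        if (pos >= tokens.length) throw new IllegalArgumentException("Rule ended unexpectedly");
        return tokens[pos++];
    }

    private static boolean compare(double value, String op, double limit) {
        switch (op) {
            case "<": return value < limit;
            case ">": return value > limit;
            case "=": return value == limit;
            default: throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }
}
```

Calling RuleEvaluator.evaluate("min-2 datalist > 5", data) with a map that binds "datalist" to a list of doubles returns a boolean. A generated parser would add what this sketch lacks: operator precedence for and/or, parentheses, and error messages with positions.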

Related

How to store mathematical formula in MS SQL Server DB and interpret it using JAVA?

I have to give the user the option to enter in a text field a mathematical formula and then save it in the DB as a String. That is easy enough, but I also need to retrieve it and use it to do calculations.
For example, assume I allow someone to specify the formula of employee salary calculation which I must save in String format in the DB.
GROSS_PAY = BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL
Assume that terms such as GROSS_PAY and BASIC_SALARY are known to us and that we can work out what they evaluate to. The real issue is that we can't predict which combinations of such terms and mathematical operators the user will enter (not just +, -, ×, / but also the radical sign for roots, powers, and so on). So how do we interpret this formula in string format once we have retrieved it from the DB, so that we can do calculations based on the composition of the formula?
Building an expression evaluator is actually fairly easy.
See my SO answer on how to write a parser. With a BNF for exactly the range of expression operators and operands you want, you can follow this process to build a parser for those expressions directly in Java.
The answer links to a second answer that discusses how to evaluate the expression as you parse it.
So, you read the string from the database, collect the set of possible variables that can occur in the expression, and then parse/evaluate the string. If you don't know the variables in advance (seems like you must), you can parse the expression twice, the first time just to get the variable names.
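As a rough illustration of evaluating while parsing, the sketch below is a small recursive-descent evaluator. It is not the code from the linked answers. The class name FormulaEvaluator and the grammar in the comments are assumptions, it supports only +, -, *, /, unary minus and parentheses, and it evaluates only the right-hand side of a formula (the part after "GROSS_PAY =").

```java
import java.util.Map;

// Hypothetical sketch: evaluates an arithmetic formula while parsing it.
final class FormulaEvaluator {
    private final String src;
    private final Map<String, Double> vars;
    private int pos = 0;

    private FormulaEvaluator(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    static double evaluate(String formula, Map<String, Double> vars) {
        FormulaEvaluator parser = new FormulaEvaluator(formula, vars);
        double value = parser.expr();
        parser.skipSpaces();
        if (parser.pos < parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*
    private double expr() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := '-' factor | '(' expr ')' | number | variable
    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expr();
            if (!accept(')')) throw new IllegalArgumentException("Missing ')' at position " + pos);
            return value;
        }
        int start = pos; // accept() has already skipped leading spaces
        while (pos < src.length() && isOperandChar(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected operand at position " + pos);
        String token = src.substring(start, pos);
        if (Character.isDigit(token.charAt(0))) return Double.parseDouble(token);
        Double value = vars.get(token);
        if (value == null) throw new IllegalArgumentException("Unknown variable: " + token);
        return value;
    }

    private static boolean isOperandChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '.';
    }

    private boolean accept(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

With a map binding BASIC_SALARY, NO_PAY, TOTAL_OT and ALLOWANCE_TOTAL to numbers, FormulaEvaluator.evaluate("BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL", vars) returns the computed value. Roots and powers would need extra grammar rules at the factor level.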
As described in "Evaluating a math expression given in string form", Java has a JavaScript engine that can evaluate a String containing operators.
Hope this helps.
You could build a string representation of a class that effectively wraps your expression and compile it using the system JavaCompiler, although that requires a file system. Alternatively, you can evaluate strings directly using JavaScript or Groovy. In each case, you need to figure out a way to bind variables. One approach would be to use a regex to find known variable names and replace them with a call to a binding function:
getValue("BASIC_SALARY") - getValue("NO_PAY") + getValue("TOTAL_OT") + getValue("ALLOWANCE_TOTAL")
or
getBASIC_SALARY() - getNO_PAY() + getTOTAL_OT() + getALLOWANCE_TOTAL()
This approach, however, exposes you to all kinds of injection type security bugs; so, it would not be appropriate if security was required. The approach is also weak when it comes to error diagnostics. How will you tell the user why their expression is broken?
An alternative is to use something like ANTLR to generate a parser in Java. It's not too hard and there are a lot of examples. This approach will provide both security (users can't inject malicious code because it won't parse) and diagnostics.

Replacing Mathematical Operators with Variable Java

Currently trying to do a reverse polish calculator for one of my Uni homework tasks.
I have the program working fine when using a bunch of if/else statements to work out which operator was typed in and do the mathematical operation normally like num1+num2.
This is beyond the homework task we were set but what I'm trying to do now is replace the operator with a variable for example: "num1 + num2" would become "num1 variable num2" where variable equals "+" etc.
Is there a way to do this in Java?
Thanks in advance
Since you are interested in going beyond the scope of the training material, and assuming you've learned about interfaces already, I believe what you are looking for is a binary expression tree (the Wikipedia article explains it well).
Basically, you create an interface Expression with a double compute(Map<String, Double> variables) method. There will be two types of implementing classes:
Operand: Constant and Variable.
Operator: Plus, Minus, Multiply, Divide. Each will have two Expression fields: left and right.
Your text expression is then parsed into the expression tree:
// input: "num1 + num2 * 3"
// result expression tree will be built by parser using:
Expression a = new Variable("num1");
Expression b = new Variable("num2");
Expression c = new Constant(3);
Expression d = new Multiply(b, c);
Expression e = new Plus(a, d);
Map<String, Double> variables = /*defined elsewhere*/;
double result = e.compute(variables);
Your new assignment, should you choose to accept it, will be to write the expression classes and a parser to build the expression tree from a text expression.
Hope this will encourage you to go way beyond the training material, having some fun while playing.
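For readers who want a starting point, here is a minimal sketch of the expression classes described above. Only Constant, Variable, Plus and Multiply are shown; Minus and Divide would follow the same pattern. The field names and the exception thrown for unknown variables are assumptions, and the parser that builds the tree is left out.

```java
import java.util.Map;

// Every node in the tree can compute its own value.
interface Expression {
    double compute(Map<String, Double> variables);
}

// Operand: a fixed number.
final class Constant implements Expression {
    private final double value;
    Constant(double value) { this.value = value; }
    @Override public double compute(Map<String, Double> variables) { return value; }
}

// Operand: a named value looked up at evaluation time.
final class Variable implements Expression {
    private final String name;
    Variable(String name) { this.name = name; }
    @Override public double compute(Map<String, Double> variables) {
        Double value = variables.get(name);
        if (value == null) throw new IllegalArgumentException("Unknown variable: " + name);
        return value;
    }
}

// Operator: adds the results of its two sub-expressions.
final class Plus implements Expression {
    private final Expression left;
    private final Expression right;
    Plus(Expression left, Expression right) { this.left = left; this.right = right; }
    @Override public double compute(Map<String, Double> variables) {
        return left.compute(variables) + right.compute(variables);
    }
}

// Operator: multiplies the results of its two sub-expressions.
final class Multiply implements Expression {
    private final Expression left;
    private final Expression right;
    Multiply(Expression left, Expression right) { this.left = left; this.right = right; }
    @Override public double compute(Multiply this, Map<String, Double> variables) {
        return left.compute(variables) * right.compute(variables);
    }
}
```

Because each operator only holds two Expression references, operator precedence is decided once by the parser when it builds the tree, not by the classes themselves.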
First, you can use a switch on String rather than an if-then-else chain. Another way is to build a static final Map (e.g. a HashMap) from String to Function. The strings are the operators. The Functions do the operation. In Java 8 you can give the functions as lambdas. I only have access through a phone right now so can't show code. Your question will be much better received if you add code showing what you mean.
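The map-of-lambdas idea can be sketched as follows. This is one possible reading of the suggestion, not the answerer's code: the class name RpnCalculator is made up, DoubleBinaryOperator is used as the function type, and tokens are assumed to be separated by whitespace.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch of a reverse Polish calculator driven by an operator table.
final class RpnCalculator {
    // Maps each operator symbol to the function that performs it.
    private static final Map<String, DoubleBinaryOperator> OPERATORS = new HashMap<>();
    static {
        OPERATORS.put("+", (a, b) -> a + b);
        OPERATORS.put("-", (a, b) -> a - b);
        OPERATORS.put("*", (a, b) -> a * b);
        OPERATORS.put("/", (a, b) -> a / b);
    }

    static double evaluate(String rpn) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : rpn.trim().split("\\s+")) {
            DoubleBinaryOperator operator = OPERATORS.get(token);
            if (operator == null) {
                // Not an operator, so it must be a number.
                stack.push(Double.parseDouble(token));
            } else {
                if (stack.size() < 2) throw new IllegalArgumentException("Missing operand for " + token);
                double right = stack.pop(); // the most recent value is the right operand
                double left = stack.pop();
                stack.push(operator.applyAsDouble(left, right));
            }
        }
        if (stack.size() != 1) throw new IllegalArgumentException("Malformed expression: " + rpn);
        return stack.pop();
    }
}
```

Adding a new operator then means adding one line to the table instead of another if/else branch.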

How to use or implement arrays in XQuery?

Is there any built-in support for arrays in XQuery? For example, if we wanted to implement the simple Java program below in XQuery, how would we do it? (I am not asking to translate the entire program into XQuery, only how to implement the array in line number 2 of the code below. I am also using MarkLogic / xdmp functions.)
String test = "Hello XQuery";
char[] characters = test.toCharArray();
for (int i = 0; i < characters.length; i++) {
    // replace every carriage return (13) with the null character
    if (characters[i] == (char) 13) {
        characters[i] = (char) 0x00;
    }
}
Legend:
hex 0x00 dec 0 : null
hex 0x0d dec 13: carriage return
hex 0x0a dec 10: line feed
hex 0x22 dec 34: dquote
The problem with converting your sample code to XQuery is not the absence of support for arrays, but the fact that x00 is not a valid character in XML. If it weren't for this problem, you could express your query with the simple function call:
translate($input, '&#xD;', '&#x0;')
Now, you could argue that's cheating, it just happens so that there's a function that does exactly what you are trying to do by hand. But if this function didn't exist, you could program it in XQuery: there are sufficient primitives available for strings to allow you to manipulate them any way you want. If you need to (and it's rarely necessary) you can convert a string to a sequence of integers using the function string-to-codepoints(), and then take advantage of all the XQuery facilities for manipulating sequences.
The lesson is, when you use a declarative language like XQuery or XSLT, don't try to use the same low-level programming techniques you were forced to use in more primitive languages. There's usually a much more direct way of expressing the problem.
XQuery has built-in support for sequences. The function tokenize() (as suggested by @harish.ray) returns a sequence. You can also construct one yourself using parentheses and commas:
let $mysequence := (1, 2, 3, 4)
Sequences are ordered lists, so you can rely on that. That is slightly different from a node-set returned from an XPath; those are usually in document order.
On a side note: actually, everything in XQuery is either a node-set or a sequence. Even if a function is declared to return one string or int, you can treat that returned value as if it is a sequence of one item. No explicit casting is necessary, for which there are no constructs in XQuery anyhow. Functions like fn:exists() and fn:empty() always work.
HTH!
Just for fun, here's how I would do this in XQuery if fn:translate did not exist. I think Michael Kay's suggestion would end up looking similar.
let $test := "Hello XQuery"
return codepoints-to-string(
for $c in string-to-codepoints($test)
return if ($c eq 32) then 44 else $c)
Note that I changed the transformation because of the problem he pointed out: 0 is not a legal codepoint. So instead I translated spaces to commas.
With MarkLogic, another option is to use http://docs.marklogic.com/json:array and its associated functions. The json:set-item-at function would allow coding in a vaguely imperative style. Coding both variations might be a good learning exercise.
There are two ways to do this.
Firstly you can create an XmlResults object using
XmlManager.createResults(), and use XmlResults.add() to add your
strings to this. You can then use the XmlResults object to set the
value of a variable in XmlQueryContext, which can be used in your
query.
Example:
// manager is an existing XmlManager, context an existing XmlQueryContext
XmlResults values = manager.createResults();
values.add(new XmlValue("value1"));
values.add(new XmlValue("value2"));
context.setVariableValue("files", values);
The alternative is to split the string in XQuery. You
can do this using the tokenize() function, which works using a
regular expression to match the string separator.
http://www.w3.org/TR/xpath-functions/#func-tokenize
Thanks.
A little outlook: XQuery 3.1 will provide native support for arrays. See http://www.w3.org/TR/xquery-31/ for more details.
You can construct an array like this:
let $myArray := tokenize('a b c d e f g', '\s')
return $myArray[3] (: -> "c" :)
Please note that the first index of this pseudo-array is 1 not 0!
Since the question "How to use or implement arrays in XQuery?" is being held generic (and thus shows up in search results on this topic), I would like to add a generic answer for future reference (making it a Community Wiki, so others may expand):
As Christian Grün has already hinted at, with XQuery 3.1 XQuery got a native array datatype, which is a subtype of the function datatype.
Since an array is an 'ordered list of values' and an XPath/XQuery sequence is as well, the first question that may arise is: "What's the difference?" The answer is simple: a sequence cannot contain another sequence. All sequences are automatically flattened. Not so an array, which can be an array of arrays. Just like sequences, arrays in XQuery can also hold any mix of other datatypes.
The native XQuery array datatype can be expressed in either of two ways: as [] or via array {}. The difference is that, with the former constructor, a comma is treated as a 'hard' comma that separates members, meaning that the following array consists of two members:
[ ("apples", "oranges"), "plums" ]
while the following will consist of three members:
array { ("apples", "oranges"), "plums" }
which means, that the array expression within curly braces is resolved to a flat sequence first, and then memberized into an array.
Since Array is a subtype of function, an array can be thought of as an anonymous function, that takes a single parameter, the numeric index. To get the third member of an array, named $foo, we thus can write:
$foo(3)
If an array contains another array as a member you can chain the function calls together, as in:
$foo(3)(5)
Along with the array datatype, special operators have been added, which make it easy to look up the values of an array. One such operator (also used by the new Map datatype) is the question mark followed by an integer (or an expression that evaluates to zero or more integers).
$foo?(3)
would, again, return the third member within the array, while
$foo?(3, 6)
would return the members 3 and 6.
The parentheses can be left out when working with literal integers. However, they are needed to form the lookup index from a dynamic expression, as in:
$foo?(3 to 6)
here, the expression in the parens gets evaluated to a sequence of integers and thus the expression would return a sequence of all members from index position 3 to index position 6.
The asterisk * is used as wildcard operator. The expression
$foo?*
will return a sequence of all items in the array. Again, chaining is possible:
$foo?3?5
matches the previous example of $foo(3)(5).
More in-depth information can be found in the official spec: XML Path Language (XPath) 3.1 / 3.11.2 Arrays
Also, a new set of functions specific to arrays has been implemented. These functions reside in the namespace http://www.w3.org/2005/xpath-functions/array, which is conventionally bound to the prefix array. They are documented here: XPath and XQuery Functions and Operators 3.1 / 17.3 Functions that Operate on Arrays

Matrix expression parser/engine

I am looking for a matrix expression parser/engine. For example,
3 * A + B * C
where A, B, C are matrices is a typical expression. This should be similar to (single value) math expression parser/engine but should handle matrix value and variable. I've already googled in vain. I am also willing to modify existing math expression parser but I am not sure how I can go about it. So if you can give me any clue or hint, I will appreciate it.
See my answer on how to build simple parsers. This is especially suited for expression parsers.
It is pretty easy to modify such a parser to compute the answer as it parses. Just add an action routine whenever the parser recognizes syntax, to do what the syntax says.

Most elegant isNumeric() solution for java

I'm porting a small snippet of PHP code to java right now, and I was relying on the function is_numeric($x) to determine if $x is a number or not. There doesn't seem to be an equivalent function in java, and I'm not satisfied with the current solutions I've found so far.
I'm leaning toward the regular expression solution found here: http://rosettacode.org/wiki/Determine_if_a_string_is_numeric
Which method should I use and why?
Note that the PHP is_numeric() function will correctly determine that hex and scientific notation are numbers, which the regex approach you link to will not.
One option, especially if you are already using Apache Commons libraries, is to use NumberUtils.isNumber(), from Commons-Lang. It will handle the same cases that the PHP function will handle.
Have you looked into using StringUtils library?
There's a isNumeric() function which might be what you're looking for.
(Note that "" would be evaluated to true)
It's usually a bad idea to have a number in a String. If you want to use this number then parse it and use it as a numeric. You shouldn't need to "check" if it's a numeric, either you want to use it as a numeric or not.
If you need to convert it, then you can use any parser from Integer.parseInt(String) to new BigDecimal(String).
If you just need to check that the content can be seen as a numeric then you can get away with regular expressions.
And don't use the parseInt if your string can contain a float.
Optionally you can use a regular expression as well.
// true for an optionally signed integer or decimal number
return theString.matches("((-|\\+)?[0-9]+(\\.[0-9]+)?)+");
Did you try Integer.parseInt()? (I'm not sure of the method name, but the Integer class has a method that creates an Integer object from strings). Or if you need to handle non-integer numbers, similar methods are available for Double objects as well. If these fail, an exception is thrown.
If you need to parse very large numbers (larger than int/double), and don't need the exact value, then a simple regex based method might be sufficient.
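The parse-and-catch approach mentioned above could look like the sketch below. The class and method names are made up for illustration, and using Double.parseDouble is one choice among several; it accepts scientific notation such as "1e5", but also strings like "NaN" and type suffixes like "1d", which may or may not be what you want.

```java
// Hypothetical helper: treats a string as numeric if Double.parseDouble accepts it.
final class NumericCheck {
    static boolean isNumeric(String text) {
        if (text == null) {
            return false; // parseDouble would throw NullPointerException for null
        }
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            // Thrown for empty strings and anything that is not a valid double.
            return false;
        }
    }
}
```

If only integers should pass, Integer.parseInt or Long.parseLong can be swapped in, at the cost of rejecting values outside their range.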
In a strongly typed language, a generic isNumeric(String num) method is not very useful. 13214384348934918434441 is numeric, but won't fit in most types. Many of those where it does fit won't return the same value.
As Colin has noted, carrying numbers in Strings within the application is not recommended. The isNumeric function should only be applicable for input data on interface methods. These should have a more precise definition than isNumeric. Others have provided various solutions. Regular expressions can be used to test a number of conditions at once, including String length.
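The point about 13214384348934918434441 can be checked with a small sketch. The class and method names are assumptions; the idea is simply to ask, per target type, whether the value fits and whether it survives the conversion unchanged.

```java
import java.math.BigDecimal;

// Hypothetical helper: checks how a numeric string behaves for specific Java types.
final class NumericRange {
    // True if the string parses as a long, i.e. it is an integer within the long range.
    static boolean fitsInLong(String digits) {
        try {
            Long.parseLong(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if converting to double and back yields exactly the same value.
    static boolean exactAsDouble(String digits) {
        BigDecimal viaDouble = new BigDecimal(Double.parseDouble(digits));
        return viaDouble.compareTo(new BigDecimal(digits)) == 0;
    }
}
```

For the value above, fitsInLong returns false because it exceeds Long.MAX_VALUE, and exactAsDouble returns false because a double cannot represent all of its digits. Only BigInteger or BigDecimal keep it intact.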
Just use
if (x instanceof Number
        // also accept strings that look like a number
        || (x instanceof String && ((String) x).matches("((-|\\+)?[0-9]+(\\.[0-9]+)?)+"))) {
    ...
}
// All numeric types, including BigDecimal, extend Number
