Match on Options inside Tuple with Vavr - java

Using Vavr's types, I have created a pair of Somes:
var input = Tuple(Some(1), Some(2));
I'd like to get at the integers 1 and 2 using Vavr's match expression; this is how I currently do it:
import static io.vavr.API.*;
import static io.vavr.Patterns.$Some;
import static io.vavr.Patterns.$Tuple2;
var output = Match(input).of(
    Case($Tuple2($Some($()), $Some($())),
        (fst, snd) -> fst.get() + "/" + snd.get()),
    Case($(), "No match")
);
This works and returns "1/2" but has me worried since I call the unsafe get methods on the two Somes.
I'd rather have the match expression decompose input to the point where it extracts the innermost integers.
This note in Vavr's user guide makes me doubt whether that's possible:
⚡ A first prototype of Vavr’s Match API allowed to extract a user-defined selection of objects from a match pattern. Without proper compiler support this isn’t practicable because the number of generated methods exploded exponentially. The current API makes the compromise that all patterns are matched but only the root patterns are decomposed.
Yet I'm still curious whether there's a more elegant, type-safe way to decompose the nested value input.

I would use Tuple.apply(*) combined with API.For(*) in the following way:
var output = input.apply(API::For)
    .yield((i1, i2) -> i1 + "/" + i2)
    .getOrElse("No match");
(*): links are provided to the two argument overloads to conform to your example
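As far as I understand it, the For-comprehension above boils down to a flatMap over the first option followed by a map over the second. Here is a minimal sketch of that idea, using java.util.Optional in place of Vavr's Option so that it runs without any dependency (that swap is my assumption, and OptionalPair and join are made-up names):

```java
import java.util.Optional;

class OptionalPair {
    // Combines both values only if both are present; no unsafe get() involved.
    static String join(Optional<Integer> first, Optional<Integer> second) {
        return first
                .flatMap(a -> second.map(b -> a + "/" + b))
                .orElse("No match");
    }

    public static void main(String[] args) {
        System.out.println(join(Optional.of(1), Optional.of(2)));   // prints 1/2
        System.out.println(join(Optional.of(1), Optional.empty())); // prints No match
    }
}
```

The lambda only ever sees the unwrapped integers, which is the type-safe decomposition the question asks for.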


Replacing Mathematical Operators with Variable Java

Currently trying to do a reverse polish calculator for one of my Uni homework tasks.
I have the program working fine when using a bunch of if/else statements to work out which operator was typed in and do the mathematical operation normally like num1+num2.
This is beyond the homework task we were set but what I'm trying to do now is replace the operator with a variable for example: "num1 + num2" would become "num1 variable num2" where variable equals "+" etc.
Is there a way to do this in Java?
Thanks in advance
Since you are interested in going beyond the scope of the training material, and assuming you've learned about interfaces already, I believe what you are looking for is a binary expression tree (the Wikipedia article actually explains it well).
Basically, you create an interface Expression with a double compute(Map<String, Double> variables) method. There will be two types of implementing classes:
Operand: Constant and Variable.
Operator: Plus, Minus, Multiply, Divide. Each will have two Expression fields: left and right.
Your text expression is then parsed into the expression tree:
// input: "num1 + num2 * 3"
// result expression tree will be built by parser using:
Expression a = new Variable("num1");
Expression b = new Variable("num2");
Expression c = new Constant(3);
Expression d = new Multiply(b, c);
Expression e = new Plus(a, d);
Map<String, Double> variables = /*defined elsewhere*/;
double result = e.compute(variables);
Your new assignment, should you choose to accept it, will be to write the expression classes and a parser to build the expression tree from a text expression.
Hope this will encourage you to go way beyond the training material, having some fun while playing.
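For anyone who wants to check their own attempt, a minimal sketch of the expression classes could look like the following. It covers only Constant, Variable, Plus and Multiply, leaves out the parser entirely, and wraps everything in an ExpressionTree class that I made up so it fits into one file; the compute(Map) signature follows the call in the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

class ExpressionTree {
    interface Expression {
        double compute(Map<String, Double> variables);
    }

    // Operand: a fixed number.
    static final class Constant implements Expression {
        private final double value;
        Constant(double value) { this.value = value; }
        public double compute(Map<String, Double> variables) { return value; }
    }

    // Operand: a name that is looked up when the tree is evaluated.
    static final class Variable implements Expression {
        private final String name;
        Variable(String name) { this.name = name; }
        public double compute(Map<String, Double> variables) {
            Double value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Unknown variable: " + name);
            }
            return value;
        }
    }

    // Operator: evaluates both subtrees and adds the results.
    static final class Plus implements Expression {
        private final Expression left;
        private final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
        public double compute(Map<String, Double> variables) {
            return left.compute(variables) + right.compute(variables);
        }
    }

    // Operator: evaluates both subtrees and multiplies the results.
    static final class Multiply implements Expression {
        private final Expression left;
        private final Expression right;
        Multiply(Expression left, Expression right) { this.left = left; this.right = right; }
        public double compute(Map<String, Double> variables) {
            return left.compute(variables) * right.compute(variables);
        }
    }

    public static void main(String[] args) {
        // num1 + num2 * 3 with num1 = 2 and num2 = 4
        Expression e = new Plus(new Variable("num1"),
                new Multiply(new Variable("num2"), new Constant(3)));
        Map<String, Double> variables = new HashMap<>();
        variables.put("num1", 2.0);
        variables.put("num2", 4.0);
        System.out.println(e.compute(variables)); // prints 14.0
    }
}
```

Minus and Divide would follow the same pattern as Plus and Multiply.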
First, you can use a switch on String rather than an if-then-else chain. Another way is to build a static final Map (e.g. a HashMap) from String to Function. The strings are the operators; the Functions do the operation. In Java 8 you can give the functions as lambdas. I only have access through a phone right now so I can't show code. Your question will be much better received if you add code showing what you mean.
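To illustrate the Map idea, here is a rough sketch of a reverse Polish evaluator. It uses DoubleBinaryOperator rather than Function because each operator takes two operands; the class name RpnCalculator and the whitespace-separated input format are my assumptions, not something from the question.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

class RpnCalculator {
    // Operator symbol -> the operation it stands for.
    private static final Map<String, DoubleBinaryOperator> OPERATORS = new HashMap<>();
    static {
        OPERATORS.put("+", (a, b) -> a + b);
        OPERATORS.put("-", (a, b) -> a - b);
        OPERATORS.put("*", (a, b) -> a * b);
        OPERATORS.put("/", (a, b) -> a / b);
    }

    // Evaluates a whitespace-separated RPN expression such as "3 4 + 2 *".
    static double evaluate(String expression) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : expression.trim().split("\\s+")) {
            DoubleBinaryOperator op = OPERATORS.get(token);
            if (op == null) {
                // Not an operator, so it has to be a number.
                stack.push(Double.parseDouble(token));
            } else {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("Too few operands before '" + token + "'");
                }
                double right = stack.pop();
                double left = stack.pop();
                stack.push(op.applyAsDouble(left, right));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("Malformed expression: " + expression);
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("3 4 + 2 *")); // prints 14.0
    }
}
```

Adding a new operator then means adding one line to the map instead of another else-if branch.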

Finding duplicate expressions/parameters

I have structure as below
Parameter -> Condition -> Rule
Let's say I need to create a business rule, Customer Age > 18.
I have two parameters, Customer Age (P1) and 18 (P2), where P1 is a field parameter (OGNL) and P2 is a constant parameter with value 18.
So my condition is now Customer Age > 18, and so is my rule.
Problem statement: prevent users from creating duplicate parameters, conditions and rules.
Solution: constant parameters, field parameters etc. I can look up in the DB and compare to see whether they are already present.
Now for the conditions: for me,
Customer Age > 18 and 18 < Customer Age are the same in business terms.
The above cases can be more complex.
(a + b) * (c + d) is same as (b + a) * (d + c)
I need to validate the above expressions.
First approach: load all expressions from the DB (there can be tens of thousands) and compare them using a stack/tree structure, which would defeat my objective.
Second approach: I was thinking of building a powerful hash code generator, so to speak, that produces one int value for every expression (taking operators and brackets into account as well). This value should be generated in such a way that it validates the expressions above.
That means a + b and b + a should generate the same int value, while a - b and b - a should generate different values.
Maybe a simplified version of your first approach: What about filtering only the relevant expressions by looking for similar content as you are about to insert into the database?
If you know that you are about to insert Customer Age you can find all expressions containing this parameter and build the stack/tree based on this reduced set of expressions.
I think that you cannot avoid writing an expression parser, building an AST of the expressions and coding rewrite rules to detect expression equivalence.
It may not be as time consuming as you think.
For the parsing and AST building part, you can start from exp4j:
http://www.objecthunter.net/exp4j/
For the rewrite rules, you can have a look at: Strategies for simplifying math expressions
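To give an idea of what one such rewrite rule can look like, here is a minimal sketch that normalizes the operand order of commutative operators, so that a + b and b + a end up with the same canonical text (and hence the same hashCode). The CanonicalForm, Leaf and Op classes are made up for the illustration and are not part of exp4j; associativity and flipped comparisons such as 18 < Customer Age are not handled.

```java
class CanonicalForm {
    interface Node {
        String canonical();
    }

    // A variable or constant.
    static final class Leaf implements Node {
        private final String name;
        Leaf(String name) { this.name = name; }
        public String canonical() { return name; }
    }

    // A binary operator node.
    static final class Op implements Node {
        private final char symbol;
        private final Node left;
        private final Node right;
        Op(char symbol, Node left, Node right) {
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }
        public String canonical() {
            String l = left.canonical();
            String r = right.canonical();
            // For commutative operators, always put the smaller operand text first.
            boolean commutative = symbol == '+' || symbol == '*';
            if (commutative && l.compareTo(r) > 0) {
                String tmp = l;
                l = r;
                r = tmp;
            }
            return "(" + l + " " + symbol + " " + r + ")";
        }
    }

    public static void main(String[] args) {
        Node x = new Op('*', new Op('+', new Leaf("a"), new Leaf("b")),
                             new Op('+', new Leaf("c"), new Leaf("d")));
        Node y = new Op('*', new Op('+', new Leaf("b"), new Leaf("a")),
                             new Op('+', new Leaf("d"), new Leaf("c")));
        System.out.println(x.canonical().equals(y.canonical())); // prints true
    }
}
```

The canonical string (or its hash) could be stored next to each expression in the DB, so that a duplicate check becomes a simple lookup.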
For a 100% safe solution you should analyze the expressions with a computer algebra system to see whether they are mathematically equal. But that's not so easy.
A pragmatic approach is to test whether two expressions are similar:
Check whether they have the same variables
Compare their outputs for a number of different inputs, see if the outputs are equal
You can store the variable list and the outputs for a predefined set of inputs as a "hash" for the expression. This hash does not guarantee that two expressions are equal, but you could present expressions with the same hash to the user and ask whether the new rule is equal to one of these similar ones.
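A minimal sketch of this "hash by sampling" idea, restricted to expressions over two variables a and b to keep it short. The sample points and the ExpressionFingerprint name are arbitrary choices of mine:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleBinaryOperator;

class ExpressionFingerprint {
    // Hypothetical fixed sample points (a, b); any reasonably varied set would do.
    private static final double[][] SAMPLES = {
        {1.5, 2.25}, {-3.0, 7.0}, {0.5, -4.0}, {10.0, 0.125}
    };

    // Evaluates the expression at every sample point and returns the outputs.
    static List<Double> fingerprint(DoubleBinaryOperator expression) {
        List<Double> outputs = new ArrayList<>();
        for (double[] sample : SAMPLES) {
            outputs.add(expression.applyAsDouble(sample[0], sample[1]));
        }
        return outputs;
    }

    public static void main(String[] args) {
        // a + b and b + a share a fingerprint, a - b and b - a do not.
        System.out.println(fingerprint((a, b) -> a + b).equals(fingerprint((a, b) -> b + a))); // prints true
        System.out.println(fingerprint((a, b) -> a - b).equals(fingerprint((a, b) -> b - a))); // prints false
    }
}
```

Note that floating-point rounding can make mathematically equal expressions produce slightly different outputs, so a real implementation would probably compare with a tolerance.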

Shortest possible resulting length after iterated string replacement

How would I go about reasonably efficiently finding the shortest possible output given by repeatedly applying replacements to an input sequence? I believe (please correct me if I am wrong) that this is exponential-time in the worst case, but I am not sure due to the second constraint below. The naive method certainly is.
I tried coding the naive method (for all possible replacements, for all valid positions, recurse on a copy of the input after applying the replacement at the position. Return the shortest of all valid recursions and the input, with a cache on the function to catch equivalent replacement sequences), but it is (unworkably) slow, and I'm pretty sure it's an algorithmic issue as opposed to the implementation.
A couple of things that may (or may not) make a difference:
Token is an enumerated type.
The length of the output of each entry in the map is strictly less than the input of the entry.
I do not need what replacements were done and where, just the resulting sequence.
So, as an example where each character is a token (for simplicity's sake), if I have the replacement map as aaba -> a, aaa -> ab, and aba -> bb, and I apply minimalString('aaaaa'), I want to get 'a'.
The actual method signature is something along the following lines:
List<Token> getMinimalAfterReplacements(List<Token> inputList, Map<List<Token>, List<Token>> replacements) {
?
}
Is there a better method than brute-force? If not, is there, for example, a SAT library or similar that could be harnessed? Is there any preprocessing to the map that could be done to make it faster when called multiple times with different token lists but with the same replacement map?
The code below is a Python version to find the shortest possible reduction. It is non-recursive but not too far from the naive algorithm. In every step it tries all possible single reductions, thus, obtaining a set of strings to reduce for the next step.
One optimization that helps in cases when there are "symbol eating" rules like "aa" -> "a" is to check the next set of strings for duplicates.
Another optimization (not implemented in the code below) would be to process the replacement rules into a finite automaton that finds locations of all possible single reductions with a single pass through the input string. This would not help the exponential nature of the main tree search algorithm, though.
class Replacer:
    def __init__(self, replacements):
        self.replacements = [[tuple(key), tuple(value)] for key, value in replacements.items()]

    def get_possible_replacements(self, input):
        "Return all possible variations where a single replacement was done to the input"
        result = []
        for replace_what, replace_with in self.replacements:
            # print(replace_what, replace_with)
            for p in range(1 + len(input) - len(replace_what)):
                if input[p : p + len(replace_what)] == replace_what:
                    input_copy = list(input[:])
                    input_copy[p : p + len(replace_what)] = replace_with
                    result.append(tuple(input_copy))
        return result

    def get_minimum_sequence_list(self, input):
        "Return the shortest irreducible sequence that can be obtained from the given input"
        irreducible = []
        to_reduce = [tuple(input)]
        to_reduce_new = []
        step = 1
        while to_reduce:
            print("Reduction step", step, ", number of candidates to reduce:", len(to_reduce))
            step += 1
            for current_input in to_reduce:
                reductions = self.get_possible_replacements(current_input)
                if not reductions:
                    irreducible.append(current_input)
                else:
                    to_reduce_new += reductions
            to_reduce = set(to_reduce_new[:])  # This dramatically reduces the tree width by removing duplicates
            to_reduce_new = []
        irreducible_sorted = sorted(set(irreducible), key=lambda x: len(x))
        # print("".join(input), "could be reduced to any of", ["".join(x) for x in irreducible_sorted])
        return irreducible_sorted[0]

    def get_minimum_sequence(self, input):
        return "".join(self.get_minimum_sequence_list(list(input)))

input = "aaaaa"
replacements = {
    "aaba" : "a",
    "aaa" : "ab",
    "aba" : "bb",
}
replacer = Replacer(replacements)
replaced = replacer.get_minimum_sequence(input)
print("The shortest string", input, "could be reduced to is", replaced)
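Since the question is about Java, here is a rough sketch of the same search in Java. To keep it short it works on plain strings rather than List<Token>, and it uses a global "seen" set instead of per-step deduplication; the class name ShortestReduction is made up. It relies on the constraint from the question that every rule strictly shortens the string (so no left-hand side is empty and the search terminates).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class ShortestReduction {
    // Breadth-first search over every string reachable by single replacements.
    static String shortest(String input, Map<String, String> rules) {
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(input);
        queue.add(input);
        String best = input;
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.length() < best.length()) {
                best = current;
            }
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                String lhs = rule.getKey();
                // Try the rule at every position where its left-hand side occurs.
                int pos = current.indexOf(lhs);
                while (pos >= 0) {
                    String next = current.substring(0, pos) + rule.getValue()
                            + current.substring(pos + lhs.length());
                    if (seen.add(next)) { // skip strings that were already generated
                        queue.add(next);
                    }
                    pos = current.indexOf(lhs, pos + 1);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("aaba", "a");
        rules.put("aaa", "ab");
        rules.put("aba", "bb");
        System.out.println(shortest("aaaaa", rules)); // prints a
    }
}
```

This is still exponential in the worst case; the seen set only stops the same intermediate string from being expanded twice.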
Just a simple idea which might reduce the branching: With rules like
ba -> c
ca -> b
and a string like
aaabaacaa
^ ^
you can do two substitutions and their order doesn't matter. This is already sort of covered by memoization, however, there's still a considerable overhead for generating the useless string. So I'd suggest the following rule:
After a substitution on position p, consider only substitutions on positions q such that
q + length(lhs_of_the_rule) > p
i.e., substitutions that either don't start to the left of the previous substitution or overlap it.
As a simple low-level optimization I'd suggest replacing the List<Token> with a String (or an encapsulated byte[] or short[] or whatever). The lower memory footprint should help the cache, and you can index an array by a string element (or two) in order to find out which rules may be applicable to it.

Produce Java literal syntax for arbitrary object in Java

I'm writing a tool to fill an arbitrary Java value object with arbitrary values, output the content in JSON, and output a list of assertions that can be pasted into a unit test.
At the core of this is:
final Method getter = object.getClass().getMethod(getterName, new Class<?>[0] );
System.out.println("assertEquals("
+ getter.invoke(object)
+ ", actual."
+ getter.getName() +
"());");
This outputs lines like:
assertEquals(42, actual.getIntegerValue());
assertEquals(foo, actual.getStringValue());
assertEquals([B#5ae80842, actual.getByteArrayValue());
Note that the string value is not properly quoted, and the byte array is not a Java byte array literal. I can improve this with a method to format the object depending on its type:
... + formatAsLiteral(getter.invoke(object)) ...
static String formatAsLiteral(Object obj) {
    if (obj instanceof String) {
        return "\"" + obj + "\"";
    } else {
        return obj.toString();
    }
}
But I want to support as many standard types as is practical - including arrays and possibly collections.
Is there a better way, than to add an if() for every type I can think of?
Here are a few alternatives:
A dispatch table
A callback system works by storing event handlers in an array. When the underlying event is detected the dispatch system loops through the array calling the callback functions in turn.
A lexical scanner
The Lexer class, below, streamlines the task of matching of regular-expression against the input, as well as that of producing Token objects that precisely describe the matched string and its location within the input stream.
A parser generator
The framework generates tree-walker classes using an extended version of the visitor design pattern which enables the implementation of actions on the nodes of the abstract syntax tree using inheritance.
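Applied to the question, the dispatch table alternative could be sketched as a map from class to formatter, roughly as below. This is only an illustration under my own assumptions: the LiteralFormatter name, the choice of supported types and the escaping rules are mine, and the escaping is far from complete.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class LiteralFormatter {
    // Dispatch table: runtime class -> function producing Java literal syntax.
    private static final Map<Class<?>, Function<Object, String>> FORMATTERS = new LinkedHashMap<>();
    static {
        FORMATTERS.put(String.class, o -> "\"" + escape((String) o) + "\"");
        FORMATTERS.put(Long.class, o -> o + "L");
        FORMATTERS.put(byte[].class, o -> formatBytes((byte[]) o));
    }

    // Escapes backslashes, double quotes and newlines only.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    private static String formatBytes(byte[] bytes) {
        StringBuilder sb = new StringBuilder("new byte[] {");
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(bytes[i]);
        }
        return sb.append("}").toString();
    }

    static String formatAsLiteral(Object obj) {
        if (obj == null) {
            return "null";
        }
        Function<Object, String> formatter = FORMATTERS.get(obj.getClass());
        // Fall back to toString() for types without a registered formatter.
        return formatter != null ? formatter.apply(obj) : obj.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatAsLiteral("foo"));               // prints "foo"
        System.out.println(formatAsLiteral(42));                  // prints 42
        System.out.println(formatAsLiteral(new byte[] {1, -2}));  // prints new byte[] {1, -2}
    }
}
```

Supporting another type then means registering one more entry rather than adding another if() branch.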
References
Understanding Dean Edwards' addevent JavaScript
A simple lexical scanner in Java
Open Source Parser Generators in Java

How to use or implement arrays in XQuery?

Is there any built in support for array in XQuery? For example, if we want to implement
the simple java program in xquery how we would do it:
(I am not asking to translate the entire program into xquery, but just asking
how to implement the array in line number 2 of the below code to xquery? I am
using marklogic / xdmp functions also).
java.lang.String test = new String("Hello XQuery");
char[] characters = test.toCharArray();
for (int i = 0; i < characters.length; i++) {
    if (characters[i] == (char) 13) {
        characters[i] = (char) 0x00;
    }
}
Legend:
hex 0x00 dec 0 : null
hex 0x0d dec 13: carriage return
hex 0x0a dec 10: line feed
hex 0x22 dec 34: dquote
The problem with converting your sample code to XQuery is not the absence of support for arrays, but the fact that x00 is not a valid character in XML. If it weren't for this problem, you could express your query with the simple function call:
translate($input, '', '')
Now, you could argue that's cheating, it just happens so that there's a function that does exactly what you are trying to do by hand. But if this function didn't exist, you could program it in XQuery: there are sufficient primitives available for strings to allow you to manipulate them any way you want. If you need to (and it's rarely necessary) you can convert a string to a sequence of integers using the function string-to-codepoints(), and then take advantage of all the XQuery facilities for manipulating sequences.
The lesson is, when you use a declarative language like XQuery or XSLT, don't try to use the same low-level programming techniques you were forced to use in more primitive languages. There's usually a much more direct way of expressing the problem.
XQuery has built-in support for sequences. The function tokenize() (as suggested by @harish.ray) returns a sequence. You can also construct one yourself using parentheses and commas:
let $mysequence := (1, 2, 3, 4)
Sequences are ordered lists, so you can rely on that. That is slightly different from a node-set returned from an XPath expression, which is usually in document order.
On a side note: actually, everything in XQuery is either a node-set or a sequence. Even if a function is declared to return one string or int, you can treat that returned value as if it is a sequence of one item. No explicit casting is necessary, for which there are no constructs in XQuery anyhow. Functions like fn:exists() and fn:empty() always work.
HTH!
Just for fun, here's how I would do this in XQuery if fn:translate did not exist. I think Michael Kay's suggestion would end up looking similar.
let $test := "Hello XQuery"
return codepoints-to-string(
for $c in string-to-codepoints($test)
return if ($c eq 32) then 44 else $c)
Note that I changed the transformation because of the problem he pointed out: 0 is not a legal codepoint. So instead I translated spaces to commas.
With MarkLogic, another option is to use http://docs.marklogic.com/json:array and its associated functions. The json:set-item-at function would allow coding in a vaguely imperative style. Coding both variations might be a good learning exercise.
There are two ways to do this.
Firstly you can create an XmlResults object using
XmlManager.createResults(), and use XmlResults.add() to add your
strings to this. You can then use the XmlResults object to set the
value of a variable in XmlQueryContext, which can be used in your
query.
Example:
XmlResults values = XMLManager.createResults();
values.add(new XmlValue("value1"));
values.add(new XmlValue("value2"));
XmlQueryContext.setVariableValue("files", values);
The alternative is to split the string in XQuery. You
can do this using the tokenize() function, which works using a
regular expression to match the string separator.
http://www.w3.org/TR/xpath-functions/#func-tokenize
Thanks.
A little outlook: XQuery 3.1 will provide native support for arrays. See http://www.w3.org/TR/xquery-31/ for more details.
You can construct an array like this:
let $myArray := tokenize('a b c d e f g', '\s')
(: $myArray[3] -> c :)
Please note that the first index of this pseudo-array is 1 not 0!
Since the question "How to use or implement arrays in XQuery?" is phrased generically (and thus shows up in search results on this topic), I would like to add a generic answer for future reference (making it a Community Wiki, so others may expand):
As Christian Grün has already hinted at, with XQuery 3.1 XQuery got a native array datatype, which is a subtype of the function datatype.
Since an array is an 'ordered list of values' and an XPath/XQuery sequence is as well, the first question that may arise is: "What's the difference?" The answer is simple: a sequence cannot contain another sequence. All sequences are automatically flattened. Not so an array, which can be an array of arrays. Just like sequences, arrays in XQuery can also hold any mix of other datatypes.
The native XQuery array datatype can be expressed in either of two ways: as [] or via array {}. The difference is that with the former constructor a comma is treated as a 'hard' comma, meaning that the following array consists of two members:
[ ("apples", "oranges"), "plums" ]
while the following will consist of three members:
array { ("apples", "oranges"), "plums" }
which means that the expression within the curly braces is first resolved to a flat sequence, whose items then become the members of the array.
Since array is a subtype of function, an array can be thought of as an anonymous function that takes a single parameter, the numeric index. To get the third member of an array named $foo, we can thus write:
$foo(3)
If an array contains another array as a member you can chain the function calls together, as in:
$foo(3)(5)
Along with the array datatype, special operators have been added that make it easy to look up the values of an array. One such operator (also used by the new map datatype) is the question mark followed by an integer (or an expression that evaluates to zero or more integers).
$foo?(3)
would, again, return the third member within the array, while
$foo?(3, 6)
would return the members 3 and 6.
The parentheses can be left out when working with literal integers. However, they are needed to form the lookup index from a dynamic expression, as in:
$foo?(3 to 6)
Here, the expression in the parentheses is evaluated to a sequence of integers, and thus the expression returns a sequence of all members from index position 3 to index position 6.
The asterisk * is used as wildcard operator. The expression
$foo?*
will return a sequence of all items in the array. Again, chaining is possible:
$foo?3?5
matches the previous example of $foo(3)(5).
More in-depth information can be found in the official spec: XML Path Language (XPath) 3.1 / 3.11.2 Arrays
Also, a new set of functions specific to arrays has been implemented. These functions reside in the namespace http://www.w3.org/2005/xpath-functions/array, which is conventionally bound to the prefix array, and are documented here: XPath and XQuery Functions and Operators 3.1 / 17.3 Functions that Operate on Arrays
