How to use or implement arrays in XQuery? - java

Is there any built-in support for arrays in XQuery? For example, if we want to implement
the simple Java program below in XQuery, how would we do it?
(I am not asking to translate the entire program into XQuery, just asking
how to implement the array in line number 2 of the code below in XQuery. I am
also using MarkLogic / xdmp functions.)
java.lang.String test = new String("Hello XQuery");
char[] characters = test.toCharArray();
for (int i = 0; i < characters.length; i++) {
    if (characters[i] == (char) 13) {
        characters[i] = (char) 0x00;
    }
}
Legend:
hex 0x00 dec 0 : null
hex 0x0d dec 13: carriage return
hex 0x0a dec 10: line feed
hex 0x22 dec 34: dquote

The problem with converting your sample code to XQuery is not the absence of support for arrays, but the fact that x00 is not a valid character in XML. If it weren't for this problem, you could express your query with the simple function call:
translate($input, '&#xD;', '&#x0;')
Now, you could argue that's cheating: it just so happens that there's a function that does exactly what you are trying to do by hand. But if this function didn't exist, you could program it in XQuery: there are sufficient primitives available for strings to allow you to manipulate them any way you want. If you need to (and it's rarely necessary) you can convert a string to a sequence of integers using the function string-to-codepoints(), and then take advantage of all the XQuery facilities for manipulating sequences.
The lesson is, when you use a declarative language like XQuery or XSLT, don't try to use the same low-level programming techniques you were forced to use in more primitive languages. There's usually a much more direct way of expressing the problem.

XQuery has built-in support for sequences. The function tokenize() (as suggested by @harish.ray) returns a sequence. You can also construct one yourself using parentheses and commas:
let $mysequence := (1, 2, 3, 4)
Sequences are ordered lists, so you can rely on the order of their items. That is slightly different from a node-set returned from an XPath expression, which is usually in document order.
On a side note: actually, everything in XQuery is either a node-set or a sequence. Even if a function is declared to return one string or int, you can treat that returned value as if it is a sequence of one item. No explicit casting is necessary, for which there are no constructs in XQuery anyhow. Functions like fn:exists() and fn:empty() always work.
HTH!

Just for fun, here's how I would do this in XQuery if fn:translate did not exist. I think Michael Kay's suggestion would end up looking similar.
let $test := "Hello XQuery"
return codepoints-to-string(
for $c in string-to-codepoints($test)
return if ($c eq 32) then 44 else $c)
Note that I changed the transformation because of the problem he pointed out: 0 is not a legal codepoint. So instead I translated spaces to commas.
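For readers coming from the Java side of the question, here is a rough Java counterpart of the codepoint mapping above. It is only a sketch: the class and method names (CodepointMapping, spacesToCommas) are made up for illustration, and it mirrors the space-to-comma substitution rather than the original carriage-return case.

```java
class CodepointMapping {
    // Map every space (codepoint 32) to a comma (codepoint 44),
    // leaving all other codepoints untouched.
    static String spacesToCommas(String input) {
        int[] mapped = input.codePoints()
                .map(c -> c == 32 ? 44 : c)
                .toArray();
        // Rebuild a String from the mapped codepoints.
        return new String(mapped, 0, mapped.length);
    }

    public static void main(String[] args) {
        System.out.println(spacesToCommas("Hello XQuery"));
    }
}
```

As in the XQuery version, no mutable array is needed: the string is turned into a stream of integers, transformed, and turned back into a string.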
With MarkLogic, another option is to use http://docs.marklogic.com/json:array and its associated functions. The json:set-item-at function would allow coding in a vaguely imperative style. Coding both variations might be a good learning exercise.

There are two ways to do this.
Firstly you can create an XmlResults object using
XmlManager.createResults(), and use XmlResults.add() to add your
strings to this. You can then use the XmlResults object to set the
value of a variable in XmlQueryContext, which can be used in your
query.
Example:
XmlResults values = XmlManager.createResults();
values.add(new XmlValue("value1"));
values.add(new XmlValue("value2"));
XmlQueryContext.setVariableValue("files", values);
The alternative is to split the string in XQuery. You
can do this using the tokenize() function, which works using a
regular expression to match the string separator.
http://www.w3.org/TR/xpath-functions/#func-tokenize
Thanks.

A little outlook: XQuery 3.1 will provide native support for arrays. See http://www.w3.org/TR/xquery-31/ for more details.

You can construct an array like this:
let $myArray := tokenize('a b c d e f g', '\s')
return $myArray[3] (: -> "c" :)
Please note that the first index of this pseudo-array is 1, not 0!

Since the question "How to use or implement arrays in XQuery?" is being held generic (and thus shows up in search results on this topic), I would like to add a generic answer for future reference (making it a Community Wiki, so others may expand):
As Christian Grün has already hinted at, with XQuery 3.1 XQuery got a native array datatype, which is a subtype of the function datatype.
Since an array is an 'ordered list of values' and an XPath/XQuery sequence is as well, the first question that may arise is: "What's the difference?" The answer is simple: a sequence cannot contain another sequence. All sequences are automatically flattened. Not so an array, which can be an array of arrays. Just like sequences, arrays in XQuery can also hold any mix of other datatypes.
The native XQuery array datatype can be constructed in either of two ways: as [] or via array {}. The difference is that, with the former constructor, a comma is treated as a 'hard' comma, meaning that the following array consists of two members:
[ ("apples", "oranges"), "plums" ]
while the following will consist of three members:
array { ("apples", "oranges"), "plums" }
which means, that the array expression within curly braces is resolved to a flat sequence first, and then memberized into an array.
Since Array is a subtype of function, an array can be thought of as an anonymous function, that takes a single parameter, the numeric index. To get the third member of an array, named $foo, we thus can write:
$foo(3)
If an array contains another array as a member you can chain the function calls together, as in:
$foo(3)(5)
Along with the array datatype, special operators have been added, which make it easy to look up the values of an array. One such operator (also used by the new Map datatype) is the question mark followed by an integer (or an expression that evaluates to zero or more integers).
$foo?(3)
would, again, return the third member within the array, while
$foo?(3, 6)
would return the members 3 and 6.
The parentheses can be left out when working with literal integers. However, they are needed to form the lookup index from a dynamic expression, as in:
$foo?(3 to 6)
Here, the expression in the parentheses is evaluated to a sequence of integers, so the lookup returns a sequence of all members from index position 3 to index position 6.
The asterisk * is used as wildcard operator. The expression
$foo?*
will return a sequence of all items in the array. Again, chaining is possible:
$foo?3?5
matches the previous example of $foo(3)(5).
More in-depth information can be found in the official spec: XML Path Language (XPath) 3.1 / 3.11.2 Arrays
Also, a new set of functions specific to arrays has been added. These functions reside in the namespace http://www.w3.org/2005/xpath-functions/array, which is conventionally bound to the prefix array, and are documented here: XPath and XQuery Functions and Operators 3.1 / 17.3 Functions that Operate on Arrays

Related

How to store mathematical formula in MS SQL Server DB and interpret it using JAVA?

I have to give the user the option to enter in a text field a mathematical formula and then save it in the DB as a String. That is easy enough, but I also need to retrieve it and use it to do calculations.
For example, assume I allow someone to specify the formula of employee salary calculation which I must save in String format in the DB.
GROSS_PAY = BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL
Assume that terms such as GROSS_PAY, BASIC_SALARY are known to us and we can make out what they evaluate to. The real issue is we can't predict which combinations of such terms (e.g. GROSS_PAY etc.) and other mathematical operators the user may choose to enter (not just +, -, ×, / but also the radical sign, indicating roots, and powers etc.). So how do we interpret this formula in string format once we have retrieved it from the DB, so we can do calculations based on the composition of the formula?
Building an expression evaluator is actually fairly easy.
See my SO answer on how to write a parser. With a BNF for the range of expression operators and operands you exactly want, you can follow this process to build a parser for exactly those expressions, directly in Java.
The answer links to a second answer that discusses how to evaluate the expression as you parse it.
So, you read the string from the database, collect the set of possible variables that can occur in the expression, and then parse/evaluate the string. If you don't know the variables in advance (seems like you must), you can parse the expression twice, the first time just to get the variable names.
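To make the "parse and evaluate as you go" idea concrete, here is a minimal recursive-descent sketch in Java. It is not the parser from the linked answers: the class name FormulaEvaluator, the tiny grammar (+, -, *, /, parentheses, unary minus) and the variable map are all assumptions chosen for illustration, and roots or powers would need extra grammar rules. It evaluates only the right-hand side of a formula.

```java
import java.util.Map;

class FormulaEvaluator {
    private final String src;
    private final Map<String, Double> vars;
    private int pos = 0;

    private FormulaEvaluator(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    // Parses and evaluates the whole formula; rejects trailing garbage.
    static double evaluate(String formula, Map<String, Double> vars) {
        FormulaEvaluator p = new FormulaEvaluator(formula, vars);
        double value = p.expression();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private double expression() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := '-' factor | '(' expression ')' | number | NAME
    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expression();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at position " + pos);
            return value;
        }
        int start = pos; // accept() has already skipped leading spaces
        if (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            while (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
            return Double.parseDouble(src.substring(start, pos));
        }
        while (pos < src.length() && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected operand at position " + pos);
        String name = src.substring(start, pos);
        Double value = vars.get(name);
        if (value == null) throw new IllegalArgumentException("Unknown variable: " + name);
        return value;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    // Consumes the expected character (after optional spaces) if present.
    private boolean accept(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }
}
```

A call such as FormulaEvaluator.evaluate("BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL", values) would then compute the salary from a map of known terms. Because unknown names and malformed input raise exceptions with a position, this also gives you a hook for error messages.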
As described in "Evaluating a math expression given in string form", there is a JavaScript engine available in Java which can evaluate a String containing operators.
Hope this helps.
You could build a string representation of a class that effectively wraps your expression and compile it using the system JavaCompiler (this requires a file system). You can also evaluate strings directly using JavaScript or Groovy. In each case, you need to figure out a way to bind variables. One approach would be to use a regex to find and replace known variable names with a call to a binding function:
getValue("BASIC_SALARY") - getValue("NO_PAY") + getValue("TOTAL_OT") + getValue("ALLOWANCE_TOTAL")
or
getBASIC_SALARY() - getNO_PAY() + getTOTAL_OT() + getALLOWANCE_TOTAL()
This approach, however, exposes you to all kinds of injection type security bugs; so, it would not be appropriate if security was required. The approach is also weak when it comes to error diagnostics. How will you tell the user why their expression is broken?
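The rewriting step itself might look roughly like the following sketch. The class name VariableBinder, the identifier pattern (upper-case letters, digits, underscores) and the getValue name are assumptions for illustration; the security caveats above still apply in full to whatever evaluates the resulting string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VariableBinder {
    // Assumed naming convention: identifiers such as BASIC_SALARY or TOTAL_OT.
    private static final Pattern NAME = Pattern.compile("\\b[A-Z][A-Z0-9_]*\\b");

    // Wrap every identifier in a call to a (hypothetical) getValue("...") function.
    static String bind(String expression) {
        return NAME.matcher(expression).replaceAll(
                m -> Matcher.quoteReplacement("getValue(\"" + m.group() + "\")"));
    }

    public static void main(String[] args) {
        System.out.println(bind("BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL"));
    }
}
```

Note that Matcher.replaceAll with a function argument needs Java 9 or later.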
An alternative is to use something like ANTLR to generate a parser in java. It's not too hard and there are a lot of examples. This approach will provide both security (users can't inject malicious code because it won't parse) and diagnostics.

Replacing Mathematical Operators with Variable Java

Currently trying to do a reverse polish calculator for one of my Uni homework tasks.
I have the program working fine when using a bunch of if/else statements to work out which operator was typed in and do the mathematical operation normally like num1+num2.
This is beyond the homework task we were set but what I'm trying to do now is replace the operator with a variable for example: "num1 + num2" would become "num1 variable num2" where variable equals "+" etc.
Is there a way to do this in Java?
Thanks in advance
Since you are interested in going beyond the scope of the training material, and assuming you've learned about interfaces already, I believe what you are looking for is a binary expression tree (that Wikipedia article actually explains it well).
Basically, you create an interface Expression with a double compute(Map<String, Double> variables) method. There will be two types of implementing classes:
Operand: Constant and Variable.
Operator: Plus, Minus, Multiply, Divide. Each will have two Expression fields: left and right.
Your text expression is then parsed into the expression tree:
// input: "num1 + num2 * 3"
// result expression tree will be built by parser using:
Expression a = new Variable("num1");
Expression b = new Variable("num2");
Expression c = new Constant(3);
Expression d = new Multiply(b, c);
Expression e = new Plus(a, d);
Map<String, Double> variables = /*defined elsewhere*/;
double result = e.compute(variables);
Your new assignment, should you choose to accept it, will be to write the expression classes and a parser to build the expression tree from a text expression.
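As a starting point, the expression classes could be sketched along these lines. Only Constant, Variable, Plus and Multiply are shown; Minus and Divide follow the same pattern. The error handling for unbound variables is my own assumption, and the parser is left out entirely.

```java
import java.util.Map;

interface Expression {
    double compute(Map<String, Double> variables);
}

// Operand: a fixed numeric value.
final class Constant implements Expression {
    private final double value;
    Constant(double value) { this.value = value; }
    public double compute(Map<String, Double> variables) { return value; }
}

// Operand: a named value looked up at evaluation time.
final class Variable implements Expression {
    private final String name;
    Variable(String name) { this.name = name; }
    public double compute(Map<String, Double> variables) {
        Double value = variables.get(name);
        if (value == null) throw new IllegalArgumentException("Unbound variable: " + name);
        return value;
    }
}

// Operator: adds the results of its two sub-expressions.
final class Plus implements Expression {
    private final Expression left;
    private final Expression right;
    Plus(Expression left, Expression right) { this.left = left; this.right = right; }
    public double compute(Map<String, Double> variables) {
        return left.compute(variables) + right.compute(variables);
    }
}

// Operator: multiplies the results of its two sub-expressions.
final class Multiply implements Expression {
    private final Expression left;
    private final Expression right;
    Multiply(Expression left, Expression right) { this.left = left; this.right = right; }
    public double compute(Map<String, Double> variables) {
        return left.compute(variables) * right.compute(variables);
    }
}
```

Operator precedence is not handled by these classes at all; it is encoded in the shape of the tree, which is why "num1 + num2 * 3" is built with Multiply nested inside Plus.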
Hope this will encourage you to go way beyond the training material, having some fun while playing.
First you can use a switch on String rather than if-then-else chain. Another way is to build a static final Map (E.g. HashMap) from String to Function. The strings are the operators. The Functions do the operation. In Java 8 you can give the functions as lambdas. I only have access through a phone right now so can't show code. Your question will be much better received if you add code showing what you mean.
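A minimal sketch of the map-of-lambdas idea, assuming Java 8 or later. The class name Operators and the choice of DoubleBinaryOperator as the function type are mine; any two-argument functional interface would do.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

class Operators {
    // Maps the operator token to the function that performs it.
    private static final Map<String, DoubleBinaryOperator> OPS = new HashMap<>();
    static {
        OPS.put("+", (a, b) -> a + b);
        OPS.put("-", (a, b) -> a - b);
        OPS.put("*", (a, b) -> a * b);
        OPS.put("/", (a, b) -> a / b);
    }

    // Looks up the operator and applies it to the two operands.
    static double apply(String operator, double num1, double num2) {
        DoubleBinaryOperator op = OPS.get(operator);
        if (op == null) throw new IllegalArgumentException("Unknown operator: " + operator);
        return op.applyAsDouble(num1, num2);
    }

    public static void main(String[] args) {
        System.out.println(apply("+", 2, 3));
    }
}
```

In a reverse Polish calculator you would pop two numbers off the stack, call apply with the token you just read, and push the result back.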

How is the "empty string" sequence represented under the hood in Java?

Throughout my career I've often seen calls like this:
if ("".equals(foo)) { /* do stuff */ }
How is the empty string understood in terms of data in the lower-levels of Java?
Specifically, by "Lower-levels of Java" I'm referring to the actual contents of memory or some C/C++ construct being used to represent the "" sequence, rather than high-level implementations in Java.
I had previously checked the Java Language Specification, which led me to this, and noting that the "empty string" wasn't really given much more definition than that, this is what led to the head-scratching.
I then ran javap on various classes, trying to tease out an answer through bytecode, but the behavior with regard to "How is the machine dealing with the sequence ""?" wasn't really any clearer. Having excluded bytecode and Java code, I posted the question here, hoping that someone would shed some light on the issue from a lower-level perspective.
There's no such thing as "the empty string character". A character is always a UTF-16 code unit, and there's no "empty" code unit. There's "an empty string" which is represented exactly the same way as any other string:
A char[] reference
An index into that char[]
A length
In this case, the length would be 0. The char[] reference could potentially be a reference to an empty char array, which could potentially be shared between all instances of String which have a length of 0.
(Code such as substring could be implemented by detecting 0-length requests and always returning the same reference to an empty string, but I'm not aware of implementations doing that.)
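What can be checked from plain Java is only the observable behaviour, not the memory layout. The following sketch (class and method names are hypothetical) shows that the empty string is an ordinary String object of length 0 and that the literal "" is interned like any other string literal.

```java
class EmptyStringDemo {
    // True when both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "";
        String created = new String();

        System.out.println(literal.length());               // length of the empty string
        System.out.println(literal.toCharArray().length);   // size of its character array copy
        System.out.println(literal.equals(created));        // equal content
        System.out.println(sameObject(literal, ""));        // two literals, one interned object
        System.out.println(sameObject(literal, created));   // a separately created object
    }
}
```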

Is StringBuffer the same as Strings in Ruby and Symbols the same as regular Java strings?

I just started reading this book Eloquent Ruby and I have reached the chapter about Symbols in Ruby.
Strings in Ruby are mutable, which means each string allocates its own memory, since the content can change, even when the content is equal. If I need a mutable String in Java I would use StringBuffer. However, since regular Java Strings are immutable, one String object can be shared by multiple references. So if I had two regular Strings with the content "Hello World", both references would point to the same object.
So is the purpose of Symbols in Ruby actually the same as "normal" String objects in Java? Is it a feature given to the programmer to optimize memory?
Is any of what I have written here true? Or have I misunderstood the concept of Symbols?
Symbols are close to strings in Ruby, but they are not the equivalent to regular Java strings, although they, too, do share some commonalities such as immutability. But there is a slight difference - there is more than one way to obtain a reference to a Symbol (more on that later on).
In Ruby, it is entirely possible to convert the two back and forth. There is String#to_sym to convert a String into a Symbol and there is Symbol#to_s to convert a Symbol into a String. So what is the difference?
To quote the RDoc for Symbol:
The same Symbol object will be created for a given name or string for the duration of a program‘s execution, regardless of the context or meaning of that name.
Symbols are unique identifiers. If the Ruby interpreter stumbles over let's say :mysymbol for the first time, here is what happens: Internally, the symbol gets stored in a table if it doesn't exist yet (much like the "symbol table" used by parsers; this happens using the C function rb_intern in CRuby/MRI), otherwise Ruby will look up the existing value in the table and use that. After the symbol gets created and stored in the table, from then on wherever you refer to the Symbol :mysymbol, you will get the same object, the one that was stored in that table.
Consider this piece of code:
sym1 = :mysymbol
sym2 = "mysymbol".to_sym
puts sym1.equal?(sym2) # => true, one and the same object
str1 = "Test"
str2 = "Test"
puts str1.equal?(str2) # => false, not the same object
Running it shows the difference. It illustrates the major difference between Java Strings and Ruby Symbols. If you want object identity for Strings in Java, you will only get it if you compare exactly the same reference to that String, whereas in Ruby it's possible to get the reference to a Symbol in multiple ways, as you saw in the example above.
The uniqueness of Symbols makes them perfect keys in hashes: the lookup performance is improved compared to regular Strings since you don't have to hash your key explicitly as it would be required by a String, you can simply use the Symbol's unique identifier for the lookup directly. By writing :somesymbol you tell Ruby to "give me that one thing that you stored under the identifier 'somesymbol'". So symbols are your first choice when you need to uniquely identify things as in:
hash keys
naming or referring to variable, method and constant names (e.g. obj.send :method_name )
But, as Jim Weirich points out in the article below, Symbols are not Strings, not even in the duck-typing sense. You can't concatenate them or retrieve their size or get substrings from them (unless you convert them to Strings first, that is). So the question when to use Strings is easy - as Jim puts it:
Use Strings whenever you need … umm … string-like behavior.
Some articles on the topic:
Ruby Symbols.
Symbols are not immutable strings
13 Ways of looking at a Ruby Symbol
The difference is that Java Strings need not point to the same object if they contain the same text. When declaring constant strings in your code, they normally do, since the compiler will put them in the constant pool.
However, if you create a String dynamically at runtime in Java, two Strings can perfectly well point to different objects and still contain the same text. You can, however, force them to share one object by interning the String objects (calling String.intern(), see the Java API).
A nice example can be found here.
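A small sketch of that behaviour, with a hypothetical class name and helper method:

```java
class InternDemo {
    // True when both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "Hello World";              // taken from the constant pool
        String dynamic = new String("Hello World");  // a distinct object created at runtime

        System.out.println(sameObject(literal, dynamic));          // different objects
        System.out.println(literal.equals(dynamic));               // same text
        System.out.println(sameObject(literal, dynamic.intern())); // intern() returns the pooled object
    }
}
```

So a Java String only behaves like a Ruby Symbol in the identity sense once it has been interned; otherwise equal text does not imply the same object.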

Most elegant isNumeric() solution for java

I'm porting a small snippet of PHP code to java right now, and I was relying on the function is_numeric($x) to determine if $x is a number or not. There doesn't seem to be an equivalent function in java, and I'm not satisfied with the current solutions I've found so far.
I'm leaning toward the regular expression solution found here: http://rosettacode.org/wiki/Determine_if_a_string_is_numeric
Which method should I use and why?
Note that the PHP is_numeric() function will correctly determine that hex and scientific notation are numbers, which the regex approach you link to will not.
One option, especially if you are already using Apache Commons libraries, is to use NumberUtils.isNumber(), from Commons-Lang. It will handle the same cases that the PHP function will handle.
Have you looked into using the StringUtils library?
There's an isNumeric() function which might be what you're looking for.
(Note that "" would be evaluated to true)
It's usually a bad idea to have a number in a String. If you want to use this number then parse it and use it as a numeric. You shouldn't need to "check" if it's a numeric, either you want to use it as a numeric or not.
If you need to convert it, then you can use any parser from Integer.parseInt(String) to new BigDecimal(String).
If you just need to check that the content can be seen as a numeric then you can get away with regular expressions.
And don't use the parseInt if your string can contain a float.
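If you go the parsing route and still want a boolean check, one possible sketch is to let BigDecimal do the parsing and treat the exception as "not numeric". The class and method names are made up for illustration, and the decision to trim whitespace and reject null is my own assumption. Note that this accepts scientific notation such as 1e5 but not hex.

```java
import java.math.BigDecimal;

class NumericCheck {
    // Returns true if the string can be parsed as a decimal number.
    static boolean isNumeric(String s) {
        if (s == null) {
            return false;
        }
        try {
            new BigDecimal(s.trim());
            return true;
        } catch (NumberFormatException e) {
            // Empty strings, letters, hex literals etc. end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isNumeric("12.5"));
        System.out.println(isNumeric("abc"));
    }
}
```

If you need the value afterwards anyway, it is simpler to parse once and handle the exception at that point instead of checking first.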
Optionally you can use a regular expression as well.
if (theString.matches("((-|\\+)?[0-9]+(\\.[0-9]+)?)+"))
    return true;
return false;
Did you try Integer.parseInt()? (I'm not sure of the method name, but the Integer class has a method that creates an Integer object from strings). Or if you need to handle non-integer numbers, similar methods are available for Double objects as well. If these fail, an exception is thrown.
If you need to parse very large numbers (larger than int/double), and don't need the exact value, then a simple regex based method might be sufficient.
In a strongly typed language, a generic isNumeric(String num) method is not very useful. 13214384348934918434441 is numeric, but won't fit in most types. Many of those where it does fit won't return the same value.
As Colin has noted, carrying numbers in Strings within the application is not recommended. The isNumeric function should only be applicable for input data on interface methods. These should have a more precise definition than isNumeric. Others have provided various solutions. Regular expressions can be used to test a number of conditions at once, including String length.
Just use
if ((x instanceof Number)
        // if checking for a parsable number as well
        || (x instanceof String && ((String) x).matches("((-|\\+)?[0-9]+(\\.[0-9]+)?)+"))
) {
    ...
}
// All numeric types, including BigDecimal, extend Number
