Error with using parameters of RestCypherQueryEngine - java

I am developing with the neo4j-rest-binding API, but I have run into a problem when using parameters with RestCypherQueryEngine.
QueryResult<Map<String,Object>> result = engine.query("MATCH (n:{label}) RETURN n", MapUtil.map("label", label));
label is the parameter I assign in the map, but the query fails with this error:
org.neo4j.rest.graphdb.RestResultException: Invalid input '{': expected whitespace or an identifier (line 1, column 10)
"MATCH (n:{label}) RETURN n"
^ at
SyntaxException
org.neo4j.cypher.internal.compiler.v2_0.parser.CypherParser$$anonfun$parse$1.apply(CypherParser.scala:51)
org.neo4j.cypher.internal.compiler.v2_0.parser.CypherParser$$anonfun$parse$1.apply(CypherParser.scala:41)
...
I can use another method to solve this problem:
QueryResult<Map<String,Object>> result = engine.query("MATCH (n:" + label +") RETURN n", null);
But I think the above method is not appropriate when I want to pass multiple parameters.

:{ is a syntactical error. As the exception tells you, Cypher expects an identifier after a colon - namely, the name of a label - and an identifier (as in most languages) cannot contain a bracket.
It sounds like you're confused about the difference between labels and the blocks in braces that follow them:
The following would be valid: MATCH (n:employee{name:"foo"}) Here, employee is a label. You can apply an arbitrary number of labels delimited by colons. {name:"foo"} is a property map - note that it contains both the field you want to match and the value. So, this query will return all nodes labelled employee with a name value of "foo". MATCH (n:employee:custodian{name:"foo"}) will give you all employees who are custodians named "foo".
If you want all nodes with a name value of "foo", use MATCH (n {name:"foo"}) (note the absence of a colon).
Edit (responding to your comment) There are two differences between your query and the one in the example you're referring to. First, start n=node({id}) return n is, obviously, a START clause, which does very different things and has different syntactical rules from a MATCH clause: the id in ({id}) is simply a value to look up in an index. In a MATCH clause, what goes inside a { } block are key-value pairs, as explained above. Inside such a block (i.e. a set of braces), colons are used to separate keys from values. A colon outside the braces in a MATCH clause is used to separate labels, which are different things entirely.
The second difference is that, if you look more closely at the START clause, there is a parenthesis separating the colon from the bracket. :{ is never okay, which is what your error message is telling you.
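Since the label position takes an identifier, as the error message says, a parameter cannot stand in for it. One possible way to combine the concatenation workaround from the question with real parameters for values is sketched below. The class name, the method names and the name property are hypothetical, and the {name} placeholder assumes the same Cypher parameter syntax the question already uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

class LabelQueryBuilder {
    // Accept only plain identifiers, because the label is concatenated into the query text.
    private static final Pattern SAFE_LABEL = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Builds the query text: the label is spliced in, the value stays a parameter. */
    static String matchByLabel(String label) {
        if (label == null || !SAFE_LABEL.matcher(label).matches()) {
            throw new IllegalArgumentException("Unsafe label: " + label);
        }
        // "name" is a hypothetical property used only for illustration.
        return "MATCH (n:" + label + ") WHERE n.name = {name} RETURN n";
    }

    /** Builds the parameter map that goes with the query above. */
    static Map<String, Object> params(String name) {
        Map<String, Object> p = new HashMap<>();
        p.put("name", name);
        return p;
    }
}
```

The two results would then be passed to engine.query(query, params) in the same way as in the question. Any number of value parameters can be added to the map, while the label is validated before it is concatenated.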

Related

Match custom pattern in regex multiple times

I am trying to parse a query which I need to modify to replace a specific property and its value with another property and different values. I am struggling to write a regex that will match the specific property and its value that I need.
Here are some examples to illustrate my point. test:property is the property name that we need to match.
Property with a single value: test:property:schema:Person
Property with multiple values (there is no limit on how many values there can be - this example uses 3): test:property:(schema:Person OR schema:Organization OR schema:Place)
Property with a single value in brackets: test:property:(schema:Person)
Property with another property in the query string (i.e. there are other parts of the string that I'm not interested in): test:property:schema:Person test:otherProperty:anotherValue
Also note that other combinations are possible such as other properties being before the property I need to capture, my property having multiple values with another property present in the query.
I want to match on the entire test:property section with each value captured within that match. Given the examples above these are the results I am looking for:
# | Match | Groups
1 | test:property:schema:Person | schema:Person
2 | test:property:(schema:Person OR schema:Organization OR schema:Place) | schema:Person, schema:Organization, schema:Place
3 | test:property:(schema:Person) | schema:Person
4 | test:property:schema:Person | schema:Person
Note: #1 and #4 produce the same output. I wanted to illustrate that the rest of the string should be ignored (I only need to change the test:property key and value).
The pattern of schema:Person is defined as \w+\:\w+, i.e. one or more word characters, followed by a colon, followed by one or more word characters.
If we define the known parts of the string with names I think I can express what I want to match.
schema:Person - <TypeName> - note that the first part, schema in this case, is not fixed and can be different
test:property - <MatchProperty>
<MatchProperty>: // property name (which is known and the same - in the examples this is `test:property`) followed by a colon
( // optional open bracket
<TypeName>
(OR <TypeName>)* // optional additional TypeNames separated by an OR
) // optional close bracket
Every example I've found has had simple alphanumeric characters in the repeating section but my repeating pattern contains the colon which seems to be tripping me up. The closest I've got is this:
(test\:property:(?:\(([\w+\:\w+]+ [OR [\w+\:\w+]+)\))|[\w+\:\w+]+)
Which works okayish when there are no other properties (although the match for example #2 contains the entire property and value as the first group result, and a second group with the property value) but goes crazy when other properties are included.
Also, putting that regex through https://regex101.com/ I know it's not right as the backslash characters in the square brackets are being matched exactly. I started to have a go with capturing and non-capturing groups but got as far as this before giving up!
(?:(\w+\:\w+))(?:(\sOR\s))*(?:(\w+\:\w+))*
This isn't a complete solution if you want pure regex because there are some limitations to regex and Java regex in particular, but the regexes I came up with seem to work.
If you're looking to match the entire sequence, the following regex will work.
test:property:(?:\((\w+:\w+)(?:\sOR\s(\w+:\w+))*\)|(\w+:\w+))
Unfortunately, the repeated capture groups will only capture the last match, so in queries with multiple values (like example 2), groups 1 and 2 will be the first and last values (schema:Person and schema:Place). In queries without parentheses, the value will be in group 3.
If you know the maximum number of values, you could just generate a massive regex that will have enough groups, but this might not be ideal depending on your application.
The other regex to find values in groups of arbitrary length uses regex's positive lookbehind to match valid values. You can then generate an array of matches.
(?<=test:property:(?:(?:\((?:\w+:\w+\sOR\s)+)|\(?))\w+:\w+
The issue with this method is that Java lookbehind appears to have some limitations, specifically not allowing unbounded or complex quantifiers. I'm not a Java person so I haven't tried things out for myself, but it seems like this wouldn't work either. If someone else has another solution, please post another answer!
With this in mind, I would probably suggest going with a combination regex + string parsing method. You can use regex to parse out the value or multiple values (separated by OR), then split the string to get your final values.
To match either the entire part inside parentheses or the single value without parentheses, you can use this regex:
test:property:(?:\((\w+:\w+(?:\sOR\s\w+:\w+)*)\)|(\w+:\w+))
It's still split into two groups where one matches values with parentheses and the other matches values without (to avoid matching unpaired parentheses), but it should be usable.
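A minimal Java sketch of the combined regex and string-splitting approach, using the regex above unchanged. The class and method names are hypothetical, and only the first test:property occurrence in the query is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PropertyValueExtractor {
    // Group 1: everything inside the parentheses; group 2: a single bare value.
    private static final Pattern PROPERTY = Pattern.compile(
        "test:property:(?:\\((\\w+:\\w+(?:\\sOR\\s\\w+:\\w+)*)\\)|(\\w+:\\w+))");

    /** Returns the values of the first test:property found, or an empty list. */
    static List<String> extractValues(String query) {
        List<String> values = new ArrayList<>();
        Matcher m = PROPERTY.matcher(query);
        if (m.find()) {
            // Exactly one of the two groups participates in a successful match.
            String raw = m.group(1) != null ? m.group(1) : m.group(2);
            // Split the captured text on the OR separators to get each value.
            values.addAll(Arrays.asList(raw.split("\\sOR\\s")));
        }
        return values;
    }
}
```

m.group() gives the whole test:property section if it needs to be replaced, and the returned list holds the individual values regardless of how many there are.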
If you want to play around with these regexes or learn more, here's a regexr: https://regexr.com/65kma

KNIME: Compare if One Column Contains a subset of another column

In Knime I am trying to compare if a value in one column is contained within another column. I tried to do this using "LIKE" in the Rule Engine but couldn't get the wildcards to work with a column input instead of a string. E.g.
For row1 I want to check if column 1, row 1 is within column 2, row 1
For row2 I want to check if column 1, row 2 is within column 2, row 2
Like is "ABC" contained within "test ABCtest"
Does the "LIKE" in Rule Engine only allow hard coded strings for comparison? Other ideas to achieve this? Thank you for the help!
The String Manipulation node with the regexMatcher function can help here, though the result will be a String (with the values True/False by default), so a further node is required if, for example, a number is needed. If you want a different String, you can use the ?: ternary operator, like == "True" ? "when true" : join("when false it was because '", $columnReference$, "' was not found").
You can use regexMatcher like this (\Q and \E stop the content of the Reference column from being treated as a regular expression, except when it contains \E):
regexMatcher($text$, join(".*?\\Q", $Reference$, "\\E.*+")) == "True" ? "vrai" : "faux"
The Rule Engine allows wildcards with the LIKE operator, but it does not allow wildcards combined with column references. The following will work fine:
$column1$ LIKE "*test*" => "1"
The following is accepted as well, but will not do what you want:
$column1$ LIKE "*$column2$*" => "1"
The reason is that inside double quotes the $...$ syntax is not recognized as a column reference, so you do not get the values from column2. Instead you compare against the same literal string every time: "*$column2$*".
Additionally you can use indexOf() function in String Manipulation or Column Expressions node that will return the first position of string value from column1 in column2. If not found the function will return -1. Follow it with Rule Engine node to add appropriate indication.
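For readers who want to see the two ideas outside the KNIME expression syntax (for example inside a Java Snippet node), here is a plain-Java sketch. The class and method names are hypothetical. Pattern.quote is used in place of writing \Q and \E by hand, because it also copes with a reference value that itself contains \E.

```java
import java.util.regex.Pattern;

class ContainsCheck {
    /** Regex route: quote the reference so every character is matched literally. */
    static boolean containsViaRegex(String text, String reference) {
        return Pattern.compile(Pattern.quote(reference)).matcher(text).find();
    }

    /** indexOf route: a result of -1 means the reference was not found. */
    static boolean containsViaIndexOf(String text, String reference) {
        return text.indexOf(reference) >= 0;
    }
}
```

Both methods answer the per-row question "is column 1 contained in column 2" when called with the column 2 value as text and the column 1 value as reference.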

Shunting-yard functions

I am using the Shunting-Yard algorithm (https://en.wikipedia.org/wiki/Shunting-yard_algorithm) in a Java program in order to create a calculator. I am almost done, but I still have to implement functions. I have run into a problem: I want the calculator to automatically multiply variables like x and y when put together - Example: calculator converts xy to x*y. Also, I want the calculator to convert (x)(y) to (x)*(y) and x(y) to x*(y). I have done all of this using the following code:
infix = infix.replaceAll("([a-zA-Z])([a-zA-Z])", "$1*$2");
infix = infix.replaceAll("([a-zA-Z])\\(", "$1*(");
infix = infix.replaceAll("\\)\\(", ")*(");
infix = infix.replaceAll("\\)([a-zA-Z])", ")*$1");
(In my calculator, variable names are always single characters.)
This works great right now, but when I implement functions this will, of course, not work. It will turn "sin(1)" into "s*i*n*(1)". How can I make this code do the multiplication conversion only for variables, and not for functions?
Preprocessing the input to parse isn't a good way to implement what you want. The text replacement can't know what the parsing algorithm knows and you also lose the original input, which can be useful for printing helpful error messages.
Instead, you should decide what to do according to the context. Keep the type of the previously parsed token, with a special type for the beginning of the input.
If the previous token was a value token – a number, a variable name or the closing brace of a subexpression – and the current one is a value token, too, emit an extra multiplication operator.
The same logic can be used to decide whether a minus sign is a unary negation or a binary subtraction: It's a subtraction if the minus is found after a value token and a negation otherwise.
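The context rule above can be sketched as a single pass over an already tokenized input. This is only an illustration under stated assumptions: tokens are plain strings, variables are single letters as in the question, "neg" is a made-up marker for unary negation, and all class and method names are hypothetical. Function names are not treated as values here, so "sin" followed by "(" gets no multiplication.

```java
import java.util.ArrayList;
import java.util.List;

class ContextPass {
    static boolean isNumber(String t) { return t.matches("\\d+(\\.\\d+)?"); }
    static boolean isVariable(String t) { return t.matches("[a-zA-Z]"); }

    /** True if the token can end an operand: a number, a variable or ")". */
    static boolean endsValue(String t) { return isNumber(t) || isVariable(t) || t.equals(")"); }

    /** True if the token can begin an operand: a number, a variable or "(". */
    static boolean startsValue(String t) { return isNumber(t) || isVariable(t) || t.equals("("); }

    static List<String> annotate(List<String> tokens) {
        List<String> out = new ArrayList<>();
        boolean prevIsValue = false; // the beginning of the input counts as "no value yet"
        for (String t : tokens) {
            if (prevIsValue && startsValue(t)) {
                out.add("*");       // two values in a row: implicit multiplication
            }
            if (t.equals("-") && !prevIsValue) {
                out.add("neg");     // minus not preceded by a value: unary negation
            } else {
                out.add(t);         // everything else passes through unchanged
            }
            prevIsValue = endsValue(t);
        }
        return out;
    }
}
```

In a real parser the same decision would be made while feeding tokens to the shunting-yard loop rather than in a separate pass, which keeps the original input available for error messages.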
Your idea to convert x(y) to x * (y) will, of course, clash with function call syntax.
We can break this down into two parts. There is one rule for bracketed expressions and another for multiplications.
Rather than the Wikipedia article, which is deliberately simplified for explanatory purposes, I would follow a more detailed example like Parsing Expressions by Recursive Descent, which deals with bracketed expressions.
This is the code I use for my parser which can work with implicit multiplication. I have multi-letter variable names and use a space to separate different variables so you can have "2 pi r".
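protected void expression() throws ParseException {
    prefixSuffix(); // parse the first operand
    Token t = it.peekNext();
    while(t!=null) {
        if(t.isBinary()) {
            // explicit binary operator: push it, then parse its right operand
            pushOp(t);
            it.consume();
            prefixSuffix();
        }
        else if(t.isImplicitMulRhs()) {
            // the next token starts an operand: push the artificial multiplication
            // operator; the token itself is consumed by prefixSuffix()
            pushOp(implicitMul);
            prefixSuffix();
        }
        else
            break;
        t=it.peekNext();
    }
    // pop the remaining operators of this (sub)expression
    while(!sentinel.equals(ops.peek())) {
        popOp();
    }
}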
This requires a few other functions.
I've used a separate tokenizing step which breaks the input into discrete tokens. The Token class has a number of methods; in particular, Token.isBinary() tests whether the token is a binary operator like +, =, *, /. Another method, Token.isImplicitMulRhs(), tests whether the token can appear on the right-hand side of an implicit multiplication; this is true for numbers, variable names, and left brackets.
An Iterator<Token> is used for the input stream. it.peekNext() looks at the next token and it.consume() moves to the next token in the input.
pushOp(Token) pushes a token onto the operator stack and popOp() removes one. pushOp() has the logic to handle the precedence of different operators: before pushing, it pops operators for as long as compareOps(ops.peek(), op) is true.
protected void pushOp(Token op)
{
    while(compareOps(ops.peek(),op))
        popOp();
    ops.push(op);
}
Of particular note is implicitMul, an artificial token with the same precedence as multiplication, which is pushed onto the operator stack.
prefixSuffix() handles expressions which can be numbers and variables with optional prefix or suffix operators. It will recognise "2", "x", "-2" and "x++", removing tokens from the input and adding them to the output or operator stack as appropriate.
We can think of this routine in BNF as
<expression> ::=
<prefixSuffix> ( <binaryOp> <prefixSuffix> )* // normal binary ops x+y
| <prefixSuffix> ( <prefixSuffix> )* // implicit multiplication x y
Handling brackets is done in prefixSuffix(). If this detects a left bracket, it will then recursively call expression(). To detect the matching right bracket, a special sentinel token is pushed onto the operator stack. When the right bracket is encountered in the input, the main loop breaks, all operators on the operator stack are popped until the sentinel is encountered, and control returns to prefixSuffix(). Code for this might look like
void prefixSuffix() throws ParseException {
    Token t = it.peekNext();
    if(t.equals('(')) {
        it.consume(); // advance the input
        ops.push(sentinel); // mark where this bracketed expression starts
        expression(); // parse until ')' encountered
        t = it.peekNext();
        if(t != null && t.equals(')')) {
            it.consume(); // advance the input
            return;
        } else throw new ParseException("Unmatched (");
    }
    // handle variable names, numbers etc
}
Another approach may be the use of tokens, in a similar way to how a parser works.
The first phase would be to convert the input text into a list of tokens, which are objects that represent both the type of entity found and its value.
For example you can have a variable token, with its value being the name of the variable ('x', 'y', etc.), a token for open or close parenthesis, etc.
Since, I assume, you know in advance the names of the functions that can be used by the calculator, you'll also have a function token, with its value being the function name.
So the output of the tokenizing phase differentiates between variables and functions.
Implementing this is not too hard, just always try to match function names first,
so "sin" will be recognized as a function and not as three variables.
Now the second phase can be to insert the missing multiplication operators. This will not be hard now, since you know to insert them only between:
{VAR, RIGHT_PAREN} and {VAR, LEFT_PAREN, FUNCTION}
But never between FUNCTION and LEFT_PAREN.
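The two phases can be sketched in Java as follows. This is an illustration rather than a complete tokenizer: the function list (sin, cos, tan, sqrt), the operator set and all class and method names are assumptions, and NUMBER has been added to both sides of the rule so that inputs like 2x also work.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImplicitMulTokenizer {
    enum Type { FUNCTION, NUMBER, VAR, LEFT_PAREN, RIGHT_PAREN, OPERATOR }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
    }

    // Function names come first in the alternation, so "sin" wins over s, i, n.
    private static final Pattern TOKEN = Pattern.compile(
        "(sin|cos|tan|sqrt)|(\\d+(?:\\.\\d+)?)|([a-zA-Z])|(\\()|(\\))|([-+*/^])");
    // Token type for each capturing group of TOKEN, in group order.
    private static final Type[] GROUP_TYPES = {
        Type.FUNCTION, Type.NUMBER, Type.VAR, Type.LEFT_PAREN, Type.RIGHT_PAREN, Type.OPERATOR };

    private static final Set<Type> LEFT = EnumSet.of(Type.VAR, Type.NUMBER, Type.RIGHT_PAREN);
    private static final Set<Type> RIGHT =
        EnumSet.of(Type.VAR, Type.NUMBER, Type.LEFT_PAREN, Type.FUNCTION);

    /** Phase 1: turn the input text into typed tokens. */
    static List<Token> tokenize(String input) {
        String s = input.replaceAll("\\s+", "");
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            for (int g = 1; g <= GROUP_TYPES.length; g++) {
                if (m.group(g) != null) {
                    tokens.add(new Token(GROUP_TYPES[g - 1], m.group(g)));
                    break;
                }
            }
            pos = m.end();
        }
        return tokens;
    }

    /** Phase 2: insert "*" between a LEFT-type token and a RIGHT-type token. */
    static String withExplicitMul(String input) {
        StringBuilder out = new StringBuilder();
        Token prev = null;
        for (Token t : tokenize(input)) {
            if (prev != null && LEFT.contains(prev.type) && RIGHT.contains(t.type)) {
                out.append('*');
            }
            out.append(t.text);
            prev = t;
        }
        return out.toString();
    }
}
```

Because FUNCTION is never in the left-hand set, a function name followed by a left parenthesis stays a function call. In a full calculator the token list itself, rather than a rebuilt string, would be handed to the shunting-yard step.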

How to nest a query with Neo4j?

I am trying to do a sort of nested Neo4j query in Java, which first labels a subset of nodes and then tries to match certain patterns among them. More specifically it is like combining 2 queries of this type:
1 - MATCH (n)-[r:RELATIONSHIP*1..3]->(m) set m:LABEL
2 - MATCH (p:LABEL)-[r2:RELATIONSHIP]->(q:OTHERLABEL) where r2.time<100 return p,r2,q
Is there a way I can merge these two queries into one using the Java function engine.execute()?
'p' in query #2 will, in general, correspond to a superset of 'm' in query #1. If that is your intention, then the following should work. Notice that the 2 MATCH statements have no common variables, but a WITH is required by the Cypher syntax, so I arbitrarily picked the variable 'm' to pass to the second MATCH (even though it will be ignored).
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (p:LABEL)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100
RETURN p,r2,q;
If you intend 'm' and 'p' to be exactly the same, then just replace '(p:LABEL)' with '(m)':
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (m)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100
RETURN m,r2,q;

Regex to return unique lines when pattern matched

I am parsing a log file and trying to match error statements. The part of the line I am matching, "error CS", will apply to numerous lines, some duplicates, some not. Is there a way I can avoid returning the duplicates? I am using the Java flavor of regex.
example: my simple regex returns
Class1.cs(16,27): error CS0117: 'string' does not contain a definition for 'empty'
Class1.cs(34,20): error CS0103: The name 'thiswworked' does not exist in the current context
Class1.cs(16,27): error CS0117: 'string' does not contain a definition for 'empty'
Class1.cs(34,20): error CS0103: The name 'thiswworked' does not exist in the current context
would like it to return:
Class1.cs(16,27): error CS0117: 'string' does not contain a definition for 'empty'
Class1.cs(34,20): error CS0103: The name 'thiswworked' does not exist in the current context
One solution is to match using your regexp and then put the line into a data structure like a set which deals with removing duplicates for you. At the end of parsing just print the contents of the set.
If you're concerned about order you could add to a map of some kind with the line as the key and the line number as the value (perhaps checking for a matching entry before inserting). If you sort by value you'll get a list of the first instance of a given line.
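A minimal sketch of the set-based approach in Java. Instead of a map sorted by line number, it uses a LinkedHashSet, which drops duplicates and keeps first-seen order in one step. The class name, the method name and the exact pattern error CS followed by digits are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

class UniqueErrorLines {
    // Assumed shape of the marker: "error CS" followed by the error number.
    private static final Pattern ERROR = Pattern.compile("error CS\\d+");

    /** Returns each matching line once, in the order it first appeared. */
    static List<String> uniqueErrors(Iterable<String> lines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            if (ERROR.matcher(line).find()) {
                seen.add(line); // adding a duplicate leaves the set unchanged
            }
        }
        return new ArrayList<>(seen);
    }
}
```

The regex only decides whether a line is an error line; the set carries the state that remembers which lines have already been seen.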
Technically speaking, with a regular expression, this is not possible. You need something more powerful.
Regular expressions are meant for matching regular languages. The pattern you are attempting to match is not regular.
You require the expression to remember some 'state', the previously matched errors, and regular expressions are not meant to handle this type of computation. A Turing Machine is capable of saving state. This is more along the lines of what you need. (Java will fit the bill nicely.)
This could be fairly easily solved by adding some extra logic into your log parser after you find all of the error lines.
