I use the REST API to run a query, and I have a performance issue when I use parameters.
{
"statements" : [ {
"statement" : "MATCH (n:NetworkElement)-[:Attribute]-(r:Realm) WHERE n.tag IN {tags} RETURN r.name, collect(n.tag)",
"parameters" : {
"tags" : [
"tag1",
"tag2",
"tag3", ...]
}
} ]
}
This query takes 5 s, which is very long.
I then tried including the tags array directly in the statement:
{
"statements" : [ {
"statement" : "MATCH (n:NetworkElement)-[:Attribute]-(r:Realm) WHERE n.tag IN [\"tag1\",\"tag2\",\"tag3\", ...] RETURN r.name, collect(n.tag)",
"parameters" : {}
} ]
}
Now the query takes only 40 ms.
Can someone explain why? Is there a way to optimize the parameterized version?
Thanks in advance
Edit 2015-05-27
After testing with 2.2.2, the problem still occurs, but under different conditions.
I run this query without parameters:
{
"statements": [
{
"statement" : "PROFILE MATCH (ne:NetworkElement {_type:'interface'})-[:Connect*1..]->(s:NetworkElement) WHERE s.tag IN [\"mytag\"] RETURN s.tag, collect(ne.tag)",
"parameters" : {}
}
]
}
The query executes in ~100 ms without cache. The plan is:
{
"root":{
"operatorType":"EagerAggregation",
"DbHits":1468,
"Rows":1,
"version":"CYPHER 2.2",
"KeyNames":"s.tag",
"EstimatedRows":0,
"planner":"COST",
"identifiers":[
"collect(ne.tag)",
"s.tag"
],
"children":[
{
"operatorType":"Projection",
"LegacyExpression":"ne",
"Rows":734,
"DbHits":1468,
"EstimatedRows":0,
"identifiers":[
" UNNAMED46",
"ne",
"s",
"s.tag"
],
"children":[
{
"operatorType":"Filter",
"LegacyExpression":"(ne:NetworkElement AND ne._type == { AUTOSTRING0})",
"Rows":734,
"DbHits":3402,
"EstimatedRows":0,
"identifiers":[
" UNNAMED46",
"ne",
"s"
],
"children":[
{
"operatorType":"VarLengthExpand(All)",
"ExpandExpression":"(s)-[ UNNAMED46:Connect*]->(ne)",
"Rows":1134,
"DbHits":2269,
"EstimatedRows":0,
"identifiers":[
" UNNAMED46",
"ne",
"s"
],
"children":[
{
"operatorType":"NodeUniqueIndexSeek",
"Index":":NetworkElement(tag)",
"Rows":1,
"DbHits":1,
"EstimatedRows":0.9999999999971109,
"identifiers":[
"s"
],
"children":[
]
}
]
}
]
}
]
}
]
}
}
Now I run the same query with parameters:
{
"statements": [
{
"statement" : "PROFILE MATCH (ne:NetworkElement)-[:Connect*1..]->(s:NetworkElement) WHERE ne._type = {endNeType} AND s.tag IN {startTags} RETURN s.tag, collect(ne.tag)",
"parameters" : {
"endNeType" : "interface",
"startTags" : ["mytag"]
}
}
]
}
The query takes 980 ms to execute, and the plan is not the same:
{
"root":{
"operatorType":"EagerAggregation",
"DbHits":1468,
"Rows":1,
"version":"CYPHER 2.2",
"KeyNames":"s.tag",
"EstimatedRows":0,
"planner":"COST",
"identifiers":[
"collect(ne.tag)",
"s.tag"
],
"children":[
{
"operatorType":"Projection",
"LegacyExpression":"ne",
"Rows":734,
"DbHits":1468,
"EstimatedRows":0,
"identifiers":[
" UNNAMED26",
"ne",
"s",
"s.tag"
],
"children":[
{
"operatorType":"Filter",
"LegacyExpression":"(any(-_-INNER-_- in {startTags} where s.tag == -_-INNER-_-) AND s:NetworkElement)",
"Rows":734,
"DbHits":104427,
"EstimatedRows":0,
"identifiers":[
" UNNAMED26",
"ne",
"s"
],
"children":[
{
"operatorType":"VarLengthExpand(All)",
"ExpandExpression":"(ne)-[ UNNAMED26:Connect*]->(s)",
"Rows":34809,
"DbHits":105113,
"EstimatedRows":0,
"identifiers":[
" UNNAMED26",
"ne",
"s"
],
"children":[
{
"operatorType":"NodeIndexSeek",
"Index":":NetworkElement(_type)",
"Rows":35495,
"DbHits":35496,
"EstimatedRows":0.9999999999971109,
"identifiers":[
"ne"
],
"children":[
]
}
]
}
]
}
]
}
]
}
}
I have a unique constraint on NetworkElement.tag and an index on NetworkElement._type.
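One way to compare the two plans is to total the DbHits of every operator. The sketch below is a minimal Python helper, not part of Neo4j: the names total_db_hits and op are made up for illustration, and the plan dictionaries are trimmed to the operatorType, DbHits and children fields taken from the two plans above.

```python
def total_db_hits(plan):
    """Recursively sum DbHits over a PROFILE plan node and its children."""
    return plan.get("DbHits", 0) + sum(
        total_db_hits(child) for child in plan.get("children", [])
    )


def op(name, hits, *children):
    """Hypothetical helper: build a trimmed plan node."""
    return {"operatorType": name, "DbHits": hits, "children": list(children)}


# Plan of the query with literal values (numbers copied from the first plan).
literal_plan = op("EagerAggregation", 1468,
    op("Projection", 1468,
        op("Filter", 3402,
            op("VarLengthExpand(All)", 2269,
                op("NodeUniqueIndexSeek", 1)))))

# Plan of the query with parameters (numbers copied from the second plan).
parameter_plan = op("EagerAggregation", 1468,
    op("Projection", 1468,
        op("Filter", 104427,
            op("VarLengthExpand(All)", 105113,
                op("NodeIndexSeek", 35496)))))

print(total_db_hits(literal_plan))    # 8608
print(total_db_hits(parameter_plan))  # 247972
```

With these numbers the parameterized plan touches the store roughly 29 times more often, mostly because it starts from the _type index (35495 rows) instead of the unique tag index (1 row).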
If more than one index could be used, the plan might look different. You can force the other index lookup (or both) with USING INDEX:
MATCH (ne:NetworkElement)-[:Connect*1..]->(s:NetworkElement)
USING INDEX ne:NetworkElement(_type)
USING INDEX s:NetworkElement(tag)
WHERE ne._type = {endNeType} AND s.tag IN {startTags}
RETURN s.tag, collect(ne.tag)
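To try the hinted query through the REST API used in the question, the request body can be assembled programmatically. This is a minimal Python sketch that only builds and prints the JSON payload; build_payload is an illustrative name, and nothing here contacts a server.

```python
import json

# The hinted query from this answer, with the parameters from the question.
HINTED_QUERY = (
    "MATCH (ne:NetworkElement)-[:Connect*1..]->(s:NetworkElement) "
    "USING INDEX ne:NetworkElement(_type) "
    "USING INDEX s:NetworkElement(tag) "
    "WHERE ne._type = {endNeType} AND s.tag IN {startTags} "
    "RETURN s.tag, collect(ne.tag)"
)


def build_payload(statement, parameters):
    """Wrap one statement and its parameters in the 'statements' envelope."""
    return {"statements": [{"statement": statement, "parameters": parameters}]}


payload = build_payload(
    HINTED_QUERY,
    {"endNeType": "interface", "startTags": ["mytag"]},
)
print(json.dumps(payload, indent=2))
```

Prefixing the statement with PROFILE, as in the question, should show whether the planner honored the hints.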
I am working on transforming a complex JSON document using JOLT.
Input JSON:
{ "data":
[
{
"fieldname": "Name",
"fieldvalue": [ "John Doe" ]
},
{ "fieldname": "Title",
"fieldvalue": [ "Manager" ]
},
{ "fieldname": "Company",
"fieldvalue": [ "Walmart" ]
}
] }
Expected Output:
{
"finalPayload":{
"PI":{
"EmpName":"John Doe",
"EmpRole":"Manager"
},
"Company":"Walmart"
}
}
I am unable to understand how to access "fieldvalue" and assign it in the output based on "fieldname". Please help me with the JOLT spec.
Note: The order of name, title and company in the input JSON is random, meaning the first object in the "data" array is not necessarily the one for "Name".
Hope this helps you resolve your issue.
You can have conditions in Jolt too, by descending into the field and matching on the value of fieldname.
[
{
"operation": "shift",
"spec": {
"data": {
"*": {
"fieldname": {
"Name": {
"@(2,fieldvalue)": "finalPayload.PI.EmpName"
},
"Title": {
"@(2,fieldvalue)": "finalPayload.PI.EmpRole"
},
"Company": {
"@(2,fieldvalue)": "finalPayload.Company"
}
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"finalPayload": {
"PI": {
"EmpName": "ONE",
"EmpRole": "ONE"
},
"Company": "ONE"
}
}
}
]
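For readers unfamiliar with Jolt, the logic of this spec can be mirrored in a few lines of Python. This is only an illustration of the lookup-by-fieldname idea, not Jolt itself; the TARGETS table and the transform function are names made up for the sketch.

```python
# Maps each expected fieldname to its destination path in the output.
TARGETS = {
    "Name": ("finalPayload", "PI", "EmpName"),
    "Title": ("finalPayload", "PI", "EmpRole"),
    "Company": ("finalPayload", "Company"),
}


def transform(doc):
    """Place each fieldvalue at the path chosen by its fieldname."""
    out = {}
    for entry in doc.get("data", []):
        path = TARGETS.get(entry.get("fieldname"))
        values = entry.get("fieldvalue") or []
        if path is None or not values:
            continue  # unknown field name or empty value list: skip it
        node = out
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = values[0]  # keep a single value instead of the list
    return out


# The entries are deliberately out of order to show that order does not matter.
shuffled = {"data": [
    {"fieldname": "Company", "fieldvalue": ["Walmart"]},
    {"fieldname": "Name", "fieldvalue": ["John Doe"]},
    {"fieldname": "Title", "fieldvalue": ["Manager"]},
]}
print(transform(shuffled))
```

Because the destination is chosen by the value of fieldname rather than by array position, the shuffled input yields the same finalPayload as the ordered one.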
May I introduce an alternative library to solve the issue.
https://github.com/octomix/josson
implementation 'com.octomix.josson:josson:1.3.21'
-------------------------------------------------
Josson josson = Josson.fromJsonString(
"{\"data\":[{\"fieldname\":\"Name\",\"fieldvalue\":[\"JohnDoe\"]},{\"fieldname\":\"Title\",\"fieldvalue\":[\"Manager\"]},{\"fieldname\":\"Company\",\"fieldvalue\":[\"Walmart\"]}]}");
JsonNode node = josson.getNode(
"map(" +
" finalPayload: map(" +
" PI: map(" +
" EmpName: data[fieldname='Name'].fieldvalue[0]," +
" EmpRole: data[fieldname='Title'].fieldvalue[0]" +
" )," +
" Company: data[fieldname='Company'].fieldvalue[0]" +
" )" +
")");
System.out.println(node.toPrettyString());
Output
{
"finalPayload" : {
"PI" : {
"EmpName" : "JohnDoe",
"EmpRole" : "Manager"
},
"Company" : "Walmart"
}
}
Situation
I have a JSON document.
I'm trying to grab every element in an array that has some particular nested objects. The hard part is that some of these objects are nested at different depths.
I'm using JayWay JsonPath https://github.com/json-path/JsonPath, and my code works exactly like https://jsonpath.herokuapp.com
This is to use on our platform, https://dashdash.com - a spreadsheet with integrations for known web services (and your private APIs too).
Particular case (testable)
Consider the following source JSON. I want to return only the array elements that have the nested objects B, C and G; G is at a different depth than B and C.
Below you can see the source and 2 options for the return.
source JSON
[
{
"A":"val1",
"B":"val2",
"C":"val3",
"D":{
"E":[
{
"F":"val4"
}
],
"G":[
{
"H":"val5",
"I":"val6",
"J":"val7"
}
]
}
},
{
"A":"val8",
"B":"val9",
"C":"val10",
"D":{
"E":[
{
"F":"val11"
}
],
"G":[
{
"H":"val12",
"I":"val13",
"J":"val14"
}
]
}
},
{
"A":"val15",
"B":"val16"
},
{
"A":"val8",
"B":"val9",
"C":"val10",
"D":{
"E":[
{
"F":"val11"
}
]
}
}
]
Expected return Option 1.
[
{
"B":"val2",
"C":"val3",
"G":[
{
"H":"val5",
"I":"val6",
"J":"val7"
}
]
},
{
"B":"val9",
"C":"val10",
"G":[
{
"H":"val12",
"I":"val13",
"J":"val14"
}
]
}
]
Expected return Option 2.
[
{
"B":"val2",
"C":"val3",
"D":{
"E":[
{
"F":"val4"
}
],
"G":[
{
"H":"val5",
"I":"val6",
"J":"val7"
}
]
}
},
{
"B":"val9",
"C":"val10",
"D":{
"E":[
{
"F":"val11"
}
],
"G":[
{
"H":"val12",
"I":"val13",
"J":"val14"
}
]
}
}
]
Where I am
I can extract all the array elements that have B,C and D, with the query $..['B','C','D']
I have tried to extract B, C and G, but all the following queries fail:
$..['B','C','G']: returns null.
$..['B','C',['D'].['G']]: returns only the objects inside G.
Again, I'm using JayWay JsonPath https://github.com/json-path/JsonPath, and my code works exactly like https://jsonpath.herokuapp.com.
Thanks in advance
You can solve this by enabling the DEFAULT_PATH_LEAF_TO_NULL option in JayWay JsonPath (as described in the official documentation: https://github.com/json-path/JsonPath) and then filtering with null comparisons,
like this:
$.[?(@.A != null && @.B != null && @.D != null && @.D.G != null)]
or this:
$.[?((@.A != null && @.B != null) && ((@.D != null && @.D.G != null) || (@.G != null)))]
To set DEFAULT_PATH_LEAF_TO_NULL, change your default configuration:
import com.jayway.jsonpath.Configuration;
import com.jayway.jsonpath.Option;
Configuration conf = Configuration.defaultConfiguration();
Configuration conf2 = conf.addOptions(Option.DEFAULT_PATH_LEAF_TO_NULL);
Note: If you are using a legacy version of JayWay JsonPath, the comparison operators may not work correctly. For more information, see https://code.google.com/archive/p/json-path/issues/27
I tested this solution and it worked for me.
The test was done on https://jsonpath.herokuapp.com/ with the following input:
[
{
"A":"val1",
"B":"val2",
"C":"val3",
"D":{
"E":[
{
"F":"val4"
}
],
"G":[
{
"H":"val5",
"I":"val6",
"J":"val7"
}
]
}
},
{
"A":"val8",
"B":"val9",
"C":"val10",
"D":{
"E":[
{
"F":"val11"
}
],
"G":[
{
"H":"val12",
"I":"val13",
"J":"val14"
}
]
}
},
{
"A":"val15",
"B":"val16"
},
{
"A":"val8",
"B":"val9",
"C":"val10",
"D":{
"E":[
{
"F":"val11"
}
]
}
}
]
and the result was:
[
{
"A" : "val1",
"B" : "val2",
"C" : "val3",
"D" : {
"E" : [
{
"F" : "val4"
}
],
"G" : [
{
"H" : "val5",
"I" : "val6",
"J" : "val7"
}
]
}
},
{
"A" : "val8",
"B" : "val9",
"C" : "val10",
"D" : {
"E" : [
{
"F" : "val11"
}
],
"G" : [
{
"H" : "val12",
"I" : "val13",
"J" : "val14"
}
]
}
}
]
Note that the option to return null for missing leaves was set to true for this test.
Let me know if you need any further assistance on this.
I've been trying some different approaches and I think a simpler expression does the trick:
$.*[?(@.B && @.C && @.D.G)]
This doesn't need any special configuration beyond the default (according to an experiment done on https://jsonpath.herokuapp.com) and yields the following result:
[
{
"A" : "val1",
"B" : "val2",
"C" : "val3",
"D" : {
"E" : [
{
"F" : "val4"
}
],
"G" : [
{
"H" : "val5",
"I" : "val6",
"J" : "val7"
}
]
}
},
{
"A" : "val8",
"B" : "val9",
"C" : "val10",
"D" : {
"E" : [
{
"F" : "val11"
}
],
"G" : [
{
"H" : "val12",
"I" : "val13",
"J" : "val14"
}
]
}
}
]
What do you think?
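The same selection, and the trimmed shape of Option 1 from the question, can be expressed in plain Python for comparison. This sketch does not use JsonPath; has_b_c_g and to_option_1 are illustrative names, and the sample list is a shortened version of the source JSON.

```python
def has_b_c_g(element):
    """True when the element has B and C at the top level and G under D."""
    d = element.get("D")
    return "B" in element and "C" in element and isinstance(d, dict) and "G" in d


def to_option_1(element):
    """Flatten a matching element to the B, C, G shape of Option 1."""
    return {"B": element["B"], "C": element["C"], "G": element["D"]["G"]}


source = [
    {"A": "val1", "B": "val2", "C": "val3",
     "D": {"E": [{"F": "val4"}],
           "G": [{"H": "val5", "I": "val6", "J": "val7"}]}},
    {"A": "val15", "B": "val16"},
    {"A": "val8", "B": "val9", "C": "val10", "D": {"E": [{"F": "val11"}]}},
]

matches = [e for e in source if has_b_c_g(e)]
option_1 = [to_option_1(e) for e in matches]
print(option_1)
```

The filter step corresponds to what the JsonPath expressions in this thread return (whole elements, as in Option 2 but with A included); the projection step is an extra that produces the Option 1 shape.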
I have a projection field computed from some conditions in the current document. The native Mongo query works fine, but I can't implement the query with Java driver 3.4. Only Java driver 3.4 syntax is relevant.
The projection code for the field computed by the switch is:
"SITUACAO": {
"$switch" : {
"branches": [
{ case: {"$eq": ["$ID_STATUSMATRICULA", 0]},
then: {
"$switch" : {
"branches": [
{ case: {"$and": [{"$eq": ["$NR_ANDAMENTO", 0 ] },
{"$eq": ["$ID_STATUSMATRICULA", 0]} ] }, then: "NAOINICIADO" },
{ case: {"$and": [{"$gt": ["$NR_ANDAMENTO", 0]},
{"$lte": ["$NR_ANDAMENTO", 100]},
{"$eq": ["$ID_STATUSMATRICULA", 0]} ] }, then: "EMANDAMENTO" }
],
"default": "--matriculado--"
}
}
},
{ case: {"$eq": ["$ID_STATUSMATRICULA", 1]},
then: {
"$switch" : {
"branches": [
{ case: {"$and": [ {"$eq": ["$ID_STATUSMATRICULA", 1]},
{"$in": ["$ID_STATUSAPROVEITAMENTO", [1] ]} ] }, then: "APROVADO" },
{ case: {"$and": [ {"$eq": ["$ID_STATUSMATRICULA", 1]},
{"$in": ["$ID_STATUSAPROVEITAMENTO", [2] ]} ] }, then: "REPROVADO" },
{ case: {"$and": [{"$eq": ["$ID_STATUSMATRICULA", 1]},
{"$in": ["$ID_STATUSAPROVEITAMENTO", [0] ]} ] }, then: "PENDENTE" },
{ case: {"$and": [ {"$eq": ["$ID_STATUSMATRICULA", 1]},
{"$in": ["$ID_STATUSAPROVEITAMENTO", [1,2] ]} ] }, then: "CONCLUIDO" }
],
"default": "--concluido--"
}
}
}
],
"default": "--indefinida--"
}
}
The $and part inside the case statements I can build like this:
List<Document> docs = new ArrayList<>();
docs.add( new Document("$eq", asList("$NR_ANDAMENTO", 0)) );
docs.add( new Document("$eq", asList("$ID_STATUSMATRICULA", 1)) );
Document doc = new Document("$and", docs);
but I find it difficult to work out how to write the $switch / branches[] / case ... structure.
Does anybody have an example like this, or some idea how to write it?
Thanks
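The nesting itself is regular: $switch holds a document with a branches list and a default, and each branch is a document with the plain keys case and then. The Python sketch below builds a shortened version of that structure with hypothetical helpers (branch, switch, eq). It is not Java driver code; with the Java driver, the same shape could presumably be produced by nesting new Document(...) and asList(...) in the same way as the $and example above, which is an assumption that is not tested here.

```python
def eq(field, value):
    """Build an {"$eq": [field, value]} expression."""
    return {"$eq": [field, value]}


def branch(case, then):
    """One entry of the branches list: plain 'case' and 'then' keys."""
    return {"case": case, "then": then}


def switch(branches, default):
    """Wrap a list of branches and a default value in a $switch document."""
    return {"$switch": {"branches": list(branches), "default": default}}


not_started = branch(
    {"$and": [eq("$NR_ANDAMENTO", 0), eq("$ID_STATUSMATRICULA", 0)]},
    "NAOINICIADO",
)
in_progress = branch(
    {"$and": [{"$gt": ["$NR_ANDAMENTO", 0]},
              {"$lte": ["$NR_ANDAMENTO", 100]},
              eq("$ID_STATUSMATRICULA", 0)]},
    "EMANDAMENTO",
)

# Outer switch with only the first branch of the original, for brevity.
situacao = switch(
    [branch(eq("$ID_STATUSMATRICULA", 0),
            switch([not_started, in_progress], "--matriculado--"))],
    "--indefinida--",
)
print(situacao)
```

The point of the helpers is that a nested $switch is just another value passed as then, so the second outer branch (ID_STATUSMATRICULA equal to 1) would be added the same way.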
I am so tired of trying to split the data to get my expected output, but I have not been able to get it. I have tried all the filters and tokenizers.
I have updated the settings in Elasticsearch as given below.
{
"settings": {
"analysis": {
"filter": {
"filter_word_delimiter": {
"preserve_original": "true",
"type": "word_delimiter"
}
},
"analyzer": {
"en_us": {
"tokenizer": "keyword",
"filter": [ "filter_word_delimiter","lowercase" ]
}
}
}
}
}
Executed query:
curl -XGET "XX.XX.XX.XX:9200/keyword/_analyze?pretty=1&analyzer=en_us" -d 'DataGridControl'
Result:
{
"tokens" : [ {
"token" : "datagridcontrol",
"start_offset" : 0,
"end_offset" : 16,
"type" : "word",
"position" : 1
}, {
"token" : "data",
"start_offset" : 0,
"end_offset" : 4,
"type" : "word",
"position" : 1
}, {
"token" : "grid",
"start_offset" : 4,
"end_offset" : 8,
"type" : "word",
"position" : 2
}, {
"token" : "control",
"start_offset" : 9,
"end_offset" : 16,
"type" : "word",
"position" : 3
} ]
}
Expected result:
DataGridControl
DataGrid
DataControl
Data
grid
control
Which tokenizer and filters should I add to the index settings?
Any help?
Try this:
{
"settings": {
"analysis": {
"filter": {
"filter_word_delimiter": {
"type": "word_delimiter"
},
"custom_shingle": {
"type": "shingle",
"token_separator":"",
"max_shingle_size":3
}
},
"analyzer": {
"en_us": {
"tokenizer": "keyword",
"filter": [
"filter_word_delimiter",
"custom_shingle",
"lowercase"
]
}
}
}
}
}
and let me know if it gets you any closer.
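To see which tokens this combination can and cannot produce, here is a rough Python model of the two filters. It is a simplification written for this thread, not Elasticsearch's implementation: split_on_case only handles case changes and digits, and shingles joins adjacent tokens with an empty separator. Both function names are made up for the sketch.

```python
import re


def split_on_case(text):
    """Split on case changes, e.g. 'DataGridControl' -> Data, Grid, Control."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)


def shingles(tokens, max_size=3, separator=""):
    """Return every run of 1..max_size adjacent tokens, joined by separator."""
    out = []
    for start in range(len(tokens)):
        for size in range(1, max_size + 1):
            if start + size <= len(tokens):
                out.append(separator.join(tokens[start:start + size]))
    return out


tokens = [t.lower() for t in shingles(split_on_case("DataGridControl"))]
print(tokens)
# ['data', 'datagrid', 'datagridcontrol', 'grid', 'gridcontrol', 'control']
```

Because shingles only combine adjacent tokens, datagrid and gridcontrol appear in this model, but a non-adjacent combination such as datacontrol does not.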
I'm trying to combine multiple queries in Elasticsearch using a boolean query, but the result is not what I'm expecting. For example:
If I have the following documents (among others):
DOC 1:
{
"name":"Iphone 5",
"product_suggestions":{
"input":[
"iphone 5",
"apple"
]
},
"description":"Iphone 5 - The almost last version",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"2",
"user_state_description":"Almost New",
"product_type_id":"1",
"current_price":350,
"finish_date":"2014/06/20 14:12",
"finish_date_ms":1403273520
}
DOC 2:
{
"name":"Apple II Lisa",
"product_suggestions":{
"input":[
"apple ii lisa",
"apple"
]
},
"description":"Make a offer and I Apple II Lisa!!",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"2",
"user_state_description":"Used",
"product_type_id":"1",
"current_price":150,
"finish_date":"2014/06/15 16:12",
"finish_date_ms":1402848720
}
DOC 3:
{
"name":"Iphone 5s",
"product_suggestions":{
"input":[
"iphone 5s",
"apple"
]
},
"description":"Iphone 5s 32Gb like new with a few scratches bla bla bla",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"1",
"user_state_description":"New",
"product_type_id":"2",
"current_price":510.1,
"finish_date":"2014/06/10 14:12",
"finish_date_ms":1402409520
}
DOC 4:
{
"name":"Iphone 4s",
"product_suggestions":{
"input":[
"iphone 4s",
"apple"
]
},
"description":"Iphone 4s 16Gb Mint conditions and unlocked to all network",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"1",
"user_state_description":"Almost New",
"product_type_id":"2",
"current_price":385,
"finish_date":"2014/06/12 16:12",
"finish_date_ms":1402589520
}
If I run the following query (get all documents and facets matching the keyword "Apple" whose finish_date_ms is greater than 1402869581):
{
"from" : 1,
"size" : 20,
"query" : {
"bool" : {
"must" : {
"query_string" : {
"query" : "apple",
"default_operator" : "and",
"analyze_wildcard" : true
}
},
"must_not" : {
"range" : {
"finish_date_ms" : {
"from" : null,
"to" : 1402869581,
"include_lower" : true,
"include_upper" : false
}
}
}
}
},
"facets" : {
"brand" : {
"terms" : {
"field" : "brand_facet",
"size" : 10
}
},
"product_type_id" : {
"terms" : {
"field" : "product_type_id",
"size" : 10
}
},
"state_id" : {
"terms" : {
"field" : "state_id",
"size" : 10
}
}
}
}
This returns:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.18392482,
"hits":[
]
},
"facets":{
"brand":{
"_type":"terms",
"missing":0,
"total":1,
"other":0,
"terms":[
{
"term":"Apple",
"count":1
}
]
},
"product_type_id":{
"_type":"terms",
"missing":0,
"total":1,
"other":0,
"terms":[
{
"term":1,
"count":1
}
]
},
"state_id":{
"_type":"terms",
"missing":0,
"total":1,
"other":0,
"terms":[
{
"term":2,
"count":1
}
]
}
}
}
It should return only DOC 1. If I remove the range query, it returns all the documents that contain the word Apple. If I remove the "term" query, no documents are returned, so I presume the problem is in the range query.
Can anyone point me in the right direction with this?
One other important thing: this query is to be implemented in Java (if that helps).
Thanks!
(Sorry for this huge post.)
I found my mistake (a newbie mistake, to be honest).
The problem was not in the range query but at the beginning of the JSON: the from field is set to 1, but the result contains only one record. Since from is a zero-based offset, it should be 0.
Thanks for everything!!
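The effect is easy to reproduce outside Elasticsearch, since from is a zero-based offset into the matching hits. The sketch below is a plain Python stand-in (the page function is hypothetical), using the single matching document from the question.

```python
def page(hits, from_, size):
    """Return the slice of hits that a from/size request would show."""
    return hits[from_:from_ + size]


matching = ["DOC 1"]  # the only document that passes the range filter

print(page(matching, 1, 20))  # [] (the first and only hit is skipped)
print(page(matching, 0, 20))  # ['DOC 1']
```

This matches the response in the question: total is 1, yet the hits array is empty.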