I want to read objects from the main array, filtering by refPath, using the Jayway Java implementation of JsonPath.
My input looks like this:
[
{
"2be3d660-cab0-4db8-83b9-1baf212270c5" : {
"refPath" : [
"e0586818-ba2c-4b65-afec-3c48d817b584",
"06c089a6-4de0-43d3-8dc7-181addf4c933",
"d5413a18-ac33-426c-982d-bb25ce4e4bf6"
],
"elementId" : "12c5750e-9753-43f1-8987-9dfc3a830bbe",
"modified" : false
},
"191b1bab-c269-495f-ac4f-8b4d30df95a1" : {
"refPath" : [
"e0586818-ba2c-4b65-afec-3c48d817b584",
"f7df7cff-bf6d-49da-bc44-90d61f233d3b"
],
"elementId" : "04691514-566b-47ef-8f69-e31884bde7b2",
"modified" : false
},
"6a2acd79-135f-4688-9219-158f91d9c6cf" : {
"refPath" : [
"e0586818-ba2c-4b65-afec-3c48d817b584",
"f5177f79-e2f1-4419-b46a-7d4cc1c4fae5"
],
"elementId" : "04691514-566b-47ef-8f69-e31884bde7b2",
"modified" : false
}
}
]
I want to find all objects whose refPath contains both of these values: "e0586818-ba2c-4b65-afec-3c48d817b584" and "06c089a6-4de0-43d3-8dc7-181addf4c933".
So my expected result from JsonPath looks like:
[
{
"2be3d660-cab0-4db8-83b9-1baf212270c5" : {
"refPath" : [
"e0586818-ba2c-4b65-afec-3c48d817b584",
"06c089a6-4de0-43d3-8dc7-181addf4c933",
"d5413a18-ac33-426c-982d-bb25ce4e4bf6"
],
"elementId" : "12c5750e-9753-43f1-8987-9dfc3a830bbe",
"modified" : false
}
}
]
Even if I only try to find "e0586818-ba2c-4b65-afec-3c48d817b584", I get the error message "Could not determine value type".
Does anybody have an idea what the JsonPath expression must look like for this?
Use the subsetof filter operator:
$[*][*][?(['e0586818-ba2c-4b65-afec-3c48d817b584','06c089a6-4de0-43d3-8dc7-181addf4c933'] subsetof @.refPath)]
Note that the output will not include the wrapping key 2be3d660-cab0-4db8-83b9-1baf212270c5:
[
{
"refPath" : [
"e0586818-ba2c-4b65-afec-3c48d817b584",
"06c089a6-4de0-43d3-8dc7-181addf4c933",
"d5413a18-ac33-426c-982d-bb25ce4e4bf6"
],
"elementId" : "12c5750e-9753-43f1-8987-9dfc3a830bbe",
"modified" : false
}
]
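For intuition, `subsetof` succeeds when the literal list on its left is fully contained in the array on its right. In plain Java (a sketch of the semantics only, not Jayway's implementation) that is just `List.containsAll`:

```java
import java.util.List;

// Plain-Java illustration of what the `subsetof` operator checks:
// every element of the left-hand literal list must occur in the
// array selected on the right (here, an object's refPath array).
public class RefPathFilter {
    // True when refPath contains every wanted id, mirroring
    // `['id1','id2'] subsetof @.refPath`.
    public static boolean matches(List<String> wanted, List<String> refPath) {
        return refPath.containsAll(wanted);
    }
}
```

So an element matches exactly when its refPath array contains every id listed in the filter, regardless of order or extra elements.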
Having a document with this format:
{
"_id" : ObjectId("59ce3bb32708c95ee2168e2f"),
"document1" : [
{
"value" : "doc1A"
},
{
"value" : "doc1B"
},
{
"value" : "doc1C"
},
{
"value" : "doc1D"
},
{
"value" : "doc1E"
},
{
"value" : "doc1F"
}
],
"document2" : [
{
"value" : "doc2A"
},
{
"value" : "doc2B"
},
{
"value" : "doc2C"
},
{
"value" : "doc2D"
}
],
"metric1" : 0.0,
"metric2" : 0.0
}
I need to group by the concatenation of the document1 and document2 values and perform some calculations on the groups with the Aggregation Framework in Java.
I can do group(document1, document2), but then the _id is an array; I want it as a concatenation, like:
doc1A (doc2A) / doc1A (doc2B) / doc1A (doc2C) ...
Do you have any idea?
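One way to get that (a sketch in mongo shell syntax; the stage choices are my assumption, only the field names come from the document above) is to $unwind both arrays and then $group on a $concat of the two values:

```js
db.collection.aggregate([
  // one document per (document1 element, document2 element) pair
  { "$unwind": "$document1" },
  { "$unwind": "$document2" },
  // the group key is the concatenated label, e.g. "doc1A (doc2A)"
  { "$group": {
      "_id": { "$concat": [ "$document1.value", " (", "$document2.value", ")" ] },
      "count": { "$sum": 1 }
  } }
])
```

In the Java driver the same pipeline can be built with Aggregates.unwind and Aggregates.group, passing the $concat expression as a Document.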
A few days ago I ran into strange behavior of geo search in Elasticsearch.
I use AWS-managed ES 5.5, accessed over the REST interface.
Assume we have 200k objects whose location info is represented as a point only. I use geo search to find the points that lie within multiple polygons, shown on the image below. The coordinates were extracted from the final request sent to ES.
The request is built with the official Java high-level REST client; the resulting query is attached below.
I want to search for all objects that lie within at least one polygon.
Here is the query (real field names and values were replaced by stubs, except location and locationPoint.coordinates):
{
"size" : 20,
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"must" : [
{
"terms" : {
"field1" : [
"a",
"b",
"c",
"d",
"e",
"f"
],
"boost" : 1.0
}
},
{
"term" : {
"field2" : {
"value" : "q",
"boost" : 1.0
}
}
},
{
"range" : {
"field3" : {
"from" : "10",
"to" : null,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
},
{
"range" : {
"field4" : {
"from" : "10",
"to" : null,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
},
{
"geo_shape" : {
"location" : {
"shape" : {
"type" : "geometrycollection",
"geometries" : [
{
"type" : "multipolygon",
"orientation" : "right",
"coordinates" : [
[
// coords here
]
]
},
{
"type" : "polygon",
"orientation" : "right",
"coordinates" : [
[
// coords here
]
]
},
{
"type" : "polygon",
"orientation" : "right",
"coordinates" : [
[
// coords here
]
]
},
{
"type" : "polygon",
"orientation" : "right",
"coordinates" : [
[
// coords here
]
]
}
]
},
"relation" : "intersects"
},
"ignore_unmapped" : false,
"boost" : 1.0
}
}
]
}
},
"boost" : 1.0
}
},
"_source" : {
"includes" : [
"field1",
"field2",
"field3",
"field4",
"field8"
],
"excludes" : [ ]
},
"sort" : [
{
"field1" : {
"order" : "desc"
}
}
],
"aggregations" : {
"agg1" : {
"terms" : {
"field" : "field1",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg2" : {
"terms" : {
"field" : "field2",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg3" : {
"terms" : {
"field" : "field3",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg4" : {
"terms" : {
"field" : "field4",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg5" : {
"terms" : {
"field" : "field5",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg6" : {
"terms" : {
"field" : "field6",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg7" : {
"terms" : {
"field" : "field7",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"agg8" : {
"terms" : {
"field" : "field8",
"size" : 10000,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"_count" : "desc"
},
{
"_term" : "asc"
}
]
}
},
"map_center" : {
"geo_centroid" : {
"field" : "locationPoint.coordinates"
}
},
"map_bound" : {
"geo_bounds" : {
"field" : "locationPoint.coordinates",
"wrap_longitude" : true
}
}
}
}
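As an aside (an assumption on my part, not part of the original request): "within at least one polygon" can equivalently be written as one geo_shape clause per polygon inside a bool/should, which makes it easy to count each polygon's hits separately when debugging:

```json
{
  "bool" : {
    "should" : [
      { "geo_shape" : { "location" : { "shape" : { "type" : "polygon", "coordinates" : [ [ /* coords here */ ] ] }, "relation" : "intersects" } } },
      { "geo_shape" : { "location" : { "shape" : { "type" : "polygon", "coordinates" : [ [ /* coords here */ ] ] }, "relation" : "intersects" } } }
    ],
    "minimum_should_match" : 1
  }
}
```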
Note that the field location is mapped as geo_shape and locationPoint.coordinates is mapped as geo_point.
So the problem is the following. Below are the hit counts of several requests; only the polygons change.
# Polygons Hits count
1) 1,2,3,4 5565
2) 1 4897
3) 3,4 75
4) 2 9
5) 1,3,4 5543
6) 1,2 5466
7) 2,3,4 84
So if I add the results for polygon 1 to those for polygons 2, 3, 4, I do not obtain the number from the full request.
For example, #1 != #2 + #7 and #1 != #5 + #4, but #7 == #4 + #3.
I cannot understand whether this is an issue with the request, expected behavior, or even a bug in ES.
Can anyone help me understand the logic behind this ES behavior, or point me to a solution?
Thanks!
After a short conversation with an Elasticsearch team member, we came around to AWS.
The build hashes of AWS ES and vanilla ES are not equal, so ES has been modified by the AWS team and we do not know the exact changes. Some of those changes might affect the search in the posted question.
I need to reproduce this behavior on a vanilla ES cluster before we can continue the conversation.
I use kafka_2.11-0.9.0.1 and have tried two versions of the JSON config file. I can get JVM info such as heap memory and GC metrics.
But when I try to get Kafka metrics, nothing comes out; the jmxtrans log shows no results.
Here are the two versions of the JSON files I used. The first is:
{
"servers" : [ {
"port" : "9999",
"host" : "localhost",
"queries" : [ {
"outputWriters" : [ {
"#class" : "com.googlecode.jmxtrans.model.output.StdOutWriter",
"settings" : {
}
} ],
"obj" : "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=test",
"attr" : [ "Count"]
},{
"outputWriters" : [ {
"#class" : "com.googlecode.jmxtrans.model.output.StdOutWriter",
"settings" : {
}
} ],
"obj" : "kafka.server:type=BrokerTopicMetrics,name=*",
"resultAlias": "Kafka",
"attr" : [ "Count","OneMinuteRate"]
}
],
"numQueryThreads" : 2
} ]
}
The other is:
{
"outputWriters" : [ {
"#class" : "com.googlecode.jmxtrans.model.output.KeyOutWriter",
"settings" : {
"outputFile" : "testowo-counts3.txt",
"maxLogFileSize" : "10MB",
"maxLogBackupFiles" : 200,
"delimiter" : "\t",
"debug" : true
}
} ],
"obj": "\"kafka.network\":type=\"RequestMetrics\",name=\"Produce-RequestsPerSec\"",
"resultAlias": "produce",
"attr": [
"Count",
"OneMinuteRate"
]
} ,{
"outputWriters" : [ {
"#class" : "com.googlecode.jmxtrans.model.output.KeyOutWriter",
"settings" : {
"outputFile" : "testowo-gc.txt",
"maxLogFileSize" : "10MB",
"maxLogBackupFiles" : 200,
"delimiter" : "\t",
"debug" : true
}
} ],
"obj": "java.lang:type=GarbageCollector,name=*",
"resultAlias": "GC",
"attr": [
"CollectionCount",
"CollectionTime"
]
}
This is a version problem. I recommend using jconsole to inspect the MBeans tree; it helps a lot.
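For context: Kafka dropped the quoted MBean-name style ("kafka.network":type="RequestMetrics",...) around 0.8.2, so the second config targets beans that no longer exist on 0.9. On 0.9 the produce-request metric should look roughly like the query below (a sketch; confirm the exact name in jconsole as suggested):

```json
{
  "obj" : "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=Produce",
  "resultAlias" : "produce",
  "attr" : [ "Count", "OneMinuteRate" ]
}
```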
I have the following document structure:
{
"_id" : "4e76fd1e927e1c9127d1d2e8",
"name" : "***",
"embedPhoneList" : [
{
"type" : "家庭",
"number" : "00000000000"
},
{
"type" : "手机",
"number" : "00000000000"
}
],
"embedAddrList" : [
{
"type" : "家庭",
"addr" : "山东省诸城市***"
},
{
"type" : "工作",
"addr" : "深圳市南山区***"
}
],
"embedEmailList" : [
{
"email" : "********@gmail.com"
},
{
"email" : "********@gmail.com"
},
{
"email" : "********@gmail.com"
},
{
"email" : "********@gmail.com"
}
]
}
What I want to do is find the document by one of its sub-documents, such as an email in the embedEmailList field.
Or, if I have a structure like this:
{
"_id" : "4e76fd1e927e1c9127d1d2e8",
"name" : "***",
"embedEmailList" : [
"123#gmail.com" ,
"********#gmail.com" ,
]
}
Here embedEmailList is a plain array; how do I find whether it contains 123@gmail.com?
Thanks.
To search for a specific value in an array, MongoDB supports this syntax:
db.your_collection.find({embedEmailList : "foo@bar.com"});
See the MongoDB documentation on querying arrays for more information.
To search for a value inside an embedded document within an array, it supports dot notation:
db.your_collection.find({"embedEmailList.email" : "foo@bar.com"});
See the MongoDB documentation on querying embedded documents for more information.
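If you later need to match several fields of the same array element together (say, one phone entry's type and number), $elemMatch does that; a sketch against the document above (the type value here is illustrative):

```js
db.your_collection.find({
  "embedPhoneList" : { "$elemMatch" : { "type" : "home", "number" : "00000000000" } }
});
```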