The query below filters and aggregates. It works from Postman, and I need to translate it into Java using the Java client API; I am using the REST high-level client as the Elasticsearch client. I tried the Java code below, but the query it generates differs from the working one:
BoolQueryBuilder booleanQuery = QueryBuilders.boolQuery();
booleanQuery.filter(QueryBuilders
.queryStringQuery(String.join(" OR ", exactMatchThese))
.field("events.recommendationData.exceptionId"));
QueryBuilder queryBuilder = QueryBuilders.nestedQuery("events.recommendationData", booleanQuery, ScoreMode.None);
The search query that works:
GET <index-name>/_search
{
"query": {
"bool": {
"filter": [
{
"nested": { --> note
"path": "events.recommendationData",
"query": {
"query_string": {
"query": "\"1\" OR \"2\"",
"fields": [
"events.recommendationData.exceptionId"
],
"type": "best_fields",
"default_operator": "or",
"max_determinized_states": 10000,
"enable_position_increments": true,
"fuzziness": "AUTO",
"fuzzy_prefix_length": 0,
"fuzzy_max_expansions": 50,
"phrase_slop": 0,
"escape": false,
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 1
}
}
}
}
]
}
},
"size": 1,
"aggs": {
"genres": {
"nested": {
"path": "events.recommendationData.recommendations"
},
"aggs": {
"nested_comments_recomms": {
"terms": {
"field": "events.recommendationData.recommendations.recommendationType"
}
}
}
}
}
}
Below is the search query generated by the Java code above; it does not work:
{
"query": {
"nested": {
"query": {
"bool": {
"filter": [
{
"query_string": {
"query": "\"1\" OR \"2\"",
"fields": [
"events.recommendationData.exceptionId^1.0"
],
"type": "best_fields",
"default_operator": "or",
"max_determinized_states": 10000,
"enable_position_increments": true,
"fuzziness": "AUTO",
"fuzzy_prefix_length": 0,
"fuzzy_max_expansions": 50,
"phrase_slop": 0,
"escape": false,
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"path": "events.recommendationData",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1
}
},
"aggregations": {
"recommendationTypes": {
"terms": {
"field": "events.recommendationData.recommendations.recommendationType",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
}
}
}
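The working query places the nested query inside the filter clause of the bool query, whereas your code wraps the bool query inside the nested query. Build it from the inside out. Your innermost query block is the query string, i.e.
QueryStringQueryBuilder queryString = QueryBuilders
.queryStringQuery(String.join(" OR ", exactMatchThese))
.field("events.recommendationData.exceptionId");
This is the query part of the nested query, so we create a nested query and pass the query string to it, as written below:
NestedQueryBuilder nestedQuery = QueryBuilders
.nestedQuery("events.recommendationData", queryString, ScoreMode.None);
Finally, add the nested query to the filter clause of a bool query:
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery().filter(nestedQuery);
All together:
// Innermost: query_string restricted to the exceptionId field
QueryStringQueryBuilder queryString = QueryBuilders
.queryStringQuery(String.join(" OR ", exactMatchThese))
.field("events.recommendationData.exceptionId");
// Wrap it in a nested query on the nested path
NestedQueryBuilder nestedQuery = QueryBuilders
.nestedQuery("events.recommendationData", queryString, ScoreMode.None);
// Put the nested query into the bool filter clause
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery().filter(nestedQuery);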
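The working request quotes each ID in the query string ("\"1\" OR \"2\""). The question does not show how exactMatchThese is built, so as an assumption: if the list holds raw IDs rather than pre-quoted ones, a small helper along these lines could produce the same string. The class and method names here are hypothetical and use only the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ExactMatchQueryString {

    // Wraps each value in double quotes (escaping backslashes and embedded
    // quotes first) and joins the quoted values with " OR ".
    static String quoteAndJoin(List<String> values) {
        return values.stream()
                .map(v -> "\"" + v.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("1", "2");
        System.out.println(quoteAndJoin(ids)); // prints: "1" OR "2"
    }
}
```

The result could then be passed to QueryBuilders.queryStringQuery(...) in place of String.join(" OR ", exactMatchThese).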
Related
I am using Java to perform queries on Elasticsearch, via the ElasticSearchClient. Because the returned variables are large, I would like to retrieve only the relevant ones, but the variables in _source are nested.
Below is a sample index response (multiple indexes can be returned with the same _source structure):
[
{
"_index": "kn-tas-20200630",
"_type": "_doc",
"_id": "1122334455",
"_score": null,
"_source": {
"variables": [
{
"rawValue": "DEFH",
"name": "MANAGER"
},
{
"rawValue": "ABCD",
"name": "EMPLOYEE"
},
{
"rawValue": "[{\"rowId\":102030,\"rowType\":\"SIM\"}]",
"name": "extData"
}
]
},
"sort": [
1665735632119
]
}
]
I would like to create a query using SearchSourceBuilder to query ES and only retrieve the following:
Get the rawValue by name (I provide MANAGER, I get "DEFH")
Get the rowType value (I provide extData + rowType, I get "SIM")
Below is my query:
{
"from": 0,
"size": 100,
"query": {
"bool": {
"must": [
{
"terms": {
"prcKey": [
"K-112"
],
"boost": 1.0
}
}
],
"must_not": [
{
"exists": {
"field": "endDate",
"boost": 1.0
}
},
{
"term": {
"personInCharge": {
"value": "ABC",
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"_source": {
"includes": [
"variables.name",
"variables.rawValue"
],
"excludes": []
},
"sort": [
{
"createTime": {
"order": "desc"
}
}
]
}
How can I fix my query? I tried using nested queries but without any luck.
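Source filtering selects fields by path, not individual array elements by value, so the _source includes above still return every entry of variables. One possible approach, sketched here under assumptions, is to pick the entry out on the client side. The sketch assumes the hit's source is available as a Map (for example from SearchHit#getSourceAsMap() in the high-level client); the class name, method names, and sample data are hypothetical. The extData rawValue is itself a JSON string, so reading rowType from it would additionally need a JSON parser, which is not shown.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class VariableLookup {

    // Returns the rawValue of the first entry in source.variables whose
    // name equals the given name, or an empty Optional if none matches.
    @SuppressWarnings("unchecked")
    static Optional<String> rawValueByName(Map<String, Object> source, String name) {
        Object variables = source.get("variables");
        if (!(variables instanceof List)) {
            return Optional.empty();
        }
        for (Object item : (List<Object>) variables) {
            if (item instanceof Map) {
                Map<String, Object> variable = (Map<String, Object>) item;
                if (name.equals(variable.get("name"))) {
                    return Optional.ofNullable(variable.get("rawValue")).map(Object::toString);
                }
            }
        }
        return Optional.empty();
    }

    // Hypothetical stand-in for a hit's source map, mirroring the sample document.
    static Map<String, Object> sampleSource() {
        Map<String, Object> source = new HashMap<>();
        source.put("variables", Arrays.asList(
                variable("MANAGER", "DEFH"),
                variable("EMPLOYEE", "ABCD")));
        return source;
    }

    private static Map<String, Object> variable(String name, String rawValue) {
        Map<String, Object> v = new HashMap<>();
        v.put("name", name);
        v.put("rawValue", rawValue);
        return v;
    }

    public static void main(String[] args) {
        // prints: DEFH
        System.out.println(rawValueByName(sampleSource(), "MANAGER").orElse("not found"));
    }
}
```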
I have created a composite query for aggregating on 2 different attributes as below -
{
"from": 0,
"size": 0,
"query": {
"bool": {
"must": [
{
"nested": {
"query": {
"script": {
"script": {
"source": "params.territoryIds.contains(doc['territoryHierarchy.id'].value) ",
"lang": "painless",
"params": {
"territoryIds": [
12345678
]
}
},
"boost": 1.0
}
},
"path": "territoryHierarchy",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1.0
}
},
{
"bool": {
"should": [
{
"nested": {
"query": {
"script": {
"script": {
"source": "doc['forecastHeaders.id'].value == params.id && doc['forecastHeaders.revenueCategory'].value == params.revenueCategory ",
"lang": "painless",
"params": {
"revenueCategory": 0,
"id": 987654321
}
},
"boost": 1.0
}
},
"path": "forecastHeaders",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1.0
}
},
{
"nested": {
"query": {
"script": {
"script": {
"source": "doc['forecastHeaders.id'].value == params.id && doc['forecastHeaders.revenueCategory'].value == params.revenueCategory ",
"lang": "painless",
"params": {
"revenueCategory": 0,
"id": 987654321
}
},
"boost": 1.0
}
},
"path": "forecastHeaders",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1.0
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
{
"terms": {
"revnWinProbability": [
40,
50
],
"boost": 1.0
}
},
{
"terms": {
"revenueStatus.keyword": [
"OPEN"
],
"boost": 1.0
}
},
{
"range": {
"recordUpdateTime":{
"gte":1655117440000
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"version": true,
"aggregations": {
"TopLevelAggregation": {
"composite" : {
"size" : 10000,
"sources" : [
{
"directs": {
"terms": {
"script": {
"source": "def territoryNamesList = new ArrayList(); def name; def thLength = params._source.territoryHierarchy.length; for(int i = 0; i< thLength;i++) { def thRecord = params._source.territoryHierarchy[i]; if (params.territoryIds.contains(thRecord.id) && i+params.levelToReturn < thLength) { territoryNamesList.add(params._source.territoryHierarchy[i+params.levelToReturn].name);} } return territoryNamesList;",
"lang": "painless",
"params": {
"territoryIds": [
12345678
],
"levelToReturn": 1
}
}
}
}
},
{
"qtr" : {
"terms" : {
"field" : "quarter.keyword",
"missing_bucket" : false,
"order" : "asc"
}
}
}
]
},
"aggregations": {
"revnRevenueAmount": {
"sum": {
"script": {
"source": "doc['revenueTypeCategory.keyword'].value != 'Other' ? doc['revnRevenueAmount']:doc['revnRevenueAmount']",
"lang": "painless"
},
"value_type": "long"
}
}
}
}
}
}
So this query does a composite aggregation based on two different terms aggregations, directs and qtr, and it works fine.
Now I am trying to create a corresponding spring data java client implementation for it. So I have created the code as below -
BoolQueryBuilder baseQueryBuilder = getQueryBuilder(searchCriteria);
List<TermsAggregationBuilder> aggregationBuilders = getMultiBaseAggregationBuilders(searchCriteria, baseQueryBuilder);
Here getQueryBuilder supplies the bool query part, and getMultiBaseAggregationBuilders returns the two terms aggregations shown in the query above, directs and qtr. I cannot find any API for passing this list of terms aggregations to the composite aggregation builder. I would be grateful for a pointer on how this list can be used inside the composite aggregation builder, so that the Java code produces the same Elasticsearch query as above. Thanks in advance.
I tried to write a filter query using the Elasticsearch Java API, version 7.6,
but I could not find good documentation on how to write a search in filter context.
Does anyone know how to write the following with the Java API?
GET /_search
{
"query": {
"bool": {
"must": [
{ "match": { "title": "Search" }},
{ "match": { "content": "Elasticsearch" }}
],
"filter": [
{ "term": { "status": "published" }},
{ "range": { "publish_date": { "gte": "2015-01-01" }}}
]
}
}
}
Try the following
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
List<QueryBuilder> mustClauses = boolQueryBuilder.must();
mustClauses.add(QueryBuilders.matchQuery("title", "Search"));
mustClauses.add(QueryBuilders.matchQuery("content", "Elasticsearch"));
List<QueryBuilder> filterClauses = boolQueryBuilder.filter();
filterClauses.add(QueryBuilders.termQuery("status", "published"));
filterClauses.add(QueryBuilders.rangeQuery("publish_date").gte("2015-01-01"));
SearchRequest searchRequest = new SearchRequest();
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(boolQueryBuilder);
searchRequest.source(searchSourceBuilder);
System.out.println(searchRequest.toString());
The resulting query is
{
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "Search",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1.0
}
}
},
{
"match": {
"content": {
"query": "Elasticsearch",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1.0
}
}
}
],
"filter": [
{
"term": {
"status": {
"value": "published",
"boost": 1.0
}
}
},
{
"range": {
"publish_date": {
"from": "2015-01-01",
"to": null,
"include_lower": true,
"include_upper": true,
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
}
}
edit: I'm using elastic search 7.3.0
I'm trying to do a query with an aggregation and sub aggregation, but the sub aggregation is absent from the SearchResponse.
As part of debugging, I ran my query in a unit test, copied the query, and ran it manually with postman. There, the response is exactly what I expect, but for some reason, in my java code, parts are missing.
SearchRequest request = new SearchRequest("index");
SearchSourceBuilder search = new SearchSourceBuilder();
SortBuilder sortByDate = SortBuilders
.fieldSort("date")
.order(SortOrder.DESC);
// Getting the latest result for each bucket
TopHitsAggregationBuilder latestResults = AggregationBuilders
.topHits("latest")
.sort(sortByDate)
.fetchSource("*","")
.size(1);
// Aggregate per service
TermsAggregationBuilder perService = AggregationBuilders
.terms("services")
.field("service.service_id")
.subAggregation(latestResults);
search.aggregation(perService);
search.size(1);
request.source(search);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
Here is the request generated:
{
"size": 0,
"aggregations": {
"services": {
"terms": {
"field": "service.service_id",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"latest": {
"top_hits": {
"from": 0,
"size": 1,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"_source": {
"includes": [
"*"
],
"excludes": [
""
]
},
"sort": [
{
"date": {
"order": "desc"
}
}
]
}
}
}
}
}
}
In my code, response is
{
...
"aggregations": {
"sterms#services": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
}
}
}
If I run the same query manually I get
{
...
"aggregations": {
"services": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "09045f59-3709-4769-8c92-d611f773a401",
"doc_count": 2,
"latest": {
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": null,
"hits": [ ... ]
I do a query on Elasticsearch from Kibana 4.4.1 which looks like this :
{
"size": 0,
"query": {
"filtered": {
"query": {
"query_string": {
"query": "FALK0911622560T",
"analyze_wildcard": true
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"#timestamp": {
"gte": 1438290000000,
"lte": 1440968400000,
"format": "epoch_millis"
}
}
}
],
"must_not": []
}
}
}
},
"aggs": {
"2": {
"date_histogram": {
"field": "#timestamp",
"interval": "1w",
"time_zone": "Europe/Helsinki",
"min_doc_count": 1,
"extended_bounds": {
"min": 1438290000000,
"max": 1440968400000
}
},
"aggs": {
"1": {
"percentiles": {
"field": "Quantity",
"percents": [
50
]
}
}
}
}
}
}
This query returns all the docs with "ProductCode" = "FALK0911622560T" within the given interval.
I tried the same thing with Elasticsearch Java API with the following code :
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery().must(QueryBuilders.matchQuery(matchQueryKey,matchQueryValue));
SearchResponse response = client.prepareSearch(indexName)
.setTypes(indexTypeName)
.setQuery(boolQueryBuilder)
.setSize(100)
.addAggregation(AggregationBuilders
.dateHistogram("myHistogram")
.field("#timestamp")
.interval(DateHistogramInterval.WEEK)
.timeZone("Europe/Helsinki")
.minDocCount(1)
.extendedBounds(1438290000000L, 1440968400000L))
.addFields(fieldsOfInterest)
.execute()
.actionGet();
response.getAggregations();
But I get all the documents in the index with "ProductCode" = "FALK0911622560T".
Within the given time range, I should get only 5 documents from response.getAggregations(), because I set the interval to one week.
A doc in Elasticsearch looks like this :
{
"_index": "warehouse-550",
"_type": "core2",
"_id": "AVOKCqQ68h4KkDGZvk6b",
"_score": null,
"_source": {
"message": "5,550,67.01,FALK0911622560T,2015-07-31;08:00:00.000\r",
"#version": "1",
"#timestamp": "2015-07-31T06:00:00.000Z",
"path": "D:/Programs/Logstash/x_testingLocally/processed-stocklevels-550-25200931072015.csv",
"host": "EVO385",
"type": "core2",
"Quantity": 5,
"Warehouse": "550",
"Price": 67.01,
"ProductCode": "FALK0911622560T",
"Timestamp": "2015-07-31;08:00:00.000"
},
"fields": {
"#timestamp": [
1438322400000
]
},
"highlight": {
"ProductCode": [
"#kibana-highlighted-field#FALK0911622560T#/kibana-highlighted-field#"
],
"message": [
"5,550,67.01,#kibana-highlighted-field#FALK0911622560T#/kibana-highlighted-field#,2015-07-31;08:00:00.000\r"
]
},
"sort": [
1438322400000
]
}
Please help.
Thank you.
You did not add the rangeQuery. Change your boolQueryBuilder to the following:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.matchQuery(matchQueryKey, matchQueryValue))
.must(QueryBuilders.rangeQuery("#timestamp").gte(fromValue).lte(toValue));
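You can get the buckets using:
// Histogram is org.elasticsearch.search.aggregations.bucket.histogram.Histogram
Histogram histogram = searchResponse.getAggregations().get(aggregation_name);
List<? extends Histogram.Bucket> bucketList = (histogram != null) ? histogram.getBuckets() : null;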
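The fromValue and toValue placeholders in the range query need the same epoch-millisecond bounds that the Kibana request uses (1438290000000 and 1440968400000). As a minimal sketch, assuming the bounds are meant as local midnight in Europe/Helsinki, they could be derived with java.time instead of being hard-coded. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneId;

final class RangeBounds {

    // Epoch milliseconds at the start of the given local date in the given zone.
    static long startOfDayMillis(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId helsinki = ZoneId.of("Europe/Helsinki");
        long fromValue = startOfDayMillis(LocalDate.of(2015, 7, 31), helsinki);
        long toValue = startOfDayMillis(LocalDate.of(2015, 8, 31), helsinki);
        System.out.println(fromValue); // prints: 1438290000000
        System.out.println(toValue);   // prints: 1440968400000
    }
}
```

These two values match the gte and lte bounds of the range filter in the Kibana query, so passing them to rangeQuery("#timestamp") restricts the Java search to the same window.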