How to generate the query below using the Elasticsearch Java API - java

I want to generate a query similar to the one below using the Elasticsearch Java API. I am trying to apply a filter at the aggregation level.
{
"query":{
"filtered":{
"filter":{ "terms":{ "family_name":"Brown" } } //filter_1
}
},
"aggs":{
"young_age":{
"filter":{
"terms" : {
"gender" : "male" //filter_2
}
},
"aggs":{
"age":{
"terms":{
"field":"age"
}
}
}
}
}
}
Here is the sample code I have so far:
TermFilterBuilder family_filter_1 = FilterBuilders.termFilter("family_name","Brown");
FilteredQueryBuilder qbuilder =QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(),family_filter_1);
SearchRequestBuilder search = client.prepareSearch("test_index")
.setTypes("test_type")
.setSearchType(SearchType.COUNT)
.setQuery(qbuilder);
search.addAggregation(terms("age").field("age")
.size(0)// Size 0 returns all the "group by keys"
.order(Terms.Order.count(true))); // to sort the output
System.out.println(""+search);
This is the output I am getting. Please suggest how to add filter_2.
{
"query" : {
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"term" : {
"family_name" : "Brown"
}
}
}
},
"aggregations" : {
"age" : {
"terms" : {
"field" : "age",
"size" : 0,
"order" : {
"_count" : "asc"
}
}
}
}
}
Thanks in advance..

You can do it like this:
// same as your code
TermFilterBuilder family_filter_1 = ...;
FilteredQueryBuilder qbuilder = ...;
SearchRequestBuilder search = ...;
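// build the range filter (a FilterBuilder, because the filter aggregation in this API version takes a filter)
RangeFilterBuilder ageRange = FilterBuilders.rangeFilter("age")
.from(18).to(40).includeLower(false).includeUpper(false);
// for filter_2 from the question, FilterBuilders.termFilter("gender", "male") can be passed instead
// build the terms sub-aggregation
TermsBuilder age = AggregationBuilders.terms("age")
.field("age")
.size(0)
.order(Terms.Order.count(true));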
// build the filter top-aggregation
FilterAggregationBuilder youngAge = AggregationBuilders
.filter("young_age")
.filter(ageRange)
.subAggregation(age);
search.addAggregation(youngAge);
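
As a quick way to check the shape the builders should produce, the nesting can be sketched with plain JDK maps. This is only an illustration of the target structure and does not use the Elasticsearch client; the class and helper names (FilterAggBodySketch, obj, youngAgeAggs) are made up for this sketch, and the filter shown is the gender filter_2 from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class: mirrors the JSON nesting only, no Elasticsearch API involved.
final class FilterAggBodySketch {

    // One-entry ordered map, so the Java nesting reads like the JSON nesting.
    static Map<String, Object> obj(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // "young_age" holds the filter and, next to it, the nested terms aggregation.
    static Map<String, Object> youngAgeAggs() {
        Map<String, Object> youngAge = new LinkedHashMap<>();
        youngAge.put("filter", obj("term", obj("gender", "male")));          // filter_2
        youngAge.put("aggs", obj("age", obj("terms", obj("field", "age")))); // sub-aggregation
        return obj("young_age", youngAge);
    }

    public static void main(String[] args) {
        System.out.println(youngAgeAggs());
    }
}
```

The point of the sketch is that the filter and the sub-aggregation are siblings inside young_age, which is what .filter(...) followed by .subAggregation(...) on the filter aggregation builder expresses.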

Related

Spring data elasticsearch bulk index on Percolator

Using Spring Data Elasticsearch, I want to do the following (bulk indexing):
Step 1:
PUT surname
{
"settings" : {
"index" : {
"number_of_shards" : 1,
"number_of_replicas" : 0
}
},
"mappings": {
"properties": {
"message": {
"type": "text"
},
"query": {
"type": "percolator"
}
}
}
}
Step 2:
PUT surname/_bulk
{ "index" : { "_index" : "surname", "_type" : "_doc", "_id" : "1" } }
{ "query" : { "match_phrase" : { "message" : "Aal" } }}
{ "index" : { "_index" : "surname", "_type" : "_doc", "_id" : "2" } }
{ "query" : { "match_phrase" : { "message" : "Aalbers" } }}
Step 1 is done with the help of https://stackoverflow.com/a/67724048/4068218
For Step 2 I tried both IndexQuery and UpdateQuery:
Query query = new CriteriaQuery(new Criteria("match_phrase").subCriteria(new Criteria("message").is("Aal")));
UpdateQuery updateQuery = builder(query).build();
elasticsearchOperations.bulkUpdate(ImmutableList.of(updateQuery),IndexCoordinates.of(indexName.get()));
but neither works. With UpdateQuery I get "Validation Failed: 1: id is missing;2: script or doc is missing."
With IndexQuery I get a malformed query error.
How do I go about Step 2 in Spring Data Elasticsearch? Any lead is much appreciated.
The code below worked for me. Sharing it here as it might help someone:
IndexQuery indexQuery = new IndexQueryBuilder()
.withId("12")
.withSource(String.format("{ \"query\" : { \"match_phrase\" : { \"message\" : \"%s\" } }}", "Aal"))
.build();
elasticsearchOperations.bulkIndex(ImmutableList.of(indexQuery),IndexCoordinates.of(indexName.get()));
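
One thing to watch with the String.format approach is that a phrase containing a double quote or a backslash produces invalid JSON. A minimal sketch of a safer source builder follows, assuming the phrase arrives as arbitrary user text; PercolatorSourceSketch, escapeJson and percolatorSource are hypothetical names, not part of Spring Data Elasticsearch.

```java
// Hypothetical helper: builds the percolator document source with the phrase escaped.
final class PercolatorSourceSketch {

    // Escapes the characters that would otherwise break a JSON string literal.
    static String escapeJson(String raw) {
        StringBuilder out = new StringBuilder();
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become four-digit hex escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Produces the same document shape as the String.format call above.
    static String percolatorSource(String phrase) {
        return "{ \"query\" : { \"match_phrase\" : { \"message\" : \""
                + escapeJson(phrase) + "\" } }}";
    }
}
```

The result of percolatorSource("Aal") can be passed to withSource(...) in place of the String.format expression. If a JSON library is already on the classpath, letting it serialize the document is probably the more robust choice.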

Issue extracting the sum aggregation in an Elasticsearch multi-field group-by using the Java API

I am using Elasticsearch 5.2 and doing a group-by similar to
select city,institutionId, SUM(appOpenCount) from XYZ where ( time > 123 && appOpenCount > 0 ) group by city, institutionId.
It works when I use curl, but in the Java API version I am missing something, and I cannot get the last part, the sum aggregation.
I have a type temp_type with the mapping given below.
{
"temp_index" : {
"mappings" : {
"temp_type" : {
"properties" : {
"appOpenCount" : {
"type" : "integer"
},
"city" : {
"type" : "keyword"
},
"institutionId" : {
"type" : "keyword"
},
"time" : {
"type" : "long"
}
}
}
}
}
}
and my aggregation request (curl -XGET) looks like this:
curl -XGET "http://localhost:9200/temp_index/temp_type/_search?pretty" -d'
{
"size":0,
"_source":false,
"from" : 0,
"query": {
"bool": {
"must": [
{"range": { "time": { "gte": 1513744603000 } } },
{ "range": { "appOpenCount": { "gt": 0 } } }
]
}
},
"aggregations": {
"city-aggs": {
"terms": { "field": "city"},
"aggregations": {
"intitution-agg": {
"terms": { "field": "institutionId" },
"aggregations": {
"appOpenCount": { "sum": { "field": "appOpenCount" }}}
}
}
}
}
}'
The response is correct (the aggregated numbers add up):
{
"took" : 57,
"timed_out" : false,
"_shards" : { ... },
"hits" : {... },
"aggregations" : {
"city-aggs" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "city-1",
"doc_count" : 25,
"intitution-agg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "inst-1",
"doc_count" : 5,
"appOpenCount" : {
"value" : 15.0
}
}
]
}
}
]
}
}
}
Using this as a template, I converted it to a Java API call. I can execute it and access the city-aggs and institution-agg keys, but I am not sure how to access the appOpenCount aggregation: I get null for the Sum aggregation.
// bool query
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
List<QueryBuilder> mustQueries = boolQueryBuilder.must();
mustQueries.add(QueryBuilders.rangeQuery("time").gte(startTime));
mustQueries.add(QueryBuilders.rangeQuery("appOpenCount").gt(0));
queryBuilder = boolQueryBuilder;
// aggregationbuilder
AggregationBuilder aggregationBuilder = null;
TermsAggregationBuilder cityAggs = AggregationBuilders.terms("city-aggs").field("city");
TermsAggregationBuilder institutionAggs = AggregationBuilders.terms(
"institution-agg").field("institutionId");
SumAggregationBuilder fieldAggBuilder = AggregationBuilders.sum("appOpenCount").field("appOpenCount");
aggregationBuilder = cityAggs.subAggregation(institutionAggs).subAggregation(fieldAggBuilder);
// search call
SearchResponse searchResponse = client.prepareSearch(indexName)
.setTypes(typeName)
.setQuery(queryBuilder)
.addAggregation(aggregationBuilder)
.setFrom(0)
.setSize(0)
.execute().actionGet();
// Iterate the searchResponse
Terms cityAggsTerms = searchResponse.getAggregations().get("city-aggs");
List<Terms.Bucket> mainCityBuckets = cityAggsTerms.getBuckets();
for (Terms.Bucket mainCityBucket : mainCityBuckets) {
String cityName = mainCityBucket.getKeyAsString();
LOGGER.info("CityName : " + cityName); // all good
Terms institutionTerms = mainCityBucket.getAggregations().get("institution-agg");
List<Terms.Bucket> institutionBuckets = institutionTerms.getBuckets();
for (Terms.Bucket institutionBucket : institutionBuckets) {
String institutionName = institutionBucket.getKeyAsString();
LOGGER.info("InstitutionName : " + institutionName ); // all good
Sum appOpenCountSum = institutionBucket.getAggregations().get("appOpenCount");
if(appOpenCountSum != null) {
double appOpenCount = appOpenCountSum.getValue();
LOGGER.info("InstitutionName : " + institutionName +
" and appOpenCount is " + appOpenCount);
} else {
LOGGER.info("appOpenCountSum is null");
}
} // institution for
}// city for
How can I access the value of the appOpenCount aggregation? I am hitting the case where my appOpenCountSum variable is null, even though I can access city-aggs and institution-agg and get proper values. I am not sure how to access the appOpenCount aggregation inside a Terms.Bucket.
I followed the example provided in the Elasticsearch docs for this:
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/_metrics_aggregations.html#java-aggs-metrics-sum
I have given an in-depth breakdown in the hope that it helps others too.
EDIT: The issue was the way I was building the aggregation in Java. Because subAggregation returns the builder it was called on, the chained call cityAggs.subAggregation(institutionAggs).subAggregation(fieldAggBuilder) attached the sum to cityAggs. fieldAggBuilder should be added to institutionAggs instead. The corrected code is below.
// aggregationbuilder
AggregationBuilder aggregationBuilder = null;
TermsAggregationBuilder cityAggs = AggregationBuilders.terms("city-aggs").field("city");
TermsAggregationBuilder institutionAggs = AggregationBuilders.terms(
"institution-agg").field("institutionId");
SumAggregationBuilder fieldAggBuilder =
AggregationBuilders.sum("appOpenCount").field("appOpenCount");
institutionAggs.subAggregation(fieldAggBuilder); // this was missing previously
aggregationBuilder = cityAggs.subAggregation(institutionAggs);
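
To see why the chained call in the first attempt misplaces the sum, here is a toy model in plain Java. ToyAgg is a hypothetical stand-in, not the Elasticsearch builder; the only behavior it copies is that subAggregation returns the builder it was called on rather than the child.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an aggregation builder; NOT the Elasticsearch class.
final class ToyAgg {
    final String name;
    final List<ToyAgg> children = new ArrayList<>();

    ToyAgg(String name) { this.name = name; }

    // Returns the parent (this), not the child that was just added.
    ToyAgg subAggregation(ToyAgg child) {
        children.add(child);
        return this;
    }

    // Renders the tree as name[child, child] for easy comparison.
    String render() {
        if (children.isEmpty()) return name;
        StringBuilder sb = new StringBuilder(name).append('[');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(children.get(i).render());
        }
        return sb.append(']').toString();
    }

    // Chained form from the first attempt: both children land on city.
    static ToyAgg chained() {
        return new ToyAgg("city")
                .subAggregation(new ToyAgg("institution"))
                .subAggregation(new ToyAgg("sum"));
    }

    // Corrected form: sum is attached to institution before institution joins city.
    static ToyAgg nested() {
        ToyAgg institution = new ToyAgg("institution").subAggregation(new ToyAgg("sum"));
        return new ToyAgg("city").subAggregation(institution);
    }
}
```

ToyAgg.chained().render() gives city[institution, sum], where the sum is a sibling of institution, while ToyAgg.nested().render() gives city[institution[sum]], which matches the curl request. That sibling placement is why the institution buckets had no appOpenCount entry.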

Why is my Java translation of an Elasticsearch request not valid?

I'm currently writing an Elasticsearch request to retrieve some data. I have managed to write the right request in JSON. I then tried to translate it into Java, but when I print the request that Java sends to ES, the two requests are not the same, and I cannot get them to match.
Here is the JSON request that returns the correct data:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{ "terms": { "accountId": ["107276450147"] } },
{"range" : {
"date" : {
"lt" : "1480612801000",
"gte" : "1478020801000"
} }
}]
}
}
}
},
"size" : 0,
"aggregations" : {
"field-aggregation" : {
"terms" : {
"field" : "publicationId",
"size" : 2147483647
},
"aggregations" : {
"top-aggregation" : {
"top_hits" : {
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
}
}
}
}
}
}
}
And here is the request generated by Java, which does not return the right data:
{
"from" : 0,
"size" : 10,
"aggregations" : {
"field-aggregation" : {
"terms" : {
"field" : "publicationId",
"size" : 2147483647
},
"aggregations" : {
"top-aggregation" : {
"top_hits" : {
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
}
}
}
}
}
}
}
And finally, the Java code that generates the wrong JSON request:
TopHitsBuilder top = AggregationBuilders.topHits("top-aggregation")
.setFetchSource(true)
.setSize(1);
TermsBuilder field = AggregationBuilders.terms("field-aggregation")
.field(aggFieldName)
.size(Integer.MAX_VALUE)
.subAggregation(top);
BoolFilterBuilder filterBuilder = FilterBuilders.boolFilter()
.must(FilterBuilders.termsFilter("accountId", Collections.singletonList("107276450147")))
.must(FilterBuilders.rangeFilter("date").gte(1478020801000L).lt(1480612801000L));
NativeSearchQueryBuilder query = new NativeSearchQueryBuilder()
.withQuery(QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), filterBuilder))
.withIndices("metric")
.withTypes(type)
.addAggregation(field);
return template.query(query.build());
First, I need to remove the "size": 10 and the "from" that Java generates. Then I need to add the filters; I did add them in the code, but they never appear in the output.
Can you tell me what is wrong in my Java code and why the filters do not appear in the final JSON?
Thanks guys.
I finally solved the problem. The Java code was sending the right query, but I was looking in the wrong place in the ES Java API. In addition, I set the search type to COUNT so that ES does not send back the non-aggregated hits, which are useless to me.

MongoDB query on an ArrayList

In my Java Play Framework application, I want to store ArrayList values in MongoDB.
{
"_id" : ObjectId("5832f29bd4c6721e4e8ba4a7"),
"_class" : "com.netas.innovation.entity.Idea",
"title" : "fsaf",
"desc" : "adgg",
"keyWords" : "dgds",
"createdDate" : ISODate("2016-11-21T13:11:55.823Z"),
"checkbox1" : false,
"checkbox2" : false,
"checkbox3" : false,
"scopeOfIdea" : "Herkes",
"template" : false,
"creatorUser" : {
"$ref" : "user",
"$id" : ObjectId("5832f27dd4c6721e4e8ba4a5")
},
"owners" : [
{
"$ref" : "user",
"$id" : ObjectId("5832f27dd4c6721e4e8ba4a5")
}
],
"answer" : {
"$ref" : "answer",
"$id" : ObjectId("5832f29bd4c6721e4e8ba4a6")
},
"fileList" : []
}
I want to search by owners.
My query doesn't work:
if(owners != null && !owners.isEmpty()) {
for(int i=0; i<owners.size(); i++) {
criteriaList.add(new Criteria().elemMatch(Criteria.where("owners.$id").is(owners.get(i).getId())));
}
}
How can I fix this?
I want to be able to search by owners.
There can be two or three owners.
Thanks for your answers.
"owners" : [
{
"$ref" : "user(list)",
"$id" : ObjectId("5832ecdb0deb78cc88392c83")
}
$ref : ozgurk,volkany ...
http://prntscr.com/ddz3c9 shows an example of the output: title, desc, etc., date and owners.
I'm not a MongoDB expert, but I guess you should do something like:
List<Criteria> criteriaList = new ArrayList<Criteria>();
...
List<Object> idList = new ArrayList<Object>();
for (int i = 0; i < owners.size(); i++) {
idList.add(owners.get(i).getId());
}
// where(...) is a static factory, so call it on the class rather than on a new instance
criteriaList.add(Criteria.where("owners.$id").in(idList));
...
If you add a separate criteria for each owner (owners.get(i).getId()) to the list and finally combine all of them with an AND operation, you will not get the desired output.
I guess you use Spring Data. I tried to express the following MongoDB query:
db.getCollection('idea').find(
{
"owners.$id": {
"$in" : [
ObjectId("58451c5f13c97bdde9950641"),
ObjectId("28451c5f13c97bdde9950642")
]
}
}
)
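
The difference between the two approaches can be shown without a database. The sketch below is plain Java with made-up ids and hypothetical names (OwnerMatchSketch, matchesAll, matchesAny); it only models the matching logic, not Spring Data or MongoDB itself.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of the two matching rules, using plain string ids.
final class OwnerMatchSketch {

    // AND of one equality per owner: the document must contain every wanted id.
    static boolean matchesAll(List<String> docOwnerIds, List<String> wantedIds) {
        return docOwnerIds.containsAll(wantedIds);
    }

    // $in semantics: the document matches if it contains at least one wanted id.
    static boolean matchesAny(List<String> docOwnerIds, List<String> wantedIds) {
        return !Collections.disjoint(docOwnerIds, wantedIds);
    }
}
```

For a document owned only by "u1" and a search for "u1" or "u2", matchesAll returns false and matchesAny returns true, which is why a single $in criteria is the better fit when any of the selected owners should match.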

Elasticsearch combining queries with Boolean query

I'm trying to combine multiple queries in Elasticsearch using a boolean query, but the result is not what I'm expecting. For example,
if I have the following documents (among others):
DOC 1:
{
"name":"Iphone 5",
"product_suggestions":{
"input":[
"iphone 5",
"apple"
]
},
"description":"Iphone 5 - The almost last version",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"2",
"user_state_description":"Almost New",
"product_type_id":"1",
"current_price":350,
"finish_date":"2014/06/20 14:12",
"finish_date_ms":1403273520
}
DOC 2:
{
"name":"Apple II Lisa",
"product_suggestions":{
"input":[
"apple ii lisa",
"apple"
]
},
"description":"Make a offer and I Apple II Lisa!!",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"2",
"user_state_description":"Used",
"product_type_id":"1",
"current_price":150,
"finish_date":"2014/06/15 16:12",
"finish_date_ms":1402848720
}
DOC 3:
{
"name":"Iphone 5s",
"product_suggestions":{
"input":[
"iphone 5s",
"apple"
]
},
"description":"Iphone 5s 32Gb like new with a few scratches bla bla bla",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"1",
"user_state_description":"New",
"product_type_id":"2",
"current_price":510.1,
"finish_date":"2014/06/10 14:12",
"finish_date_ms":1402409520
}
DOC 4:
{
"name":"Iphone 4s",
"product_suggestions":{
"input":[
"iphone 4s",
"apple"
]
},
"description":"Iphone 4s 16Gb Mint conditions and unlocked to all network",
"brand":"Apple",
"brand_facet":"Apple",
"state_id":"1",
"user_state_description":"Almost New",
"product_type_id":"2",
"current_price":385,
"finish_date":"2014/06/12 16:12",
"finish_date_ms":1402589520
}
And if I run the following query (get all documents and facets with the keyword "Apple" whose finish_date_ms is greater than 1402869581):
{
"from" : 1,
"size" : 20,
"query" : {
"bool" : {
"must" : {
"query_string" : {
"query" : "apple",
"default_operator" : "and",
"analyze_wildcard" : true
}
},
"must_not" : {
"range" : {
"finish_date_ms" : {
"from" : null,
"to" : 1402869581,
"include_lower" : true,
"include_upper" : false
}
}
}
}
},
"facets" : {
"brand" : {
"terms" : {
"field" : "brand_facet",
"size" : 10
}
},
"product_type_id" : {
"terms" : {
"field" : "product_type_id",
"size" : 10
}
},
"state_id" : {
"terms" : {
"field" : "state_id",
"size" : 10
}
}
}
}
This returns:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.18392482,
"hits":[
]
},
"facets":{
"brand":{
"_type":"terms",
"missing":0,
"total":1,
"other":0,
"terms":[
{
"term":"Apple",
"count":1
}
]
},
"product_type_id":{
"_type":"terms",
"missing":0,
"total":1,
"other":0,
"terms":[
{
"term":1,
"count":1
}
]
},
"state_id":{
"_type":"terms",
"missing":0,
"total":1,
"other":0,
"terms":[
{
"term":2,
"count":1
}
]
}
}
}
It should return only DOC 1. If I remove the range query, it returns all the documents that contain the word Apple. If I remove the "term" query, then no document is returned, so I presume the problem is in the range query.
Can anyone point me in the right direction with this?
One other important thing: this whole query is to be implemented in Java (if that helps).
Thanks!
(sorry for this huge post)
I found my mistake (a newbie mistake, to be honest).
The problem was not in the range query but at the beginning of the JSON: the from field is set to 1, but the result contains only one record, so it should be 0.
Thanks for everything!!
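
The effect of from can be reproduced with a plain list. This is a minimal sketch of zero-based offset paging, assuming from means "number of hits to skip"; PagingSketch and page are hypothetical names and no Elasticsearch code is involved.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of from/size paging over an in-memory list of hits.
final class PagingSketch {

    // Skips `from` hits, then returns at most `size` of the remaining ones.
    static <T> List<T> page(List<T> hits, int from, int size) {
        if (from >= hits.size()) return Collections.emptyList();
        int end = Math.min(hits.size(), from + size);
        return hits.subList(from, end);
    }
}
```

With a single matching hit, page(hits, 1, 20) is empty while page(hits, 0, 20) returns that hit. This mirrors the response above, where "total" is 1 but the "hits" array is empty.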
