How to add Bucket Sort to Query Aggregation

I have an Elasticsearch query that works well with curl; it is my first query.
First I filter by organization (multitenancy), then group by customer, and finally sum the sales amounts, but I only want the 3 best customers.
My question is: how do I build the aggregation with AggregationBuilders to get the "bucket_sort" statement? I already get the sales grouped by customer with the Java API.
The Elasticsearch query is:
curl -X POST 'http://localhost:9200/sales/sale/_search?pretty' -H 'Content-Type: application/json' -d '
{
  "aggs": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "organization_id": "15"
              }
            }
          ]
        }
      },
      "aggs": {
        "by_customer": {
          "terms": {
            "field": "customer_id"
          },
          "aggs": {
            "sum_total": {
              "sum": {
                "field": "amount"
              }
            },
            "total_total_sort": {
              "bucket_sort": {
                "sort": [
                  {"sum_total": {"order": "desc"}}
                ],
                "size": 3
              }
            }
          }
        }
      }
    }
  }
}'
My Java code:
@Test
public void queryBestCustomers() throws UnknownHostException {
    Client client = Query.client();
    AggregationBuilder sum = AggregationBuilders.sum("sum_total").field("amount");
    AggregationBuilder groupBy = AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum);
    AggregationBuilder aggregation =
        AggregationBuilders
            .filters("filtered",
                new FiltersAggregator.KeyedFilter("must", QueryBuilders.termQuery("organization_id", "15")))
            .subAggregation(groupBy);
    SearchRequestBuilder requestBuilder = client.prepareSearch("sales")
        .setTypes("sale")
        .addAggregation(aggregation);
    SearchResponse response = requestBuilder.execute().actionGet();
}

I hope I understood your question correctly.
Try adding "order" to your groupBy aggregation:
AggregationBuilder groupBy = AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum).order(Terms.Order.aggregation("sum_total", false));
One more thing: if you want the top 3 customers, then .size(3) should be set on the groupBy aggregation as well, not on the sort, like this:
AggregationBuilder groupBy = AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum).order(Terms.Order.aggregation("sum_total", false)).size(3);

As another answer mentioned, "order" does work for your use case.
However, there are other use cases where you may want bucket_sort, for example to page through the aggregation buckets.
Because bucket_sort is a pipeline aggregation, you cannot use AggregationBuilders to instantiate it. Instead, you need to use PipelineAggregatorBuilders.
You can read more about bucket sort and pipeline aggregations in the Elasticsearch reference documentation.
The ".from(50)" in the following code is an example of how you can page through the buckets: it skips the first 50 sorted buckets, if there are that many. Omitting "from" is equivalent to ".from(0)".
BucketSortPipelineAggregationBuilder paging = PipelineAggregatorBuilders.bucketSort(
    "paging", List.of(new FieldSortBuilder("sum_total").order(SortOrder.DESC))).from(50).size(10);
AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum).subAggregation(paging);
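To make the paging semantics above concrete, here is a minimal plain-Java sketch of what terms + sum + bucket_sort compute. It does not call Elasticsearch; the class name BucketSortSketch, its methods, and the sample sales are hypothetical and only illustrate the "sort, skip from, keep size" behaviour.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Plain-Java sketch of what terms + sum + bucket_sort compute; not Elasticsearch code. */
class BucketSortSketch {

    /** Groups (customerId, amount) pairs by customer and sums the amounts. */
    static Map<String, Double> sumByCustomer(List<Map.Entry<String, Double>> sales) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Double> sale : sales) {
            totals.merge(sale.getKey(), sale.getValue(), Double::sum);
        }
        return totals;
    }

    /** Sorts buckets by total descending, skips `from` buckets, keeps at most `size`. */
    static List<String> bucketSort(Map<String, Double> totals, int from, int size) {
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .skip(from)
                .limit(size)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data: (customer id, sale amount).
        List<Map.Entry<String, Double>> sales = List.of(
                Map.entry("c1", 10.0), Map.entry("c2", 50.0),
                Map.entry("c1", 5.0), Map.entry("c3", 30.0),
                Map.entry("c4", 20.0));
        Map<String, Double> totals = sumByCustomer(sales);
        System.out.println(bucketSort(totals, 0, 3)); // first page: [c2, c3, c4]
        System.out.println(bucketSort(totals, 3, 3)); // second page: [c1]
    }
}
```

One thing to keep in mind with the real aggregation: bucket_sort only sees the buckets that the parent terms aggregation returns, so the terms "size" probably has to be at least from + size for deeper pages to contain anything.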

Related

Elasticsearch Subaggregation of tophits JAVA API not working

I want to write an Elasticsearch aggregation with the Java API for field collapsing and result grouping.
The JSON aggregation is shown below; I took it from the Elasticsearch docs.
The 'dedup_by_score' terms aggregation has a sub-aggregation called 'top_hit', which the terms aggregation uses for bucket ordering.
... some query
"aggs": {
  "dedup_by_score": {
    "terms": {
      "field": "keyword",
      "order": {
        "top_hit": "desc"
      },
      "size": 10
    },
    "aggs": {
      "top_hit": {
        "max": {
          "script": {
            "source": "_score"
          }
        }
      }
    }
  }
}
I want to convert this JSON query into Java.
This is what I've already tried in Java:
AggregationBuilder aggregation = AggregationBuilders.terms("dedup_by_score")
    .field("keyword")
    .order(BucketOrder.aggregation("top_hit", false))
    .size(10)
    .subAggregation(
        AggregationBuilders.topHits("top_hit")
            .subAggregation(
                AggregationBuilders.max("max").script(new Script("_score"))
            )
    );
But I got the following error from Elasticsearch:
{
  "type": "aggregation_initialization_exception",
  "reason": "Aggregator [top_hit] of type [top_hits] cannot accept sub-aggregations"
}
How can I fix this Java code? I'm using Elasticsearch 6.7.1.
Thanks in advance.

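Neither top_hits nor max aggregations can have sub-aggregations. In your JSON, "top_hit" is the name of the max aggregation, so add it directly under the terms aggregation with that name:
AggregationBuilder aggregation = AggregationBuilders.terms("dedup_by_score")
    .field("keyword")
    .order(BucketOrder.aggregation("top_hit", false))
    .size(10)
    .subAggregation(
        AggregationBuilders.max("top_hit").script(new Script("_score"))
    );
If you also need the matching documents, add AggregationBuilders.topHits(...) as a second sub-aggregation of the terms aggregation under a different name.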

Two Aggregate Totals in One Group

I wrote a query in MongoDB as follows:
db.getCollection('student').aggregate(
[
    {
        $match: { "student_age" : { "$ne" : 15 } }
    },
    {
        $group:
        {
            _id: "$student_name",
            count: {$sum: 1},
            sum1: {$sum: "$student_age"}
        }
    }
])
In other words, I want to fetch the count of students who aren't 15 years old and the sum of their ages. The query works fine and I get two data items.
In my application, I want to run the query with Spring Data.
I wrote the following code:
Criteria where = Criteria.where("AGE").ne(15);
Aggregation aggregation = Aggregation.newAggregation(
    Aggregation.match(where),
    Aggregation.group().sum("student_age").as("totalAge"),
    count().as("countOfStudentNot15YearsOld"));
When this code runs, the generated query is:
"aggregate" : "MyDocument", "pipeline" :
[ { "$match" : { "AGE" : { "$ne" : 15 } } },
  { "$group" : { "_id" : null, "totalAge" : { "$sum" : "$student_age" } } },
  { "$count" : "countOfStudentNot15YearsOld" }],
"cursor" : { "batchSize" : 2147483647 }
Unfortunately, the result contains only the countOfStudentNot15YearsOld item.
I want to get the same result as my native query.

If you're asking to return counts for both "15" and "not 15" in one result, then you're looking for the $cond operator, which allows "branching" based on a conditional evaluation.
In the shell you would use it like this:
db.getCollection('student').aggregate([
  { "$group": {
    "_id": null,
    "countFifteen": {
      "$sum": {
        "$cond": [{ "$eq": [ "$student_age", 15 ] }, 1, 0 ]
      }
    },
    "countNotFifteen": {
      "$sum": {
        "$cond": [{ "$ne": [ "$student_age", 15 ] }, 1, 0 ]
      }
    },
    "sumNotFifteen": {
      "$sum": {
        "$cond": [{ "$ne": [ "$student_age", 15 ] }, "$student_age", 0 ]
      }
    }
  }}
])
So you use $cond to perform a logical test, in this case whether "student_age" in the current document is 15 or not. You then return a numeric value in response: 1 for "counting", or the actual field value when that is what you want to send to the accumulator. In short, it's a "ternary" operator or if/then/else condition (which can also be written in the more expressive form with if/then/else keys) that tests a condition and decides what to return.
For the Spring Data MongoDB implementation, you use ConditionalOperators.Cond to construct the same BSON expressions:
import org.springframework.data.mongodb.core.aggregation.*;
import org.springframework.data.mongodb.core.query.Criteria;
ConditionalOperators.Cond isFifteen = ConditionalOperators.when(new Criteria("student_age").is(15))
    .then(1).otherwise(0);
ConditionalOperators.Cond notFifteen = ConditionalOperators.when(new Criteria("student_age").ne(15))
    .then(1).otherwise(0);
ConditionalOperators.Cond sumNotFifteen = ConditionalOperators.when(new Criteria("student_age").ne(15))
    .thenValueOf("student_age").otherwise(0);
GroupOperation groupStage = Aggregation.group()
    .sum(isFifteen).as("countFifteen")
    .sum(notFifteen).as("countNotFifteen")
    .sum(sumNotFifteen).as("sumNotFifteen");
Aggregation aggregation = Aggregation.newAggregation(groupStage);
So you just extend that logic, using .then() for a "constant" value such as 1 for the "counts", and .thenValueOf() where you need the value of a field from the document, which is equivalent to "$student_age" in the shell notation.
Since ConditionalOperators.Cond implements the AggregationExpression interface, it can be used with the form of .sum() that accepts an AggregationExpression rather than a string. This is an improvement on past releases of Spring Data MongoDB, which required a $project stage to materialize the evaluated expression as document properties before performing a $group.
If all you want is to replicate the original query for spring mongodb, then your mistake was using the $count aggregation stage rather than appending to the group():
Criteria where = Criteria.where("AGE").ne(15);
Aggregation aggregation = Aggregation.newAggregation(
    Aggregation.match(where),
    Aggregation.group()
        .sum("student_age").as("totalAge")
        .count().as("countOfStudentNot15YearsOld")
);
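To make the $cond accumulators above concrete, here is a minimal plain-Java sketch of what each conditional sum computes over a list of ages. It does not use MongoDB or Spring Data; the class name CondSumSketch, its methods, and the sample ages are hypothetical.

```java
import java.util.List;

/** Plain-Java sketch of summing a $cond expression per document; not MongoDB driver code. */
class CondSumSketch {

    /** Mirrors { $sum: { $cond: [ { $eq: [ "$student_age", 15 ] }, 1, 0 ] } }. */
    static int countFifteen(List<Integer> ages) {
        return ages.stream().mapToInt(age -> age == 15 ? 1 : 0).sum();
    }

    /** Mirrors { $sum: { $cond: [ { $ne: [ "$student_age", 15 ] }, 1, 0 ] } }. */
    static int countNotFifteen(List<Integer> ages) {
        return ages.stream().mapToInt(age -> age != 15 ? 1 : 0).sum();
    }

    /** Mirrors { $sum: { $cond: [ { $ne: [ "$student_age", 15 ] }, "$student_age", 0 ] } }. */
    static int sumNotFifteen(List<Integer> ages) {
        return ages.stream().mapToInt(age -> age != 15 ? age : 0).sum();
    }

    public static void main(String[] args) {
        List<Integer> ages = List.of(15, 14, 16, 15, 17); // hypothetical student ages
        System.out.println(countFifteen(ages));    // 2
        System.out.println(countNotFifteen(ages)); // 3
        System.out.println(sumNotFifteen(ages));   // 47
    }
}
```

Each accumulator visits every document and adds either a constant (1 or 0) or the field value, which is why a single $group stage can produce all three totals at once.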

java : Case insensitive search in Elasticsearch

I'm trying to find documents in the index regardless of whether the field values are lowercase or uppercase.
This is the index structure I have designed with a custom analyzer. I'm new to analyzers, so I might be wrong. This is how it looks:
POST arempris/emptagnames
{
  "settings": {
    "analyzer": {
      "lowercase_keyword": {
        "type": "custom",
        "tokenizer": "keyword",
        "filter": "lowercase"
      }
    }
  },
  "mappings": {
    "emptags": {
      "properties": {
        "employeeid": {
          "type": "integer"
        },
        "tagName": {
          "type": "text",
          "fielddata": true,
          "analyzer": "lowercase_keyword"
        }
      }
    }
  }
}
In the Java back end, I'm using BoolQueryBuilder to find tag names by employee ID. This is the code I wrote to fetch the values:
BoolQueryBuilder query = new BoolQueryBuilder();
query.must(new WildcardQueryBuilder("tagName", "*June*"));
query.must(new TermQueryBuilder("employeeid", 358));
SearchResponse response12 = esclient.prepareSearch(index).setTypes("emptagnames")
    .setQuery(query)
    .execute().actionGet();
SearchHit[] hits2 = response12.getHits().getHits();
System.out.println(hits2.length);
for (SearchHit hit : hits2) {
    Map map = hit.getSource();
    System.out.println((String) map.get("tagName"));
}
It works fine when I specify the tag as "june" in lowercase, but when I specify it as "June" with an uppercase letter in the WildcardQueryBuilder, I don't get any match.
Please let me know where I made the mistake. I would greatly appreciate your help; thanks in advance.

There are two types of queries in Elasticsearch:
Term-level queries, which search for the exact term. https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html
Full-text queries, which first analyze the query text and then search for it. https://www.elastic.co/guide/en/elasticsearch/reference/current/full-text-queries.html
The rules for full-text queries are:
First, Elasticsearch looks for a search_analyzer in the query.
If none is specified, it uses the index-time analyzer of that field for searching.
A wildcard query is a term-level query, so "*June*" is not lowercased before it is matched against the lowercased tokens in the index.
So in this case you need to change your query to this:
BoolQueryBuilder query = new BoolQueryBuilder();
query.must(new QueryStringQueryBuilder("tagName:*June*"));
query.must(new TermQueryBuilder("employeeid", 358));
SearchResponse response12 = esclient.prepareSearch(index).setTypes("emptagnames")
    .setQuery(query)
    .execute().actionGet();
SearchHit[] hits2 = response12.getHits().getHits();
System.out.println(hits2.length);
for (SearchHit hit : hits2) {
    Map map = hit.getSource();
    System.out.println((String) map.get("tagName"));
}
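The difference between the two query types can be sketched in plain Java, without Elasticsearch. The class WildcardCaseSketch and its methods are hypothetical stand-ins: analyze() imitates the keyword tokenizer plus lowercase filter from the question, and wildcardMatches() imitates a term-level wildcard match that compares the pattern to the indexed token as-is (only '*' is handled here, not '?').

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Plain-Java sketch of why an unanalyzed wildcard pattern misses lowercased tokens. */
class WildcardCaseSketch {

    /** Stand-in for keyword tokenizer + lowercase filter: one lowercased token. */
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    /** Case-sensitive wildcard match where '*' stands for any run of characters. */
    static boolean wildcardMatches(String pattern, String token) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' becomes "any characters"
            }
            regex.append(Pattern.quote(literals[i])); // literal parts match verbatim
        }
        return Pattern.matches(regex.toString(), token);
    }

    public static void main(String[] args) {
        String indexedToken = analyze("June Release"); // hypothetical tagName value
        // Term-level behaviour: the pattern is used exactly as given.
        System.out.println(wildcardMatches("*June*", indexedToken));          // false
        // Analyzed behaviour: the pattern is lowercased first.
        System.out.println(wildcardMatches(analyze("*June*"), indexedToken)); // true
    }
}
```

Under these assumptions, lowercasing the pattern in application code before building the wildcard query would have the same effect as letting an analyzed query do it.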

Mongodb Morphia aggregation

I'm having trouble creating an aggregation in Morphia; the documentation is really not clear. This is the original query:
db.collection('events').aggregate([
    {
        $match: {
            "identifier": {
                $in: [
                    userId1, userId2
                ]
            },
            $or: [
                {
                    "info.name": "messageType",
                    "info.value": "Push",
                    "timestamp": {
                        $gte: new Date("2015-04-27T19:53:13.912Z"),
                        $lte: new Date("2015-08-27T19:53:13.912Z")
                    }
                }
            ]
        }
    },
    {
        $unwind: "$info"
    },
    {
        $match: {
            $or: [
                {
                    "info.name": "messageType",
                    "info.value": "Push"
                }
            ]
        }
    }
]);
The only example in their docs uses out, and I found another example, but I couldn't make it work.
I didn't even make it past the first match. Here's what I have:
ArrayList<String> ids = new ArrayList<>();
ids.add("199941");
ids.add("199951");
Query<Event> q = ads.getQueryFactory().createQuery(ads);
q.and(q.criteria("identifier").in(ids));
AggregationPipeline pipeline = ads.createAggregation(Event.class).match(q);
Iterator<Event> iterator = pipeline.aggregate(Event.class);
Some help or guidance on how to start with the query, or on how it works, would be great.

You need to create the query for the match() pipeline stage by breaking your code down into manageable pieces that are easy to follow. Let's start with the query that matches the identifier field, which you have done well so far. We then need to combine it with the $or part of the query.
Carrying on from where you left off, create the full query as:
// start and end are java.util.Date values bounding the timestamp range
Query<Event> q = ads.getQueryFactory().createQuery(ads);
Criteria[] arrayA = {
    q.criteria("info.name").equal("messageType"),
    q.criteria("info.value").equal("Push"),
    q.criteria("timestamp").greaterThanOrEq(start), // $gte
    q.criteria("timestamp").lessThanOrEq(end)       // $lte
};
Criteria[] arrayB = {
    q.criteria("info.name").equal("messageType"),
    q.criteria("info.value").equal("Push")
};
// Note: or(arrayA) ORs the four criteria individually, whereas the shell
// query requires all four to hold within a single $or branch.
q.and(
    q.criteria("identifier").in(ids),
    q.or(arrayA)
);
Query<Event> query = ads.getQueryFactory().createQuery(ads);
query.or(arrayB);
AggregationPipeline pipeline = ads.createAggregation(Event.class)
    .match(q)
    .unwind("info")
    .match(query);
Iterator<Event> iterator = pipeline.aggregate(Event.class);
The above is untested, but it should get you closer, so adjust it where appropriate. The following SO questions may also give you some pointers:
Complex AND-OR query in Morphia
Morphia query with or operator
and of course the AggregationTest.java page on GitHub.

How to write elasticsearch query aggregation in java?

This is my code in Marvel Sense:
GET /sweet/cake/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "code": "18"
        }}
      ]
    }
  },
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "id"
      }
    }
  }
}
I want to write it in Java, but I don't know how.

You can find some examples in the official documentation for the Java client.
In your case, you need to create one bool/must query using QueryBuilders and one terms aggregation using AggregationBuilders. It goes like this:
// build the query
BoolQueryBuilder query = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery("code", "18"));
// build the terms aggregation
// (the builder type is TermsBuilder in 1.x/2.x and TermsAggregationBuilder in 5.x and later)
TermsBuilder stateAgg = AggregationBuilders.terms("group_by_state")
    .field("id");
SearchResponse resp = client.prepareSearch("sweet")
    .setTypes("cake")
    .setQuery(query)
    .setSize(0)
    .addAggregation(stateAgg)
    .execute()
    .actionGet();
