How to translate an Elasticsearch multi_match search query from cURL into Java?

I have written a working query and tested it against a local Elasticsearch installation using the Sense plugin. Now I need to run the same query from my code using the Elasticsearch Java API. Here is the query I am trying to translate:
{
"size": 5,
"query": {
"multi_match": {
"query": "physics",
"type": "most_fields",
"fields": [
"document.title^10",
"document.title.shingles^2",
"document.title.ngrams",
"person.name^10",
"person.name.shingles^2",
"person.name.ngrams",
"document.topics.name^10",
"document.topics.name.shingles^2",
"document.topics.name.ngrams"
],
"operator": "and"
}
}
}
I know it should be something like this, but I am not quite sure:
Node node = nodeBuilder().client(true).node();
Client client = node.client();
SearchResponse response = client.prepareSearch("dlsnew")
.setTypes("person", "document")
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
.setQuery(QueryBuilders.multiMatchQuery("physics",
"document.title^10",
"document.title.shingles^2",
"document.title.ngrams",
"person.name^10",
"person.name.shingles^2",
"person.name.ngrams",
"document.topics.name^10",
"document.topics.name.shingles^2",
"document.topics.name.ngrams"))
.setFrom(0).setSize(5).setExplain(true)
.execute()
.actionGet();
SearchHit[] results = response.getHits().getHits();
Also, how do I handle the "operator" and "type": "most_fields" parts of the query?

You almost had it. Chain the operator and the type onto the multi-match builder:
QueryBuilders.multiMatchQuery("physics",
"document.title^10",
"document.title.shingles^2",
"document.title.ngrams",
"person.name^10",
"person.name.shingles^2",
"person.name.ngrams",
"document.topics.name^10",
"document.topics.name.shingles^2",
"document.topics.name.ngrams")
.operator(MatchQueryBuilder.Operator.AND)
.type(MultiMatchQueryBuilder.Type.MOST_FIELDS);
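In case the ^10 suffix is unclear: it is a per-field boost, and a field without a suffix gets the default boost of 1. Here is a minimal sketch in plain Java of how the notation reads. FieldBoosts and parse are made-up names for illustration; they are not part of the Elasticsearch client, and this is not how Elasticsearch parses the string internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: splits "field^boost" strings the way the notation reads.
// FieldBoosts and parse are hypothetical names, not Elasticsearch classes.
class FieldBoosts {
    static Map<String, Float> parse(String... fields) {
        Map<String, Float> boosts = new LinkedHashMap<>();
        for (String field : fields) {
            int caret = field.lastIndexOf('^');
            if (caret < 0) {
                boosts.put(field, 1.0f); // no suffix: default boost of 1
            } else {
                boosts.put(field.substring(0, caret),
                        Float.parseFloat(field.substring(caret + 1)));
            }
        }
        return boosts;
    }

    public static void main(String[] args) {
        System.out.println(parse("document.title^10",
                "document.title.shingles^2", "document.title.ngrams"));
    }
}
```

If you prefer to keep the boost separate from the field name, the multi-match builder also has a field(name, boost) method.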

Related

How to convert the elastic search json query into equivalent Java API?

I have the Elasticsearch JSON query below and want to convert it into the equivalent Java API code. How can I do this with the Elasticsearch Java API?
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"min": {
"min": {
"field": "<The date>"
}
},
"max":{
"max": {
"field": "<The date>"
}
}
}
}
I tried using MaxAggregationBuilder and MinAggregationBuilder, but then I had to make two separate API calls, one for the max and another for the min.
MaxAggregationBuilder max = AggregationBuilders.max("max").field("date");
MinAggregationBuilder min = AggregationBuilders.min("min").field("date");
How can I do this in a single API call?
Those two statements are not API calls. They only configure builders; you attach both aggregations to one search source and send it in a single API call:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// return no hits, only aggregation results ("size": 0 in the JSON)
searchSourceBuilder.size(0);
// query part
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
// aggregation part: one max and one min aggregation on the same field
searchSourceBuilder.aggregation(AggregationBuilders.max("max").field("date"));
searchSourceBuilder.aggregation(AggregationBuilders.min("min").field("date"));
// request part
SearchRequest searchRequest = new SearchRequest();
searchRequest.source(searchSourceBuilder);
// API call
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
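To illustrate why the builder calls cost nothing on the network, here is a rough stand-in written in plain Java. SearchBodySketch is a hypothetical class, not the real SearchSourceBuilder; it only shows that each call stores state in memory and that everything ends up in one request body.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for a search source builder: calls only collect
// state in memory; nothing is sent until the finished body goes to a client.
class SearchBodySketch {
    private final Map<String, String> aggs = new LinkedHashMap<>();

    // Registers a metric aggregation (e.g. "min" or "max") under a name.
    SearchBodySketch aggregation(String name, String metric, String field) {
        aggs.put(name, "{\"" + metric + "\":{\"field\":\"" + field + "\"}}");
        return this;
    }

    // Renders every registered aggregation into a single request body.
    String toJson() {
        String aggsJson = aggs.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
                .collect(Collectors.joining(","));
        return "{\"size\":0,\"query\":{\"match_all\":{}},\"aggs\":{" + aggsJson + "}}";
    }

    public static void main(String[] args) {
        String body = new SearchBodySketch()
                .aggregation("min", "min", "date")
                .aggregation("max", "max", "date")
                .toJson();
        System.out.println(body); // one body, hence one request
    }
}
```

The printed body has the same shape as the JSON in the question, with both aggregations side by side under "aggs".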

How to add Bucket Sort to Query Aggregation

I have an Elasticsearch query, my first one, that works well when run with curl.
First I filter by organization (multitenancy), then group by customer, and finally sum the sales amounts, but I only want the 3 best customers.
My question is: how do I build the "bucket_sort" part of the aggregation with AggregationBuilders? I already have the sales grouped by customer with the Java API.
The Elasticsearch query is:
curl -X POST 'http://localhost:9200/sales/sale/_search?pretty' -H 'Content-Type: application/json' -d '
{
"aggs": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"organization_id": "15"
}
}
]
}
},
"aggs": {
"by_customer": {
"terms": {
"field": "customer_id"
},
"aggs": {
"sum_total" : {
"sum": {
"field": "amount"
}
},
"total_total_sort": {
"bucket_sort": {
"sort": [
{"sum_total": {"order": "desc"}}
],
"size": 3
}
}
}
}
}
}
}
}'
My Java Code:
@Test
public void queryBestCustomers() throws UnknownHostException {
Client client = Query.client();
AggregationBuilder sum = AggregationBuilders.sum("sum_total").field("amount");
AggregationBuilder groupBy = AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum);
AggregationBuilder aggregation =
AggregationBuilders
.filters("filtered",
new FiltersAggregator.KeyedFilter("must", QueryBuilders.termQuery("organization_id", "15"))).subAggregation(groupBy);
SearchRequestBuilder requestBuilder = client.prepareSearch("sales")
.setTypes("sale")
.addAggregation(aggregation);
SearchResponse response = requestBuilder.execute().actionGet();
}
I hope I got your question right.
Try adding "order" to your groupBy agg:
AggregationBuilder groupBy = AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum).order(Terms.Order.aggregation("sum_total", false));
One more thing: if you want the top 3 customers, then .size(3) should be set on the groupBy aggregation as well, not on the sort, like this:
AggregationBuilder groupBy = AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum).order(Terms.Order.aggregation("sum_total", false)).size(3);
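To make it concrete what the ordered terms aggregation computes, here is a rough in-memory equivalent in plain Java. TopCustomers and Sale are hypothetical names used only for this sketch, and it ignores everything Elasticsearch does across shards.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory illustration of: filter by organization, terms on customer_id,
// sum(amount) per bucket, order by that sum descending, keep `size` buckets.
class TopCustomers {
    static final class Sale {
        final String organizationId;
        final String customerId;
        final double amount;

        Sale(String organizationId, String customerId, double amount) {
            this.organizationId = organizationId;
            this.customerId = customerId;
            this.amount = amount;
        }
    }

    static List<Map.Entry<String, Double>> top(List<Sale> sales, String organizationId, int size) {
        Map<String, Double> totals = sales.stream()
                .filter(s -> s.organizationId.equals(organizationId)) // the filter
                .collect(Collectors.groupingBy(s -> s.customerId,      // the terms buckets
                        Collectors.summingDouble(s -> s.amount)));      // the sum per bucket
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(size)                                            // the size on the terms agg
                .collect(Collectors.toList());
    }
}
```

Calling TopCustomers.top(sales, "15", 3) returns at most three customer/total pairs, largest total first.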
As another answer mentioned, "order" does work for your use case.
However, there are other use cases for bucket_sort, for example paging through the aggregation buckets.
Because bucket_sort is a pipeline aggregation, you cannot instantiate it through AggregationBuilders. You need to use PipelineAggregatorBuilders instead.
The Elasticsearch reference documentation covers the bucket sort pipeline aggregation in more detail.
The ".from(50)" in the following code is an example of how to page through the buckets: the returned buckets start at item 50 of the sorted list, if there are that many. Leaving out "from" is equivalent to ".from(0)".
BucketSortPipelineAggregationBuilder paging = PipelineAggregatorBuilders.bucketSort(
"paging", List.of(new FieldSortBuilder("sum_total").order(SortOrder.DESC))).from(50).size(10);
AggregationBuilders.terms("by_customer").field("customer_id").subAggregation(sum).subAggregation(paging);
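The from/size window itself is simple; a minimal sketch of it in plain Java looks like this. BucketPaging is a hypothetical helper, not an Elasticsearch class, and it assumes the buckets are already sorted.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustration of the from/size window that bucket_sort applies after sorting.
// BucketPaging is a hypothetical helper, not part of the Elasticsearch client.
class BucketPaging {
    static <T> List<T> page(List<T> sortedBuckets, int from, int size) {
        return sortedBuckets.stream()
                .skip(from)   // drop the first `from` buckets
                .limit(size)  // keep at most `size` of the remaining ones
                .collect(Collectors.toList());
    }
}
```

Keep in mind that bucket_sort only sees the buckets its parent aggregation produced. A terms aggregation returns 10 buckets by default, so for from(50).size(10) to return anything, the terms aggregation needs a size of at least 60.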

java : Case insensitive search in Elasticsearch

I'm trying to find documents in the index regardless of whether their field values are lowercase or uppercase.
This is the index structure I have designed, with a custom analyzer. I'm new to analyzers and I might have it wrong. This is how it looks:
POST arempris/emptagnames
{
"settings": {
"analyzer": {
"lowercase_keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
}
},
"mappings" : {
"emptags":{
"properties": {
"employeeid": {
"type":"integer"
},
"tagName": {
"type": "text",
"fielddata": true,
"analyzer": "lowercase_keyword"
}
}
}
}
}
In the Java back-end, I'm using a BoolQueryBuilder to find tag names for a given employee id. This is the code I wrote to fetch the values:
BoolQueryBuilder query = new BoolQueryBuilder();
query.must(new WildcardQueryBuilder("tagName", "*June*"));
query.must(new TermQueryBuilder("employeeid", 358));
SearchResponse response12 = esclient.prepareSearch(index).setTypes("emptagnames")
.setQuery(query)
.execute().actionGet();
SearchHit[] hits2 = response12.getHits().getHits();
System.out.println(hits2.length);
for (SearchHit hit : hits2) {
Map map = hit.getSource();
System.out.println((String) map.get("tagName"));
}
It works fine when I search for "june" in lowercase, but when I pass "June" with an uppercase letter to the WildcardQueryBuilder, I get no match.
Please let me know where I went wrong. I would greatly appreciate your help; thanks in advance.
There are two types of queries in Elasticsearch:
Term-level queries, which search for the exact term without analyzing it. https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html
Full text queries, which first analyze the query text and then search for the resulting terms. https://www.elastic.co/guide/en/elasticsearch/reference/current/full-text-queries.html
For full text queries, the analyzer is chosen as follows:
First, Elasticsearch looks for a search_analyzer specified for the query.
If none is specified, it uses the field's index-time analyzer for searching.
The wildcard query is a term-level query, so "*June*" is compared as-is against the lowercased terms in the index and never matches. In this case, switch to a full text query such as query_string:
BoolQueryBuilder query = new BoolQueryBuilder();
query.must(new QueryStringQueryBuilder("tagName:*June*"));
query.must(new TermQueryBuilder("employeeid", 358));
SearchResponse response12 = esclient.prepareSearch(index).setTypes("emptagnames")
.setQuery(query)
.execute().actionGet();
SearchHit[] hits2 = response12.getHits().getHits();
System.out.println(hits2.length);
for (SearchHit hit : hits2) {
Map map = hit.getSource();
System.out.println((String) map.get("tagName"));
}
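To see why the uppercase pattern misses, here is a small plain-Java sketch of the mismatch. WildcardSketch is a hypothetical class that only mimics the behaviour: indexTerm stands in for the keyword tokenizer plus lowercase filter, and wildcard is a simplified case-sensitive matcher, not Elasticsearch's implementation.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Illustration of why "*June*" misses a term that was lowercased at index time.
// WildcardSketch is hypothetical; it mimics, but is not, Elasticsearch's matching.
class WildcardSketch {
    // keyword tokenizer + lowercase filter: the whole value becomes one lowercase term
    static String indexTerm(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Case-sensitive match where '*' is any run of characters and '?' is exactly one
    static boolean wildcard(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(term).matches();
    }
}
```

With a stored term produced by indexTerm("Review in June 2018"), the pattern "*June*" does not match while "*june*" does. So another option, if you want to keep the wildcard query, is to lowercase the user input yourself before passing it to WildcardQueryBuilder.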

Elasticsearch Java API MoreLikeThis not returning documents compared to "_search" rest endpoint

Intention:
I want an Elasticsearch MoreLikeThis query in Java that does exactly what the raw more_like_this filtered query below does via the /_search REST endpoint.
GET /index/type/_search
{
"query": {
"filtered": {
"query": {
"more_like_this": {
"fields": [
"title",
"body",
"description",
"organisations",
"locations"
],
"min_term_freq": 2,
"max_query_terms": 25,
"ids": [
"http://xxx/doc/doc"
]
}
},
"filter": {
"range": {
"datePublished": {
"gte": "2016-01-01T12:30:00+01:00"
}
}
}
}
},
"fields": [
"title",
"description",
"datePublished"
]
}
And this is my Java implementation for the above:
FilteredQueryBuilder queryBuilder = new FilteredQueryBuilder(QueryBuilders.matchAllQuery(),FilterBuilders.rangeFilter("datePublished").gte(("2016-01-01T12:30:00+01:00")));
SearchSourceBuilder query = SearchSourceBuilder.searchSource().query(queryBuilder);
return client.prepareMoreLikeThis("index", "type", "http://xxx/doc/doc")
.setField("title", "description", "body", "organisations","locations")
.setMinTermFreq(2)
.maxQueryTerms(25)
.setSearchSource(query);
However, the results differ widely from what the more_like_this query via the REST endpoint returned. I am getting matches for about 4/5 of all the documents in the index, as if none of the filters were being applied.
I am targeting ES v1.4.2 and v1.6.2.
Any advice, please? Thanks.
I got the desired results with QueryBuilders.moreLikeThisQuery(), inspired by another post.
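// gte already makes the lower bound inclusive, matching the JSON query;
// calling includeLower(false) afterwards would turn it into an exclusive bound
FilterBuilder filterBuilder = FilterBuilders.rangeFilter("datePublished")
.gte("2016-01-01T12:30:00+01:00");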
MoreLikeThisQueryBuilder mltQueryBuilder = QueryBuilders.moreLikeThisQuery("title", "description", "body", "organisations","locations")
.minTermFreq(2)
.maxQueryTerms(25)
.ids("http://xxx/doc/doc");
SearchRequestBuilder searchRequestBuilder = client.prepareSearch("index");
searchRequestBuilder.setTypes("type");
searchRequestBuilder.addFields("title","description","datePublished");
searchRequestBuilder.setQuery(mltQueryBuilder).setPostFilter(filterBuilder);
searchRequestBuilder.execute().actionGet();
Notes:
QueryBuilders seems to be the way forward in terms of compatibility with ES v2.0 and beyond.
MoreLikeThisRequestBuilder is deprecated as of ES v1.6 and removed in 2.0.

How to write elasticsearch query aggregation in java?

This is my code in Marvel Sense:
GET /sweet/cake/_search
{
"query": {
"bool": {
"must": [
{"term": {
"code":"18"
}}
]
}
},
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "id"
}
}
}
}
I want to write it in Java, but I don't know how.
You can find some examples in the official documentation for the Java client.
But in your case, you need to create one bool/must query using the QueryBuilders and one terms aggregation using the AggregationBuilders. It goes like this:
// build the bool/must query
BoolQueryBuilder query = QueryBuilders.boolQuery()
.must(QueryBuilders.termQuery("code", "18"));
// build the terms aggregation
// (TermsBuilder in ES 1.x/2.x; the class is TermsAggregationBuilder from 5.0 on)
TermsBuilder stateAgg = AggregationBuilders.terms("group_by_state")
.field("id");
// size 0: return only the aggregation, no hits
SearchResponse resp = client.prepareSearch("sweet")
.setTypes("cake")
.setQuery(query)
.setSize(0)
.addAggregation(stateAgg)
.execute()
.actionGet();
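For orientation, each bucket of a terms aggregation pairs a distinct field value with the number of matching documents (its doc_count). A rough in-memory equivalent in plain Java, with the hypothetical names TermsCountSketch and Cake standing in for the index contents:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory illustration of: term filter on "code", then one bucket per
// distinct "id" holding the number of matching documents (doc_count).
class TermsCountSketch {
    static final class Cake {
        final String code;
        final String id;

        Cake(String code, String id) {
            this.code = code;
            this.id = id;
        }
    }

    static Map<String, Long> countById(List<Cake> cakes, String code) {
        return cakes.stream()
                .filter(c -> c.code.equals(code)) // the bool/must term query
                .collect(Collectors.groupingBy(c -> c.id, TreeMap::new,
                        Collectors.counting()));  // one doc_count per id
    }
}
```

In the real response you would read the same information from the buckets of the "group_by_state" aggregation instead of a map.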
