Find unique field values in ElasticSearch using Spring Data ElasticsearchRepository

I have an interface extending ElasticsearchRepository and have successfully created methods to search such as:
Page<AuditResult> findByCustomerCodeAndHost(String customerCode, String host, Pageable pageable);
Now, I want an endpoint to hit that would return me all of the possible host values for that customerCode so that I can build a dropdown list in my front end to select a value to send to that findByCustomerCodeAndHost endpoint, something like:
List<String> findUniqueHostByCustomerCode(String customerCode)
Is this even possible using an ElasticsearchRepository?
I know there is the Distinct keyword I can use like
List<String> findDistinctByCustomerCode(String customerCode); but this doesn't let me specify the host field.
Edit:
Here is how I accomplished what I wanted but as it is not currently possible to actually do this with ElasticsearchRepository it isn't an actual "answer".
I created a Spring web @RestController class that exposes a @GetMapping REST endpoint, which executes an aggregation query.
The query in kibana console:
GET auditresult/_search
{
  "size": "0",
  "aggs" : {
    "uniq_custCode" : {
      "terms" : { "field" : "customerCode", "include": "<CUSTOMER_CODE>" },
      "aggs" : {
        "uniq_host" : {
          "terms" : { "field" : "host"}
        }
      }
    }
  }
}
Based on the question "ElasticSearch aggregation with Java", I came up with:
@GetMapping("/hosts/{customerCode}")
String getHostsByCustomer(@PathVariable String customerCode) {
    SearchRequest searchRequest = new SearchRequest("auditresult");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().size(0);
    // The include value is treated as a regular expression; it limits the outer buckets to this customer code.
    IncludeExclude ie = new IncludeExclude(customerCode, "");
    TermsAggregationBuilder aggregation =
        AggregationBuilders
            .terms("uniq_custCode").includeExclude(ie)
            .field("customerCode")
            .subAggregation(
                AggregationBuilders
                    .terms("uniq_host")
                    .field("host")
            );
    searchSourceBuilder.aggregation(aggregation);
    searchRequest.source(searchSourceBuilder);
    try {
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        Terms cust = searchResponse.getAggregations().get("uniq_custCode");
        StringBuilder sb = new StringBuilder("{\"hosts\":[");
        String separator = "";
        for (Terms.Bucket bucket : cust.getBuckets()) {
            // The name must match the sub-aggregation defined above ("uniq_host").
            Terms hosts = bucket.getAggregations().get("uniq_host");
            for (Terms.Bucket host : hosts.getBuckets()) {
                // Prepend the separator so that no trailing comma has to be trimmed, even with zero hosts.
                sb.append(separator).append('"').append(host.getKeyAsString()).append('"');
                separator = ",";
            }
        }
        return sb.append("]}").toString();
    } catch (IOException e) {
        e.printStackTrace();
        return "{\"hosts\":[]}";
    }
}
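The getHostsByCustomer handler above assembles its JSON response by string concatenation, which is easy to get wrong when a customer has no hosts or when a host name contains a quote. A minimal sketch of a safer helper follows, using only the JDK; the class name HostsJson and its methods are hypothetical and not part of the original code. It assumes the bucket keys have first been collected into a List<String>.

```java
import java.util.List;
import java.util.stream.Collectors;

public class HostsJson {

    // Escape the characters that must not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Build {"hosts":[...]} from the collected keys; an empty list yields {"hosts":[]}.
    static String toJson(List<String> hosts) {
        return hosts.stream()
                .map(h -> "\"" + escape(h) + "\"")
                .collect(Collectors.joining(",", "{\"hosts\":[", "]}"));
    }

    public static void main(String[] args) {
        System.out.println(toJson(List.of("web-01", "db-02")));
    }
}
```

In a Spring controller it is usually simpler still to return a List<String> (or a small response object) and let the configured JSON converter serialize it, which avoids manual escaping altogether.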

What you need here is something Spring Data calls projections; for Spring Data MongoDB, you can read the documentation to see how it works there.
Unfortunately, this is not implemented in Spring Data Elasticsearch (yet). I created an issue in Jira for this.

Related

Create TermsQuery with List<String> using elasticsearch java api client

I have an elastic search query as below.
{
"query":{
"bool":{
"filter":{
"bool":{
"must_not":{
"terms":{
"names":[
"john",
"jose"
]
}
}
}
}
}
}
}
I am trying to build something like this in the code corresponding to the query.
BoolQuery.Builder builder = new BoolQueryBuilder();
List<String> names = ["john","jose"];
TermsQueryField field = new TermsQueryBuilder().value(names).build();
builder.mustNot(TermsQuery.of(t -> t.field("names").terms(field))._toQuery());
But I am getting an error on this line, because the value function expects a List of FieldValue and not a List of String.
TermsQueryField field = new TermsQueryBuilder().value(names).build();
Can someone help on this?

You need to convert your names to FieldValue objects first, using the code below:
List<FieldValue> fieldValues = names.stream().map(FieldValue::of).toList();
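The mapping step can be tried in isolation without the Elasticsearch client on the classpath. In the sketch below, Value is a hypothetical stand-in for FieldValue (only its static of(String) factory is mimicked), and the class name FieldValueMapping is likewise made up for illustration. It also uses Collectors.toList(), since Stream.toList() is only available from Java 16.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FieldValueMapping {

    // Hypothetical stand-in for the client's FieldValue type,
    // used only so this sketch compiles without the Elasticsearch client.
    static final class Value {
        final String text;
        private Value(String text) { this.text = text; }
        static Value of(String text) { return new Value(text); }
    }

    // Wrap each name in a Value via a method reference.
    // Collectors.toList() also works on Java versions before 16.
    static List<Value> toValues(List<String> names) {
        return names.stream().map(Value::of).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Value> values = toValues(List.of("john", "jose"));
        System.out.println(values.size() + " values, first = " + values.get(0).text);
    }
}
```

With the real client, the resulting list of FieldValue objects is what the value function from the question expects in place of the List of String.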

Query to search value inside array of objects

I want to apply criteria inside object of array if it matches, but I am not able to find any documentation or example where I can find that using spring-data-cosmosdb library. I am using 2.3.0 version of library.
Example of Json
{
"id" : 1,
"address" : [
{
"street" : "abc"
...
},
{
"street" : "efg"
...
}
]
}
I want to find all documents in which address contains an entry whose street equals "abc". Below is the Spring Boot code that I am using to search in Cosmos DB, but it is not returning the expected results.
List<Criteria> criteriaList = new ArrayList<>();
criteriaList.add(Criteria.getInstance(CriteriaType.IN, "addresses.street", Collections.singletonList("abc")));
List<User> users = cosmosTemplate.find(new DocumentQuery(criteriaList.get(0), CriteriaType.AND)), User.class, COLLECTION_NAME);
I also tried address[0].street, but that throws an "operation not supported" exception.

Strongly recommend upgrading to spring-data-cosmosdb v3 (at least version 3.22.0). The v2 connector has been legacy for some time. Using the latest connector, the below would accomplish your goal.
Criteria filterCriteria = Criteria.getInstance(CriteriaType.ARRAY_CONTAINS, "address",
Collections.singletonList(new ObjectMapper().readTree("{\"street\":\"abc\"}")),
Part.IgnoreCaseType.NEVER);
CosmosQuery cosmosQuery = new CosmosQuery(filterCriteria);
Iterable<User> results = cosmosTemplate.find(cosmosQuery, User.class, COLLECTION_NAME);
for (User user : results)
System.out.println("doc id: " + user.getId());

Top Hits Aggregation support in Jest SearchResult

I am using below function to make aggregation query:
private TermsBuilder getAggregations(String[] outputFields) {
TermsBuilder topLevelAggr = AggregationBuilders.terms("level1").field("field1").size(0);
TermsBuilder aggr2 = AggregationBuilders.terms("level2").field("field2").size(0);
TermsBuilder aggr3 = AggregationBuilders.terms("level3").field("field3").size(0);
TermsBuilder aggr4 = AggregationBuilders.terms("level4").field("field4").size(0);
TopHitsBuilder topHitsBuilder = AggregationBuilders.topHits("doc").setSize(1).addSort("fieldValue", SortOrder.DESC);
aggr4.subAggregation(topHitsBuilder);
aggr3.subAggregation(aggr4);
aggr2.subAggregation(aggr3);
topLevelAggr.subAggregation(aggr2);
topHitsBuilder.setFetchSource(outputFields, new String[]{});
return topLevelAggr;
}
I am getting the correct aggregation query from this code, but after executing the query I am not able to extract the top_hits aggregation result. I am using the io.searchbox.core.SearchResult class to get the query result. In this class I couldn't find any support for the top_hits aggregation.
Please help. Thanks.

You can use the top hits aggregation java API.
Here's an example from elasticsearch docs:
// sr is here your SearchResponse object
Terms agg = sr.getAggregations().get("agg");
// For each entry
for (Terms.Bucket entry : agg.getBuckets()) {
String key = entry.getKey(); // bucket key
long docCount = entry.getDocCount(); // Doc count
logger.info("key [{}], doc_count [{}]", key, docCount);
// We ask for top_hits for each bucket
TopHits topHits = entry.getAggregations().get("top");
for (SearchHit hit : topHits.getHits().getHits()) {
logger.info(" -> id [{}], _source [{}]", hit.getId(), hit.getSourceAsString());
}
}

How to construct QueryBuilder from JSON DSL when using Java API in ElasticSearch?

I'm using ElasticSearch as a search service in a Spring Web project that uses the Transport Client to communicate with ES.
I'm wondering whether there is a method that can construct a QueryBuilder from a JSON DSL, for example one that converts this bool query DSL JSON to a QueryBuilder:
{
  "query" : {
    "bool" : {
      "must" : { "match" : {"content" : "quick"} },
      "should": { "match": {"content" : "lazy"} }
    }
  }
}
I need this method because I receive the user's bool string input from the web front end and have to parse it into a QueryBuilder. However, QueryBuilders.boolQuery().must(matchQB).should(shouldQB).mustNot(mustNotQB) does not suit this, because we may need several must or non-must queries.
If there is a method that can construct a QueryBuilder from JSON DSL, or an alternative solution, this would be much easier.
PS: I have found two methods that can wrap a DSL String into a QueryBuilder for ES search.
One is WrapperQueryBuilder, see details here. http://javadoc.kyubu.de/elasticsearch/HEAD/org/elasticsearch/index/query/WrapperQueryBuilder.html
Another is QueryBuilders.wrapperQuery(String DSL).

You can use QueryBuilders.wrapperQuery(jsonQueryString);
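Since the question is about a variable number of must and should clauses, one way to produce jsonQueryString is to assemble the bool body as text first. The sketch below uses only the JDK; the class name BoolDsl and its methods are hypothetical, each clause is assumed to be an already valid JSON query object, and the resulting string is assumed to be passed as the query object itself, without an outer "query" key.

```java
import java.util.ArrayList;
import java.util.List;

public class BoolDsl {

    // Render one occurrence type, e.g. "must":[{...},{...}];
    // returns null when there are no clauses of that type.
    static String section(String name, List<String> clauses) {
        if (clauses.isEmpty()) {
            return null;
        }
        return "\"" + name + "\":[" + String.join(",", clauses) + "]";
    }

    // Assemble {"bool":{...}} from any number of must / should / must_not clauses.
    // Each clause is assumed to already be a valid JSON query object.
    static String build(List<String> must, List<String> should, List<String> mustNot) {
        List<String> sections = new ArrayList<>();
        for (String s : new String[] {
                section("must", must),
                section("should", should),
                section("must_not", mustNot)}) {
            if (s != null) {
                sections.add(s);
            }
        }
        return "{\"bool\":{" + String.join(",", sections) + "}}";
    }

    public static void main(String[] args) {
        String json = build(
                List.of("{\"match\":{\"content\":\"quick\"}}"),
                List.of("{\"match\":{\"content\":\"lazy\"}}"),
                List.of());
        System.out.println(json);
    }
}
```

Note that the bool builder's must and should methods can also be called repeatedly in a loop, so plain string assembly is mainly useful when the clauses arrive as JSON text anyway.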

You can use setQuery, which can receive a json format string.
/**
* Constructs a new search source builder with a raw search query.
*/
public SearchRequestBuilder setQuery(String query) {
sourceBuilder().query(query);
return this;
}
Note that only the inner part of the DSL is needed; the outer {"query": } wrapper is omitted, like this:
SearchResponse searchResponse = client.prepareSearch(indices).setQuery("{\"term\": {\"id\": 1}}").execute().actionGet();

It might be worth investigating the low-level REST client. With it you can do:
RestClient esClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
Request request = new Request("POST", "/INDEX_NAME/_doc/_search");
request.setJsonEntity(yourJsonQueryString);
Response response = esClient.performRequest(request);
String jsonResponse = EntityUtils.toString(response.getEntity());

Mongo DB Aggregate Query returns in Batches

I have the following code:
CommandResult cr = db.doEval("db." + collectionName + ".aggregate("
+ query + ")");
The command result comes back in batches, whereas I need it as a single result value.
Batch result: { "serverUsed" : "/servername" , "retval" : { "_firstBatch" : [ { "visitor_localdate" : 1367260200} , { "visitor_localdate"
Expected result:
{ "serverUsed" : "/servername" , "retval" : { "result" : [ { "visitor_localdate" : 1367260200} , { "visitor_localdate"
We are using MongoDB 2.6.4, 64-bit.
Can anyone help with this? I am guessing there is some configuration issue.

You're doing this all wrong. You don't need to jump through hoops like this just to get a dynamic collection name. Just use this syntax instead:
var collectionName = "collection";
var cursor = db[collectionName].aggregate( pipeline )
Here pipeline is just the array of pipeline stage documents, i.e.:
var pipeline = [{ "$match": { } }, { "$group": { "_id": "$field" } }];
At any rate, the .aggregate() method returns a cursor, and you can iterate the results with standard methods:
while ( cursor.hasNext() ) {
var doc = cursor.next();
// do something with doc
}
But you are actually doing this in Java and not JavaScript, so from the base driver with a connection on object db you just do this:
DBObject match = new BasicDBObject("$match", new BasicDBObject());
DBObject group = new BasicDBObject("$group", new BasicDBObject());
List<DBObject> pipeline = new ArrayList<DBObject>();
pipeline.add(match);
pipeline.add(group);
AggregationOutput output = db.getCollection("collectionName").aggregate(pipeline);
The pipeline is simply a List of DBObject instances, in which you construct the BSON documents representing the required operations.
The result here is an AggregationOutput, but cursor-like results can be obtained by supplying AggregationOptions in addition to the pipeline.

Something related to batches was added in MongoDB 2.6; more details here: http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/#example-aggregate-method-initial-batch-size
From the link:
db.orders.aggregate(
[
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } },
{ $limit: 2 }
],
{
cursor: { batchSize: 0 }
}
)
You might have a cursor batch size set in your aggregate query.

The answer from Neil Lunn is not wrong but I want to add that the result you were expecting is a result for mongodb versions earlier than v2.6.
Before v2.6, the aggregate function returned just one document containing a result field, which holds an array of documents returned by the pipeline, and an ok field, which holds the value 1, indicating success.
However, from mongodb v2.6 on, the aggregate function returns a cursor (if $out option was not used).
See examples in mongodb v2.6 documentation and compare how it worked before v2.6 (i.e. in v2.4):
