Top Hits Aggregation support in Jest SearchResult - java

I am using the function below to build an aggregation query:
private TermsBuilder getAggregations(String[] outputFields) {
    TermsBuilder topLevelAggr = AggregationBuilders.terms("level1").field("field1").size(0);
    TermsBuilder aggr2 = AggregationBuilders.terms("level2").field("field2").size(0);
    TermsBuilder aggr3 = AggregationBuilders.terms("level3").field("field3").size(0);
    TermsBuilder aggr4 = AggregationBuilders.terms("level4").field("field4").size(0);
    TopHitsBuilder topHitsBuilder = AggregationBuilders.topHits("doc").setSize(1).addSort("fieldValue", SortOrder.DESC);
    aggr4.subAggregation(topHitsBuilder);
    aggr3.subAggregation(aggr4);
    aggr2.subAggregation(aggr3);
    topLevelAggr.subAggregation(aggr2);
    topHitsBuilder.setFetchSource(outputFields, new String[]{});
    return topLevelAggr;
}
This code produces the correct aggregation query, but after executing it I cannot extract the top_hits aggregation result. I am using the
io.searchbox.core.SearchResult class to read the query result, and I couldn't find any support for the top_hits aggregation in that class.
Please help. Thanks.

You can use the top hits aggregation Java API.
Here's an example from the Elasticsearch docs:
// sr is your SearchResponse object
Terms agg = sr.getAggregations().get("agg");
// For each entry
for (Terms.Bucket entry : agg.getBuckets()) {
    String key = entry.getKey(); // bucket key
    long docCount = entry.getDocCount(); // Doc count
    logger.info("key [{}], doc_count [{}]", key, docCount);
    // We ask for top_hits for each bucket
    TopHits topHits = entry.getAggregations().get("top");
    for (SearchHit hit : topHits.getHits().getHits()) {
        logger.info(" -> id [{}], _source [{}]", hit.getId(), hit.getSourceAsString());
    }
}


Can we use cosmosContainer.queryItems() method to execute the delete query on cosmos container

I have a Java method in my code that uses the following lines to fetch data from Azure Cosmos DB:
Iterable<FeedResponse<Object>> feedResponseIterator =
cosmosContainer
.queryItems(sqlQuery, queryOptions, Object.class)
.iterableByPage(continuationToken, pageSize);
The whole method looks like this:
public List<LinkedHashMap> getDocumentsFromCollection(
        String containerName, String partitionKey, String sqlQuery) {
    List<LinkedHashMap> documents = new ArrayList<>();
    String continuationToken = null;
    do {
        CosmosQueryRequestOptions queryOptions = new CosmosQueryRequestOptions();
        CosmosContainer cosmosContainer = createContainerIfNotExists(containerName, partitionKey);
        Iterable<FeedResponse<Object>> feedResponseIterator =
                cosmosContainer
                        .queryItems(sqlQuery, queryOptions, Object.class)
                        .iterableByPage(continuationToken, pageSize);
        int pageCount = 0;
        for (FeedResponse<Object> page : feedResponseIterator) {
            long startTime = System.currentTimeMillis();
            // Access all the documents in this result page
            page.getResults().forEach(document -> documents.add((LinkedHashMap) document));
            // Along with page results, get a continuation token
            // which enables the client to "pick up where it left off"
            // in accessing query response pages.
            continuationToken = page.getContinuationToken();
            pageCount++;
            log.info(
                    "Cosmos Collection {} fetched {} page with {} number of records in {} ms time",
                    containerName,
                    pageCount,
                    page.getResults().size(),
                    (System.currentTimeMillis() - startTime));
        }
    } while (continuationToken != null);
    log.info(containerName + " Collection has been collected successfully");
    return documents;
}
My question: can we use the same call to execute a delete query such as DELETE * FROM c? If so, what would be returned in the Iterable<FeedResponse<Object>> feedResponseIterator object?
SQL queries in Cosmos DB can only be used for reads; there is no DELETE statement. Delete operations must be done with deleteItem().
Here are the Java SDK samples (sync and async) for all document operations in Cosmos DB:
Java v4 SDK Document Samples
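As a rough illustration of the "query, then delete item by item" pattern: the sketch below uses a hypothetical ItemStore interface and an in-memory implementation as stand-ins, so it runs without a Cosmos account. In real code the read side would be the queryItems(...) call from the question and the delete side would be something like cosmosContainer.deleteItem(id, new PartitionKey(pk), new CosmosItemRequestOptions()). The field name "pk" and the method deleteMatching are assumptions made for the example only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class QueryThenDelete {

    // Hypothetical stand-in for the container: one read call, one point delete.
    interface ItemStore {
        List<Map<String, Object>> query(String sql);

        void delete(String id, String partitionKeyValue);
    }

    // Reads the matching documents, then deletes each one by id and partition key.
    // Returns the number of delete calls issued.
    static int deleteMatching(ItemStore store, String sql, String partitionKeyField) {
        int deleted = 0;
        for (Map<String, Object> doc : store.query(sql)) {
            String id = String.valueOf(doc.get("id"));
            String pk = String.valueOf(doc.get(partitionKeyField));
            store.delete(id, pk);
            deleted++;
        }
        return deleted;
    }

    // Tiny in-memory store so the sketch runs without any external service.
    static class InMemoryStore implements ItemStore {
        final Map<String, Map<String, Object>> items = new LinkedHashMap<>();

        void put(String id, String pk) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", id);
            doc.put("pk", pk);
            items.put(id, doc);
        }

        @Override
        public List<Map<String, Object>> query(String sql) {
            // The SQL text is ignored here; a snapshot of every item is returned.
            return new ArrayList<>(items.values());
        }

        @Override
        public void delete(String id, String partitionKeyValue) {
            items.remove(id);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.put("1", "a");
        store.put("2", "b");
        System.out.println(deleteMatching(store, "SELECT c.id, c.pk FROM c", "pk"));
        System.out.println(store.items.size());
    }
}
```

Selecting only the id and the partition key in the query keeps the read side cheap, since those two values are all a point delete needs.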

How to print MongoDB aggregation queries in Java using spring-data-mongodb

I need to print the MongoDB aggregation query used in Java code (to debug an empty response). I don't see a way to print the GroupOperation object (org.springframework.data.mongodb.core.aggregation.GroupOperation).
I see this method:
public Document toDocument(AggregationOperationContext context), but I don't understand how to pass a value for context.
I need a way to print this object.
Here are my code snippets:
Query in Java code:
GroupOperation groupOperation = new GroupOperation(
Fields.fields("dbHost"))
.first("dbUser").as("dbUser")
.first("dbPassword").as("dbPassword");
System.out.println("groupOperation = " + groupOperation);
List<AggregationOperation> aggregationOperations = Lists.newArrayList(groupOperation);
Aggregation agg = Aggregation.newAggregation(aggregationOperations)
.withOptions(Aggregation.newAggregationOptions().allowDiskUse(true).build());
AggregationResults<Document> aggInfo = mongoTemplate.aggregate(agg, "schedulerInfo", Document.class);
List<Document> docs = aggInfo.getMappedResults();
System.out.println("docs = " + docs);
Output:
groupOperation = org.springframework.data.mongodb.core.aggregation.GroupOperation@1188e820
docs = []
Corresponding query in Mongo shell:
db.schedulerInfo.aggregate([{"$group": {"_id": "$dbHost", "dbUser": { "$first": "$dbUser" },"dbPassword": { "$first": "$dbPassword" }}}])

Find unique field values in ElasticSearch using Spring Data ElasticsearchRepository

I have an interface extending ElasticsearchRepository and have successfully created methods to search such as:
Page<AuditResult> findByCustomerCodeAndHost(String customerCode, String host, Pageable pageable);
Now, I want an endpoint to hit that would return me all of the possible host values for that customerCode so that I can build a dropdown list in my front end to select a value to send to that findByCustomerCodeAndHost endpoint, something like:
List<String> findUniqueHostByCustomerCode(String customerCode)
Is this even possible using an ElasticsearchRepository?
I know there is the Distinct keyword I can use like
List<String> findDistinctByCustomerCode(String customerCode); but this doesn't let me specify the host field.
Edit:
Here is how I accomplished what I wanted. Since it is not currently possible to do this with ElasticsearchRepository, it isn't an actual "answer".
I created a Spring web @RestController class that exposes a @GetMapping REST endpoint, which executes an aggregation query.
The query in the Kibana console:
GET auditresult/_search
{
  "size": "0",
  "aggs" : {
    "uniq_custCode" : {
      "terms" : { "field" : "customerCode", "include": "<CUSTOMER_CODE>" },
      "aggs" : {
        "uniq_host" : {
          "terms" : { "field" : "host"}
        }
      }
    }
  }
}
Based on the question "ElasticSearch aggregation with Java", I came up with:
@GetMapping("/hosts/{customerCode}")
String getHostsByCustomer(@PathVariable String customerCode) {
    SearchRequest searchRequest = new SearchRequest("auditresult");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().size(0);
    IncludeExclude ie = new IncludeExclude(customerCode, "");
    TermsAggregationBuilder aggregation =
            AggregationBuilders
                    .terms("uniq_custCode").includeExclude(ie)
                    .field("customerCode")
                    .subAggregation(
                            AggregationBuilders
                                    .terms("uniq_host")
                                    .field("host")
                    );
    searchSourceBuilder.aggregation(aggregation);
    searchRequest.source(searchSourceBuilder);
    try {
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        Terms cust = searchResponse.getAggregations().get("uniq_custCode");
        StringBuilder sb = new StringBuilder();
        sb.append("{\"hosts\":[");
        for (Terms.Bucket bucket : cust.getBuckets()) {
            // The name must match the sub-aggregation defined above ("uniq_host")
            Terms hosts = bucket.getAggregations().get("uniq_host");
            for (Terms.Bucket host : hosts.getBuckets()) {
                System.out.println(host.getKey());
                sb.append("\"" + host.getKey() + "\",");
            }
        }
        String out = sb.toString();
        out = out.substring(0, out.length() - 1);
        return out + "]}";
    } catch (IOException e) {
        e.printStackTrace();
        return "{\"hosts\":[]}";
    }
}
What you need here is what Spring Data calls projections; for Spring Data MongoDB, the documentation describes how they work.
Alas, this is not implemented in Spring Data Elasticsearch (yet). I created an issue in Jira for this.
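A side note on the endpoint in the question's edit: trimming the trailing comma with substring(0, out.length() - 1) removes the opening "[" when no host buckets come back, so the response becomes {"hosts":]} instead of valid JSON. A minimal sketch of one way around that, using only the standard library, is below. HostsJson, quote and toJson are made-up names for illustration, and the escaping covers only backslashes and double quotes (a JSON library would be the safer choice in production).

```java
import java.util.List;
import java.util.stream.Collectors;

// Stand-alone illustration; none of these names come from Spring or Elasticsearch.
class HostsJson {

    // Escapes backslashes and double quotes, then wraps the value in quotes.
    // Control characters are not handled in this sketch.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Joins the hosts into {"hosts":[...]}; an empty list yields {"hosts":[]}.
    static String toJson(List<String> hosts) {
        return hosts.stream()
                .map(HostsJson::quote)
                .collect(Collectors.joining(",", "{\"hosts\":[", "]}"));
    }

    public static void main(String[] args) {
        System.out.println(toJson(List.of("web01", "web02")));
        System.out.println(toJson(List.of()));
    }
}
```

In the endpoint, the bucket keys would first be collected into a List<String> (for example with host.getKeyAsString()) and then passed to a helper like this, so the empty case needs no special handling.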

QueryBuilder and BasicDBObjectBuilder usage in MongoDB 3.3.0 above

PART 1
Following up on the solutions to "Querying Mongo Collection using QueryBuilder in Mongo 3.3.0", I tried the suggested ways of calling collection.find(), but got stuck on which parameters to pass when the query is built with a BasicDBObjectBuilder, as follows:
BasicDBObjectBuilder queryBuilder = BasicDBObjectBuilder.start();
query.getParams().entrySet().stream().forEach(entry -> queryBuilder.add(entry.getKey(), entry.getValue()));
BasicDBObjectBuilder outputQuery = BasicDBObjectBuilder.start();
outputQuery.add(nameKey, 1);
This doesn't compile :
FindIterable<TDocType> tDocTypeList = collection.find(queryBuilder.get(), outputQuery.get());
This also won't compile :
FindIterable<TDocType> tDocTypeList = collection.find((Bson)queryBuilder.get(), (Bson)outputQuery.get());
And this won't compile either :
org.bson.Document queryBuilder = new org.bson.Document();
query.getParams().entrySet().stream().forEach(entry -> queryBuilder.put(entry.getKey(), entry.getValue()));
org.bson.Document outputQuery = new org.bson.Document();
outputQuery.put(nameKey, 1);
FindIterable<TDocType> tDocTypeList = collection.find(queryBuilder, outputQuery);
Question - How do I specify a projection for the results required out of find() from collections?
PART 2
At one end I can simply replace mongo 3.0.4 java driver's code -
DBObject dbObject = collection.findOne(new QueryBuilder().put(ids).is(id).get())
to
Bson filter = Filters.eq(ids, id);
TDocType doc = collection.find(filter).first();
Now if we have an implementation where we build query through an iteration as in the sample code -
for (Map.Entry<String, Object> entry : query.getParams().entrySet()) {
    // this is where it's building the query
    if (some condition) {
        queryBuilder.put(entry.getKey()).is(entry.getValue());
    }
    if (some other condition) {
        queryBuilder.put(entry.getKey()).in(query.getValues());
    }
}
Question - Is there a way to append query filters like this with the current Mongo 3.3.0+ driver as well?
The second argument of the find method is the result type. Try the following:
FindIterable<TDocType> tDocTypeList = dbCollection.find(filter, TDocType.class);
Update for projection
FindIterable<TDocType> tDocTypeList = dbCollection.find(filter, TDocType.class).projection(outputQuery);
Update for appending filters
List<Bson> filters = new ArrayList<>();
for (Map.Entry<String, Object> entry : query.getParams().entrySet()) {
    // this is where it's building the query
    if (some condition) {
        filters.add(Filters.eq(entry.getKey(), entry.getValue()));
    }
    if (some other condition) {
        filters.add(Filters.in(entry.getKey(), query.getValues()));
    }
}
FindIterable<TDocType> docType = dbCollection.find(Filters.and(filters));

MongoDB - group by - aggregation - java

I have a doc in my mongodb that looks like this -
public class AppCheckInRequest {
private String _id;
private String uuid;
private Date checkInDate;
private Double lat;
private Double lon;
private Double altitude;
}
The database will contain multiple documents with the same uuid but different checkInDates
Problem
I would like to run a Mongo query from Java that gives me one AppCheckInRequest doc (all fields) per uuid, namely the one whose checkInDate is closest to the current time.
I believe I have to use the aggregation framework, but I can't figure out how to get the results I need. Thanks.
In the mongo shell :-
This will give you the whole groupings:
db.items.aggregate({$group : {_id : "$uuid" , value : { $push : "$somevalue"}}} )
Using $first instead of $push will keep only one value from each group (which is what you want, I think?):
db.items.aggregate({$group : {_id : "$uuid" , value : { $first : "$somevalue"}}} )
Can you translate this to the Java API? Or I'll try to add that too.
... ok, here's some Java:
Assuming the docs in my collection are {_id : "someid", name: "somename", value: "some value"}
then this code shows them grouped by name:
Mongo client = new Mongo("127.0.0.1");
DBCollection col = client.getDB("ajs").getCollection("items");
AggregationOutput agout = col.aggregate(
        new BasicDBObject("$group",
                new BasicDBObject("_id", "$name").append("value", new BasicDBObject("$push", "$value"))));
Iterator<DBObject> results = agout.results().iterator();
while (results.hasNext()) {
    DBObject obj = results.next();
    System.out.println(obj.get("_id") + " " + obj.get("value"));
}
If you change $push to $first, you'll get only one value per group. You can then add the rest of the fields once you get this query working.
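One caveat: $first simply takes the first document that reaches the $group stage, so to get the most recent check-in per uuid the pipeline would also need a $sort on checkInDate (descending) before the $group. To make the expected result concrete, here is a small in-memory sketch of what such a "latest per uuid" grouping should return. It uses plain Java collections rather than the MongoDB driver, and the CheckIn class is a trimmed-down, hypothetical stand-in for AppCheckInRequest. It also assumes check-in dates are never in the future, so "closest to now" means "latest".

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class LatestCheckIn {

    // Hypothetical stand-in for AppCheckInRequest with only the fields needed here.
    static final class CheckIn {
        final String uuid;
        final Instant checkInDate;

        CheckIn(String uuid, Instant checkInDate) {
            this.uuid = uuid;
            this.checkInDate = checkInDate;
        }

        String getUuid() {
            return uuid;
        }

        Instant getCheckInDate() {
            return checkInDate;
        }
    }

    // Groups by uuid and keeps, for each group, the entry with the latest checkInDate.
    static Map<String, CheckIn> latestPerUuid(List<CheckIn> all) {
        Comparator<CheckIn> byDate = Comparator.comparing(CheckIn::getCheckInDate);
        return all.stream().collect(
                Collectors.toMap(CheckIn::getUuid, Function.identity(), BinaryOperator.maxBy(byDate)));
    }

    public static void main(String[] args) {
        List<CheckIn> data = List.of(
                new CheckIn("a", Instant.parse("2020-01-01T00:00:00Z")),
                new CheckIn("a", Instant.parse("2020-03-01T00:00:00Z")),
                new CheckIn("b", Instant.parse("2020-02-01T00:00:00Z")));
        latestPerUuid(data).forEach(
                (uuid, checkIn) -> System.out.println(uuid + " " + checkIn.getCheckInDate()));
    }
}
```

This is handy as a reference result when checking the aggregation output against a small test collection.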
