Aggregation in Elasticsearch - java

I want to apply a group-by clause on a date field in an Elasticsearch query. This is my code:
SearchRequestBuilder srb = client
.prepareSearch(ConstantsValue.indexName)
.setTypes(ConstantsValue._Type)
.addAggregation(
AggregationBuilders
.dateHistogram("aggs")
.field("DTCREATED")
.interval(Interval.MONTH)
.format("yyyy-MM-dd")
.preZone("+05:30")
.preZoneAdjustLargeInterval(true)
.minDocCount(1)
)
.setSize(Integer.MAX_VALUE)
.setQuery(query);
SearchResponse response = srb
.setSearchType(SearchType.QUERY_AND_FETCH)
.setFetchSource(ConstantsValue.fieldList, null)
.execute()
.actionGet();
But the query does not return the expected result.
The result displayed is as follows:
Value :{"DTCREATED":"2016-09-29T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFgg
Value :{"DTCREATED":"2016-09-29T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFgl
Value :{"DTCREATED":"2016-09-29T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFgq
Value :{"DTCREATED":"2016-08-31T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFgv
Value :{"DTCREATED":"2016-09-06T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFg0
Value :{"DTCREATED":"2016-09-22T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFg5
Value :{"DTCREATED":"2016-09-22T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFhA
Value :{"DTCREATED":"2016-09-12T18:30:00.000Z"}
Key :AVfdaeSC3n3Bn-RaoFhF
I am new to Elasticsearch and don't know what I am missing.
Any help is greatly appreciated!

There is no GROUP BY-like clause in ES, but you can use aggregations to group by the field you want. For example, I'm using the HTTP POST request below to group by userid and get the count for each userid.
The search URL would look like this:
http://localhost:9200/response_summary/_search
In the above, response_summary is the index I'm searching.
The body of the request can be something like this:
{
"query":{
"query_string":{
"query":"api:\"smsmessaging\" AND operatorid:\"ROBI\""
}
},
"aggs":{
"total":{
"terms":{
"field":"userid"
},
"aggs":{
"grades_count":{
"value_count":{
"script":"doc['userid'].value"
}
}
}
}
}
}
So you can name the field you want to group by within the aggs element and get the count, as in the sample above. You can modify it as you wish. You could have a look at this thread as well.
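To make the grouping concrete, here is a minimal plain-Java sketch of what a monthly date histogram with a +05:30 offset would roughly do to the DTCREATED values shown in the question. It does not call Elasticsearch at all; the class and method names (MonthBuckets, countByMonth) are made up for illustration. Note also that the output printed in the question looks like the search hits (document ID plus source), not the aggregation buckets, which are returned in a separate part of the response.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MonthBuckets {
    // Counts ISO-8601 instants per calendar month as seen in the given offset.
    // This only imitates the bucketing idea; it is not Elasticsearch code.
    static Map<YearMonth, Long> countByMonth(List<String> timestamps, ZoneOffset offset) {
        Map<YearMonth, Long> buckets = new TreeMap<>();
        for (String ts : timestamps) {
            YearMonth month = YearMonth.from(Instant.parse(ts).atOffset(offset));
            buckets.merge(month, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // The eight DTCREATED values printed in the question.
        List<String> created = List.of(
                "2016-09-29T18:30:00.000Z", "2016-09-29T18:30:00.000Z",
                "2016-09-29T18:30:00.000Z", "2016-08-31T18:30:00.000Z",
                "2016-09-06T18:30:00.000Z", "2016-09-22T18:30:00.000Z",
                "2016-09-22T18:30:00.000Z", "2016-09-12T18:30:00.000Z");
        System.out.println(countByMonth(created, ZoneOffset.of("+05:30"))); // {2016-09=8}
        System.out.println(countByMonth(created, ZoneOffset.UTC));          // {2016-08=1, 2016-09=7}
    }
}
```

With the +05:30 offset, the 2016-08-31T18:30Z document lands on 1 September local time, so all eight documents end up in a single September bucket.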

Related

Retrieve Data from DynamoDB if complete sort key not known

I am new to DynamoDB. I create a partition key and a sort key when pushing data into DynamoDB, but when I want to retrieve the data I have the partition key and only part of the sort key. I know the beginning of the sort key but not the complete key.
table.query(QueryEnhancedRequest.builder().queryConditional(QueryConditional.keyEqualTo(Key.builder().partitionValue("KEY#" + id).build())).build())
Below are the table's partition and sort keys:
private static final String TEMPLATE = "%s#%s";
@DynamoDbPartitionKey
@Override
public String getPk() {
return String.format(TEMPLATE, "KEY", getId());
}
@DynamoDbSortKey
@Override
public String getSk() {
return String.format(TEMPLATE, "KEY_SORT", getName());
}
I used the query shown above, but it fails with this error:
The provided key element does not match the schema (Service: DynamoDb, Status Code: 400, Request ID: response id)
After looking into the issue, I found out that the key should be the combination of partition key and sort key. But the problem is that I don't know the complete sort key for the second request.
You need to use QueryConditional with sortBeginsWith():
For example, imagine you have the following information:
partition key pk = "key#23456"
sort key sk begins with "abcd"
You are unsure of the remainder of the sort key:
QueryConditional condition = QueryConditional.sortBeginsWith(
Key.builder()
.partitionValue("key#23456")
.sortValue("abcd")
.build()
);
QueryEnhancedRequest request = QueryEnhancedRequest.builder()
.queryConditional(condition)
.build();
table.query(request);
If you only know the beginning of the sort key, you can use a begins_with query.
aws dynamodb query \
--table-name TABLE \
--key-condition-expression "Id = :id and begins_with(#k, :key)" \
--expression-attribute-names '{"#k":"Key"}' \
--expression-attribute-values '{":id":{"S":"1"}, ":key":{"S":"KEY-BEGINNING"}}'
You need to provide the partition key and the beginning of the sort key for this query.
See Key condition expressions for query
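One detail worth spelling out for the key layout in the question: because getSk() stores the sort key as "KEY_SORT#" plus the name, the prefix handed to a begins-with condition presumably has to include the "KEY_SORT#" part as well. The sketch below is plain Java with no AWS SDK calls; SortKeyPrefix, sortKeyPrefix and beginsWith are hypothetical names, and the stored keys are invented sample data. It only imitates the prefix match in memory.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SortKeyPrefix {
    private static final String TEMPLATE = "%s#%s";

    // Builds the sort-key prefix for a partially known name,
    // mirroring the format used by getSk() in the question.
    static String sortKeyPrefix(String nameStart) {
        return String.format(TEMPLATE, "KEY_SORT", nameStart);
    }

    // In-memory stand-in for a begins_with condition:
    // keeps only the sort keys that start with the given prefix.
    static List<String> beginsWith(List<String> sortKeys, String prefix) {
        return sortKeys.stream()
                .filter(sk -> sk.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample sort keys for one partition.
        List<String> stored = List.of("KEY_SORT#alice", "KEY_SORT#alina", "KEY_SORT#bob");
        System.out.println(beginsWith(stored, sortKeyPrefix("ali"))); // [KEY_SORT#alice, KEY_SORT#alina]
    }
}
```

In the real query, the string returned by sortKeyPrefix(...) would be the value passed as the sort value of the begins-with condition.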

(JAVA, Elasticsearch) How can I get fields from SearchResponse?

I am wondering how to get the fields from the SearchResponse that my query returns.
Below is my query:
{"size":99,"timeout":"10s","query":{"bool":{"filter":[{"bool":{"must":[{"range":{"LOG_GEN_TIME":{"from":"2018-11-01 12:00:01+09:00","to":"2018-11-01 23:59:59+09:00","include_lower":true,"include_upper":true,"boost":1.0}}},{"wrapper":{"query":"eyAiYm9vbCIgOiB7ICJtdXN0IiA6IFsgeyAidGVybSIgOiB7ICJBU1NFVF9JUCIgOiAiMTAuMTExLjI1Mi4xNiIgfSB9LCB7ICJ0ZXJtIiA6IHsgIkFDVElPTl9UWVBFX0NEIiA6ICIyIiB9IH0sIHsgInRlcm0iIDogeyAiRFNUX1BPUlQiIDogIjgwIiB9IH0gXSB9IH0="}}],"adjust_pure_negative":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":["LOG_GEN_TIME","LOG_NO","ASSET_NO"],"excludes":[]},"sort":[{"LOG_GEN_TIME":{"order":"desc"}},{"LOG_NO":{"order":"desc"}}]}
When I run this query as shown below:
SearchResponse searchResponse = request.get();
I get the right result:
{
"took":1071,
"timed_out":false,
"_shards":{
"total":14,
"successful":14,
"skipped":0,
"failed":0
},
"_clusters":{
"total":0,
"successful":0,
"skipped":0
},
"hits":{
"total":2,
"max_score":null,
"hits":[
{
"_index":"log_20181101",
"_type":"SEC",
"_id":"1197132746951492963",
"_score":null,
"_source":{
"ASSET_NO":1,
"LOG_NO":1197132746951492963,
"LOG_GEN_TIME":"2018-11-01 09:46:28+09:00"
},
"sort":[
1541033188000,
1197132746951492963
]
},
{
"_index":"log_20181101",
"_type":"SEC",
"_id":"1197132746951492963",
"_score":null,
"_source":{
"ASSET_NO":2,
"LOG_NO":1197337264704454700,
"LOG_GEN_TIME":"2018-11-01 23:00:06+09:00"
},
"sort":[
1541080806000,
1197337264704454700
]
}
]
}
}
To use this result, I need to map it by field and value.
I think there is a way to have the fields and values mapped into the 'fields' parameter so that they are easy to use, but I cannot find it.
I hope I can use the result like this:
SearchHit hit = ...
Map<String, SearchHitField> fields = hit.getFields();
String logNo = fields.get("LOG_NO").value();
This seems to be the common way to use it.
Or am I misunderstanding something? If there is a better way, please tell me.
Any comment would be appreciated. Thanks.
It's not clear to me which client you are using to query Elasticsearch. If you are using the Elasticsearch high-level REST client, you can loop through the hits and call hit.getSourceAsMap() on each one to get the field names and values from _source.
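As a rough sketch of working with such a map, the plain-Java helpers below read typed values out of a Map<String, Object>. The map here is filled by hand with the first hit from the question and only stands in for what hit.getSourceAsMap() would return; SourceFields, longField and stringField are hypothetical names. Numbers are read through Number because, depending on their magnitude, they may come back as Integer or Long.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SourceFields {
    // Reads a numeric field; accepts Integer, Long or any other Number.
    static long longField(Map<String, Object> source, String name) {
        Object value = source.get(name);
        if (!(value instanceof Number)) {
            throw new IllegalArgumentException("Field " + name + " is missing or not numeric: " + value);
        }
        return ((Number) value).longValue();
    }

    // Reads any field as text; returns null when the field is absent.
    static String stringField(Map<String, Object> source, String name) {
        Object value = source.get(name);
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        // Hand-built stand-in for hit.getSourceAsMap(), using the first hit from the question.
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("ASSET_NO", 1);
        source.put("LOG_NO", 1197132746951492963L);
        source.put("LOG_GEN_TIME", "2018-11-01 09:46:28+09:00");

        System.out.println(longField(source, "LOG_NO"));
        System.out.println(longField(source, "ASSET_NO"));
        System.out.println(stringField(source, "LOG_GEN_TIME"));
    }
}
```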
For your comment:
First, create a POJO class that corresponds to _source (i.e. the index properties; the way the data is stored in Elasticsearch).
Then use hit.getSourceAsString() to get _source in JSON format.
Use Jackson's ObjectMapper to map the JSON to your POJO.
Assuming you created a POJO class named AssetLog:
// Create the mapper once and reuse it for every hit
ObjectMapper objectMapper = new ObjectMapper();
SearchHit[] searchHits = searchResponse.getHits().getHits();
for (SearchHit searchHit : searchHits) {
String hitJson = searchHit.getSourceAsString();
// readValue throws an IOException subclass if the JSON cannot be mapped to AssetLog
AssetLog source = objectMapper.readValue(hitJson, AssetLog.class);
//Store source to map/array
}
Hope this helps.

elasticsearch query on comparing 2 fields (using java)

I have an index in Elasticsearch and I want to write a query that compares two date fields.
Assume the field names are creationDate and modifiedDate. I want to get all documents in which these two dates are the same.
I know it used to be possible with FilteredQuery, which is now deprecated.
Something like the following code:
FilteredQueryBuilder query = QueryBuilders.filteredQuery(null,
FilterBuilders.scriptFilter("doc['creationDate'].value == doc['modifiedDate'].value"));
It may also be possible to write the script manually as a string, but I doubt that this is the right solution. Any ideas on how to write this query properly would be appreciated.
Filtered queries have been replaced by bool/filter queries. You can do it like this:
BoolQueryBuilder bqb = QueryBuilders.boolQuery()
.filter(QueryBuilders.scriptQuery(new Script("doc['creationDate'].value == doc['modifiedDate'].value")));
However, instead of using scripts at search time, you would be better off creating a new field at indexing time that records whether creationDate and modifiedDate are the same. Then you can simply check that flag at query time, which is much faster than running a script for every document.
If you don't want to reindex all your data, you can add that flag to all existing documents by running an update by query like this:
POST my-index/_update_by_query
{
"script": {
"source": """
def creationDate = Instant.parse(ctx._source.creationDate);
def modifiedDate = Instant.parse(ctx._source.modifiedDate);
ctx._source.modified = ChronoUnit.MICROS.between(creationDate, modifiedDate) > 0;
""",
"lang": "painless"
},
"query": {
"match_all": {}
}
}
And then your query will simply be:
BoolQueryBuilder bqb = QueryBuilders.boolQuery()
.filter(QueryBuilders.termQuery("modified", false));
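To see which documents end up with modified = false, the expression from the update script above can be tried in plain Java, since Instant and ChronoUnit behave the same way there. This is only a sketch: ModifiedFlag is a hypothetical helper, and it assumes both fields hold ISO-8601 instants such as 2016-09-29T18:30:00.000Z.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

final class ModifiedFlag {
    // Same expression as in the update script: true only when
    // modifiedDate is at least one microsecond later than creationDate.
    static boolean modified(String creationDate, String modifiedDate) {
        Instant created = Instant.parse(creationDate);
        Instant changed = Instant.parse(modifiedDate);
        return ChronoUnit.MICROS.between(created, changed) > 0;
    }

    public static void main(String[] args) {
        // Identical dates: flag is false, so the term query on modified = false matches.
        System.out.println(modified("2016-09-29T18:30:00.000Z", "2016-09-29T18:30:00.000Z")); // false
        // Modified later: flag is true.
        System.out.println(modified("2016-09-29T18:30:00.000Z", "2016-09-30T08:00:00.000Z")); // true
        // Modified earlier than created: the difference is negative, so the flag is false as well.
        System.out.println(modified("2016-09-29T18:30:00.000Z", "2016-09-28T18:30:00.000Z")); // false
    }
}
```

The last case shows that the flag is also false when modifiedDate is earlier than creationDate, so if such documents can exist, a comparison with != 0 may be closer to "the two dates are the same".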

Java Elastic Search: Highlighter not working

I'm using the Java API for ElasticSearch. I'm attempting to highlight my fields but it's not working. The correct results that match the search term are being returned, so there is content to highlight, but it simply won't do it. I set my SearchResponse and HighlightBuilder like this:
QueryBuilder matchQuery = simpleQueryStringQuery(searchTerm);
...
HighlightBuilder highlightBuilder = new HighlightBuilder()
.postTags("<highlight>")
.preTags("</highlight>")
.field("description");
SearchResponse response = client.prepareSearch("mediaitems")
.setTypes("mediaitem")
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
.setQuery(matchQuery) // Query
.setFrom(from)
.setSize(pageSize)
.setExplain(true)
.highlighter(highlightBuilder)
.get();
and in my JSON->POJO code, I check to see which fields have been highlighted, but the returned Map is empty.
Arrays.stream(hits).forEach((SearchHit hit) -> {
String source = hit.getSourceAsString();
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
try {
MediaItem mediaItem = objectMapper.readValue(source, MediaItem.class);
mediaItemList.add(mediaItem);
} catch (IOException e) {
e.printStackTrace();
}
});
Why on earth is my highlighting request being ignored?
Any help is greatly appreciated.
You have to set the highlighted field in HighlightBuilder.
For example:
HighlightBuilder.Field field = new HighlightBuilder.Field(fieldName);
highlightBuilder.field(field);
I saw that you are using a simple query string query, so you can do the following:
Write your query string as: fieldname: searched text
For example, your query string could be the following:
price: >2000 && city: Manchester
With this query string, the fields are specified in the query too.
Now the highlighter should work.
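As a side note on the tags: the pre tag is the one that goes in front of a highlighted term and the post tag is the one that follows it. In the builder from the question, the two values appear to be swapped ("<highlight>" is passed to postTags and "</highlight>" to preTags). The plain-Java sketch below only illustrates the wrapping order; it is not how Elasticsearch implements highlighting, and TagWrapSketch and wrap are made-up names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagWrapSketch {
    // Wraps every case-insensitive occurrence of term as preTag + match + postTag.
    static String wrap(String text, String term, String preTag, String postTag) {
        Matcher matcher = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE).matcher(text);
        // quoteReplacement keeps '$' and '\' in the tags or the match from being interpreted.
        return matcher.replaceAll(m -> Matcher.quoteReplacement(preTag + m.group() + postTag));
    }

    public static void main(String[] args) {
        System.out.println(wrap("A short film description", "film", "<highlight>", "</highlight>"));
        // prints: A short <highlight>film</highlight> description
    }
}
```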

Google datastore - querying on key values

I have an entity kind, SuggestedInterest.
I populate it with a key "GrpId" and a property "suggestedint".
Now I need the "suggestedint" value for a requested "GrpId".
So I wrote the query as:
String findSuggestedInterest(String grpId)
{
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Filter filter = new FilterPredicate(Entity.KEY_RESERVED_PROPERTY,FilterOperator.EQUAL,grpId);
Query q0 = new Query("SuggestedInterest").setFilter(filter);
PreparedQuery pq0 = datastore.prepare(q0);
Entity result = pq0.asSingleEntity();
return result.getProperty("suggestedint").toString();
}
When I execute this code I get
java.lang.IllegalArgumentException: __key__ filter value must be a Key
The developer docs say to use Entity.KEY_RESERVED_PROPERTY to query on keys, but I guess I misunderstood. What is the correct way to query on a key?
You should pass it a Key instead of a String:
Key grpKey = KeyFactory.createKey("SuggestedInterest", grpId);
then use it:
Filter filter =
new FilterPredicate(Entity.KEY_RESERVED_PROPERTY,FilterOperator.EQUAL,grpKey);
