Elasticsearch - Terms Aggregation nested field - java

I have the following problem: I have a nested field ("list") with two properties (fieldB and fieldC).
This is what a document looks like:
{
"fieldA": "1",
"list": [
{"fieldB": "ABC",
"fieldC": "DEF"},
{"fieldB": "ABC",
"fieldC": "GHI"},
{"fieldB": "UVW",
"fieldC": "XYZ"},...]
},
I want to get a distinct list of all possible fieldC values for "ABC" (fieldB) over all documents. So far I've tried this in Java (Java REST Client):
SearchRequest searchRequest = new SearchRequest("abc*");
QueryBuilder matchQueryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.nestedQuery("list",
QueryBuilders.matchQuery("list.fieldB.keyword", "ABC"), ScoreMode.None));
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(matchQueryBuilder)
.aggregation(AggregationBuilders.nested("listAgg","list")
.subAggregation(AggregationBuilders.terms("fieldBAgg")
.field("list.fieldB.keyword")));
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = null;
try {
searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
} catch (IOException e) {
e.printStackTrace();
}
Nested list = searchResponse.getAggregations().get("listAgg");
Terms fieldBs = list.getAggregations().get("fieldBAgg");
With that query I get all documents that include "ABC" in fieldB, and I get all fieldC values of those documents. But I only want the fieldC values where fieldB is "ABC".
So in that example I get DEF, GHI and XYZ, but I only want DEF and GHI. Does anybody have an idea how to solve this?

The nested constraint in the query part only selects the documents that have at least one nested entry satisfying the constraint. You also need to apply that same constraint in the aggregation part; otherwise you aggregate over every nested entry of the selected documents, which is what you're seeing. Proceed like this instead:
// 1. terms aggregation on the desired nested field (fieldC; the name "fieldBAgg" is kept from the question)
TermsAggregationBuilder nestedField = AggregationBuilders.terms("fieldBAgg").field("list.fieldC.keyword");
// 2. filter aggregation on the desired nested field value
QueryBuilder onlyBQuery = QueryBuilders.termQuery("list.fieldB.keyword", "ABC");
FilterAggregationBuilder onlyBFilter = AggregationBuilders.filter("onlyFieldB", onlyBQuery).subAggregation(nestedField);
// 3. parent nested aggregation
NestedAggregationBuilder nested = AggregationBuilders.nested("listAgg", "list").subAggregation(onlyBFilter);
// 4. main query/aggregation
sourceBuilder.query(matchQueryBuilder).aggregation(nested);
// 5. in the response, the terms buckets now sit under listAgg > onlyFieldB > fieldBAgg
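To make the difference between the two constraints concrete, here is a minimal sketch in plain Java collections. It does not use the Elasticsearch API at all; the class NestedFilterSketch and its Entry type are hypothetical stand-ins for the nested "list" entries. It only illustrates why the filter has to be applied per nested entry before the distinct fieldC values are collected.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the nested -> filter -> terms aggregation chain.
final class NestedFilterSketch {

    // One entry of the nested "list" field.
    static final class Entry {
        final String fieldB;
        final String fieldC;

        Entry(String fieldB, String fieldC) {
            this.fieldB = fieldB;
            this.fieldC = fieldC;
        }
    }

    // Distinct fieldC values over all nested entries whose fieldB matches.
    static Set<String> distinctFieldC(List<List<Entry>> documents, String wantedFieldB) {
        Set<String> result = new TreeSet<>();
        for (List<Entry> nestedEntries : documents) {      // every selected document
            for (Entry entry : nestedEntries) {            // "nested" step: visit each entry
                if (entry.fieldB.equals(wantedFieldB)) {   // "filter" step: keep matching entries
                    result.add(entry.fieldC);              // "terms" step: collect distinct values
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Entry>> documents = Arrays.asList(Arrays.asList(
                new Entry("ABC", "DEF"),
                new Entry("ABC", "GHI"),
                new Entry("UVW", "XYZ")));
        System.out.println(distinctFieldC(documents, "ABC")); // prints [DEF, GHI]
    }
}
```

Without the inner if, the loop would also collect XYZ, which matches the result described in the question.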

Related

How to use elasticsearch rangequery to find values that are less than or equal to a number value in Java API

Hello, I'm new to Elasticsearch and I'm trying to build an Elasticsearch query using the Java API. I have the following:
int count = 7;
QueryBuilder findRangeNumber = QueryBuilders.rangeQuery("numberField").lte(count);
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(findRangeNumber);
This returns documents whose numberField is 12, 6, 4 and 5. I want it to return only documents where numberField is less than or equal to count (so 6, 5 and 4 in this example).
If I change count to 12, it only returns documents whose numberField equals 12. I'm confused about how this works and whether it is possible to return every document whose numberField is less than or equal to count.
I have also tried the following, with no luck:
int count = 7;
QueryBuilder findRangeNumber = QueryBuilders.rangeQuery("numberField").lte(count).gte(0);
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(findRangeNumber);
This is what the query looks like when I print out boolQueryBuilder:
"must" : [
{
"range" : {
"numberField" : {
"from" : null,
"to" : 7,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
}
],
This is the full code for what I'm trying to do:
BoolQueryBuilder boolQuery = new BoolQueryBuilder();
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
int count = 7;
if (foo != null) {
boolQuery.should(QueryBuilders.matchQuery("something", matchSomething));
boolQuery.should(QueryBuilders.boolQuery().mustNot(QueryBuilders.existsQuery("foobar status")));
}
if (foobar != null) {
BoolQueryBuilder queryBuilder = new BoolQueryBuilder();
queryBuilder.should(QueryBuilders.rangeQuery("someDateField").from(dateField));
queryBuilder.should(QueryBuilders.rangeQuery("numberField").lte(count));
boolQuery.must(queryBuilder);
}
searchSourceBuilder.query(boolQuery);
Any help would be appreciated!
You need to use RangeQueryBuilder to find values that are less than or equal to a given number.
Try the code below:
int count=7;
RestHighLevelClient client = new RestHighLevelClient(restClientBuilder);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
RangeQueryBuilder rangeQueryBuilder = new RangeQueryBuilder("numberField").lte(count);
searchSourceBuilder.query(rangeQueryBuilder);
SearchRequest searchRequest = new SearchRequest("my-index");
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
Update 1:
When you search for documents with numberField less than or equal to 7, you get documents whose numberField is 12, 6, 4 and 5.
That search result is correct for the way the field is indexed. It is possible only if you have not defined an explicit index mapping and the values were sent as JSON strings, in which case numberField was dynamically mapped as a text field instead of a numeric field, and the range query compares the values as strings.
Lexically, "12", "6", "4" and "5" are all smaller than "7", so you get all four documents in your search result. When you query with "lte": 12, you only get the document with numberField equal to 12, because none of the other three values is lexically smaller than "12".
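The string comparison can be reproduced with a short sketch in plain Java. It does not call Elasticsearch; the class LexicalVsNumericRange and its methods are hypothetical names, and String.compareTo is used here only as an approximation of how terms of a text field are ordered (it agrees for plain ASCII digits).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: "less than or equal" on strings versus on integers.
final class LexicalVsNumericRange {

    // Values that are <= bound when compared as strings.
    static List<String> lexicalLte(List<String> values, String bound) {
        List<String> matches = new ArrayList<>();
        for (String value : values) {
            if (value.compareTo(bound) <= 0) {
                matches.add(value);
            }
        }
        return matches;
    }

    // Values that are <= bound when compared as integers.
    static List<Integer> numericLte(List<Integer> values, int bound) {
        List<Integer> matches = new ArrayList<>();
        for (int value : values) {
            if (value <= bound) {
                matches.add(value);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> asText = Arrays.asList("12", "6", "4", "5");
        List<Integer> asNumbers = Arrays.asList(12, 6, 4, 5);

        System.out.println(lexicalLte(asText, "7"));    // prints [12, 6, 4, 5]
        System.out.println(lexicalLte(asText, "12"));   // prints [12]
        System.out.println(numericLte(asNumbers, 7));   // prints [6, 4, 5]
    }
}
```

The first two results match what the question reports; the third is what a numeric mapping would give.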
You can check the mapping of your index with the Get Mapping API.
For the range query to work as intended, you need to explicitly define the index mapping, with numberField mapped to a numeric field datatype. Because the type of an existing field cannot be changed, you need to delete your index and create a new index with a new index mapping:
{
"mappings": {
"properties": {
"numberField": {
"type": "integer"
}
}
}
}

(JAVA, Elasticsearch) How can I get fields from SearchResponse?

I'm wondering how to get fields from the SearchResponse that my query returns.
Below is my query:
{"size":99,"timeout":"10s","query":{"bool":{"filter":[{"bool":{"must":[{"range":{"LOG_GEN_TIME":{"from":"2018-11-01 12:00:01+09:00","to":"2018-11-01 23:59:59+09:00","include_lower":true,"include_upper":true,"boost":1.0}}},{"wrapper":{"query":"eyAiYm9vbCIgOiB7ICJtdXN0IiA6IFsgeyAidGVybSIgOiB7ICJBU1NFVF9JUCIgOiAiMTAuMTExLjI1Mi4xNiIgfSB9LCB7ICJ0ZXJtIiA6IHsgIkFDVElPTl9UWVBFX0NEIiA6ICIyIiB9IH0sIHsgInRlcm0iIDogeyAiRFNUX1BPUlQiIDogIjgwIiB9IH0gXSB9IH0="}}],"adjust_pure_negative":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":["LOG_GEN_TIME","LOG_NO","ASSET_NO"],"excludes":[]},"sort":[{"LOG_GEN_TIME":{"order":"desc"}},{"LOG_NO":{"order":"desc"}}]}
When I run it like this:
SearchResponse searchResponse = request.get();
I get the right result:
{
"took":1071,
"timed_out":false,
"_shards":{
"total":14,
"successful":14,
"skipped":0,
"failed":0
},
"_clusters":{
"total":0,
"successful":0,
"skipped":0
},
"hits":{
"total":2,
"max_score":null,
"hits":[
{
"_index":"log_20181101",
"_type":"SEC",
"_id":"1197132746951492963",
"_score":null,
"_source":{
"ASSET_NO":1,
"LOG_NO":1197132746951492963,
"LOG_GEN_TIME":"2018-11-01 09:46:28+09:00"
},
"sort":[
1541033188000,
1197132746951492963
]
},
{
"_index":"log_20181101",
"_type":"SEC",
"_id":"1197132746951492963",
"_score":null,
"_source":{
"ASSET_NO":2,
"LOG_NO":1197337264704454700,
"LOG_GEN_TIME":"2018-11-01 23:00:06+09:00"
},
"sort":[
1541080806000,
1197337264704454700
]
}
]
}
}
To use this result, I need to map it by field and value.
I think there is a way to get the fields and values through the 'fields' parameter so that they can be used nicely, but I cannot find it.
I hope I can use the result like this:
SearchHit hit = ...
Map<String, SearchHitField> fields = hit.getFields();
String logNo = fields.get("LOG_NO").value();
This seems to be the common way to do it.
Or am I misunderstanding something? Please tell me if there is a better way.
Any comment would be appreciated. Thanks.
It's not clear to me which client you are using to query Elasticsearch. If you are using the Elasticsearch high-level REST client, you can loop through the hits and call hit.getSourceAsMap() to get the fields as key-value pairs.
For your comment:
First, create a POJO class that corresponds to _source (i.e. the index properties, the way the data is stored in Elasticsearch).
Then use hit.getSourceAsString() to get _source in JSON format.
Use the Jackson ObjectMapper to map the JSON to your POJO.
Assuming you created a POJO class AssetLog:
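SearchHit[] searchHits = searchResponse.getHits().getHits();
// ObjectMapper is reusable, so create it once outside the loop
ObjectMapper objectMapper = new ObjectMapper();
for (SearchHit searchHit : searchHits) {
// _source of this hit as a JSON string
String hitJson = searchHit.getSourceAsString();
// readValue throws a checked exception that must be handled or declared
AssetLog source = objectMapper.readValue(hitJson, AssetLog.class);
//Store source to map/array
}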
Hope this helps.

retrieve docs from mongodb collection with field condition

I am trying to query a mongodb collection and retrieve certain documents based on a field value, but also retrieve only a single field per record. I tried the following but am not getting the result I want:
MongoCollection<Document> collection =
database.getCollection("client_data");
//Document document = collection
// .find(new BasicDBObject("sampleUser", "myDb"))
//.projection(Projections.fields(Projections.include("address"),
//Projections.excludeId())).first();
BasicDBObject aQuery = new BasicDBObject();
aQuery.put("clientId",567);
FindIterable<Document> iterDoc = collection.find(aQuery);
The code above retrieves all documents for clientId = 567, but I only want to show the address field.
The commented-out code is what I also tried, but I'm not sure how to combine it with the query.
EDIT:
I am now able to iterate through all the results but would like to parse each document; I tried parsing the document into my class object, but it immediately gives an error:
Unrecognized field "_id" (class model.Client), not marked as ignorable
But _id is the very first field in the document:
Document{{_id=6216a7f64cedfd00011c35a5,
So I tried something else, using the first document instead, but then I don't know how to get the next document:
while(cursor.hasNext()) {
// System.out.println(cursor.next().toJson());
Client client = new Client();
try {
JsonParser jsonParser = new JsonFactory().createParser(cursor.next().toJson());
ObjectMapper mapper = new ObjectMapper();
ObjectNode rootNode = mapper.createObjectNode();
String customerInfo = fi.first().toJson();
JsonNode jobj = mapper.readTree(customerInfo);
// this gives the error// client = mapper.readValue(jsonParser,Client.class);
client.setId(jobj.path("_id").path("$oid").asText());
Please advise.
To address this:
"retrieves all documents for clientid = 567, but I only want to show the address field"
you would execute the following:
collection
.find(Filters.eq("clientId", 567))
.projection(Projections.fields(
Projections.include("address"),
Projections.excludeId())
).first()
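Breaking it down:
.find(Filters.eq("clientId", 567)): applies the predicate 'where clientId = 567'
.projection(Projections.fields(Projections.include("address"), Projections.excludeId())): makes the response include the address field and exclude the _id field
.first(): returns only the first matching document; omit it and iterate over the FindIterable to get every matching document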

MongoDB: Query using $gte and $lte in java

I want to perform a query on a field that is greater than or equal to one value AND less than or equal to another (I'm using Java, by the way), in other words >= and <=. As I understand it, MongoDB has $gte and $lte operators, but I can't find the proper syntax to use them. The field I'm accessing is a top-level field.
I have managed to get this to work:
FindIterable<Document> iterable = db.getCollection("1dag").find(new Document("timestamp", new Document("$gt", 1412204098)));
as well as:
FindIterable<Document> iterable = db.getCollection("1dag").find(new Document("timestamp", new Document("$lt", 1412204098)));
But how do you combine these with each other?
Currently I'm playing around with a statement like this, but it does not work:
FindIterable<Document> iterable5 = db.getCollection("1dag").find(new Document( "timestamp", new Document("$gte", 1412204098).append("timestamp", new Document("$lte",1412204099))));
Any help?
Basically you require a range query like this:
db.getCollection("1dag").find({
"timestamp": {
"$gte": 1412204098,
"$lte": 1412204099
}
})
Since this range query needs two conditions on the same field, you can specify a logical conjunction (AND) by adding both operators to the same operator document using the append() method:
FindIterable<Document> iterable = db.getCollection("1dag").find(
new Document("timestamp", new Document("$gte", 1412204098).append("$lte", 1412204099)));
The constructor new Document(key, value) only gets you a document with one key-value pair. But in this case you need to create a document with more than one. To do this, create an empty document, and then add pairs to it with .append(key, value).
Document timespan = new Document();
timespan.append("$gt", 1412204098);
timespan.append("$lt", 1412204998);
// timespan in JSON:
// { $gt: 1412204098, $lt: 1412204998}
Document condition = new Document("timestamp", timespan);
// condition in JSON:
// { timestamp: { $gt: 1412204098, $lt: 1412204998} }
FindIterable<Document> iterable = db.getCollection("1dag").find(condition);
Or if you really want to do it with a one-liner without temporary variables:
FindIterable<Document> iterable = db.getCollection("1dag").find(
new Document()
.append("timestamp", new Document()
.append("$gt",1412204098)
.append("$lt",1412204998)
)
);
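To show the shape of the filter without needing the MongoDB driver, here is a minimal sketch that builds the same structure from plain LinkedHashMap instances. This is a stand-in only: org.bson.Document is map-like, but the class RangeFilterSketch and the method between are hypothetical names, and the sketch uses $gte and $lte as asked in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: nested maps with the same shape as the range filter document.
final class RangeFilterSketch {

    // Builds { field: { "$gte": lower, "$lte": upper } } as nested ordered maps.
    static Map<String, Object> between(String field, long lower, long upper) {
        // Both operators go into ONE operator map ...
        Map<String, Object> operators = new LinkedHashMap<>();
        operators.put("$gte", lower);
        operators.put("$lte", upper);

        // ... and that map is the single value of the field being queried.
        Map<String, Object> condition = new LinkedHashMap<>();
        condition.put(field, operators);
        return condition;
    }

    public static void main(String[] args) {
        System.out.println(between("timestamp", 1412204098L, 1412204099L));
        // prints {timestamp={$gte=1412204098, $lte=1412204099}}
    }
}
```

The attempt in the question instead nested a second "timestamp" key inside the operator document, which is why it did not work.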

how to retrieve nested document and array values in elasticsearch

I have the following document:
{
"_index" : "Testdb",
"_type" : "artWork",
"_id" : "0",
"_version" : 1,
"found" : true,
"_source":{"uuid":0,"ArtShare":{"TotalArtShares":0,"pricePerShare":0,"ArtworkUuid":12,"AvailableShares":0,"SoldShares":0},"StatusHistoryList":[{"ArtWorkDate":"2015-08-26T13:20:17.725+05:00","ArtworkStatus":"ACTIVE"}]}
}
I want to access/retrieve the value of ArtShare and its attributes, and the values of the array StatusHistoryList.
I am doing it like this:
val get=client.prepareGet("Testdb","artWork",Id.toString())
.setOperationThreaded(false)
.setFields("uuid","ArtShare","StatusHistoryList"
)
.execute()
.actionGet()
if(get.isExists())
{
uuid=get.getField("uuid").getValue.toString().toInt
//how to fetch the whole nested `ArtShare` document and the elements of the `StatusHistoryList` array
}
UPDATE
If I do this:
val get=client.prepareGet("Testdb","artWork",Id.toString())
.setOperationThreaded(false)
.setFields("uuid","ArtShare","StatusHistoryList"
,"_source","ArtShare.TotalArtShares")
.execute()
.actionGet()
if(get.isExists())
{
uuid=get.getField("uuid").getValue.toString().toInt
var totalShares= get.getField("ArtShare.TotalArtShares").getValue.toString().toInt
}
then the following exception is thrown:
org.elasticsearch.ElasticsearchIllegalArgumentException: field [ArtShare] isn't a leaf field
at org.elasticsearch.index.get.ShardGetService.innerGetLoadFromStoredFields(ShardGetService.java:368)
at org.elasticsearch.index.get.ShardGetService.innerGet(ShardGetService.java:210)
at org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:104)
at org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:104)
at org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:44)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$ShardTransportHandler.messageReceived(TransportShardSingleOperationAction.java:297)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$ShardTransportHandler.messageReceived(TransportShardSingleOperationAction.java:280)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:279)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
Please guide me on how to fetch these values.
The problem is that you have listed both "ArtShare" and "ArtShare.TotalArtShares" in the fields array, so it throws an exception because you have already requested the complete ArtShare object.
So list only the fields that you want. If you want specific nested values, there is no need to request the complete parent object.
Try this:
val get=client.prepareGet("Testdb","artWork",Id.toString())
.setOperationThreaded(false)
.setFields("uuid","StatusHistoryList",
"ArtShare.TotalArtShares")
.execute()
.actionGet()
if(get.isExists())
{
uuid=get.getField("uuid").getValue.toString().toInt
var totalShares= get.getField("ArtShare.TotalArtShares").getValue.toString().toInt
}
And if you want the complete "ArtShare" object, then simply write:
val get=client.prepareGet("Testdb","artWork",Id.toString())
.setOperationThreaded(false)
.setFields("uuid","ArtShare","StatusHistoryList"
)
.execute()
.actionGet()
if(get.isExists())
{
uuid=get.getField("uuid").getValue.toString().toInt
//how to fetch `artShare` whole nested document and array elements `StatusHistoryListof`
}
