Spring data elasticsearch bulk index on Percolator - java

Using Spring Data Elasticsearch, I want to do the following (bulk indexing).
Step 1:
PUT surname
{
"settings" : {
"index" : {
"number_of_shards" : 1,
"number_of_replicas" : 0
}
},
"mappings": {
"properties": {
"message": {
"type": "text"
},
"query": {
"type": "percolator"
}
}
}
}
Step 2:
PUT surname/_bulk
{ "index" : { "_index" : "surname", "_type" : "_doc", "_id" : "1" } }
{ "query" : { "match_phrase" : { "message" : "Aal" } }}
{ "index" : { "_index" : "surname", "_type" : "_doc", "_id" : "2" } }
{ "query" : { "match_phrase" : { "message" : "Aalbers" } }}
Step 1 is done with the help of https://stackoverflow.com/a/67724048/4068218
For Step 2 I tried both IndexQuery and UpdateQuery:
Query query = new CriteriaQuery(new Criteria("match_phrase").subCriteria(new Criteria("message").is("Aal")));
UpdateQuery updateQuery = builder(query).build();
elasticsearchOperations.bulkUpdate(ImmutableList.of(updateQuery),IndexCoordinates.of(indexName.get()));
but neither works. If I use UpdateQuery I get: Validation Failed: 1: id is missing;2: script or doc is missing.
If I use IndexQuery I get a malformed query error.
How do I do Step 2 in Spring Data Elasticsearch? Any lead is much appreciated.

The code below worked for me. Sharing it here in case it helps someone:
IndexQuery indexQuery = new IndexQueryBuilder()
.withId("12")
.withSource(String.format("{ \"query\" : { \"match_phrase\" : { \"message\" : \"%s\" } }}", "Aal"))
.build();
elasticsearchOperations.bulkIndex(ImmutableList.of(indexQuery),IndexCoordinates.of(indexName.get()));
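The String.format call above is fine for plain surnames, but a value containing a double quote or a backslash would produce invalid JSON. Below is a minimal sketch of building the percolator source with escaping, using only the JDK. The class and method names (PercolatorSource, escapeJson, matchPhraseSource) are hypothetical and are not part of Spring Data Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spring Data Elasticsearch.
class PercolatorSource {

    // Escapes the characters that would break a JSON string literal.
    static String escapeJson(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // remaining control characters as 4-digit hex escapes
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds the document source for one stored match_phrase query.
    static String matchPhraseSource(String field, String phrase) {
        return String.format(
            "{ \"query\" : { \"match_phrase\" : { \"%s\" : \"%s\" } } }",
            escapeJson(field), escapeJson(phrase));
    }

    public static void main(String[] args) {
        List<String> surnames = Arrays.asList("Aal", "Aalbers", "O\"Neil");
        List<String> sources = new ArrayList<>();
        for (String surname : surnames) {
            sources.add(matchPhraseSource("message", surname));
        }
        // Each string could then be passed to IndexQueryBuilder.withSource(...).
        sources.forEach(System.out::println);
    }
}
```

One IndexQuery per generated source string, collected into a list, would then go to bulkIndex in a single call.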

Related

Elastic search term query not working on a specific field

I'm new to Elasticsearch.
This is how the index looks:
{
"scresults-000001" : {
"aliases" : {
"scresults" : { }
},
"mappings" : {
"properties" : {
"callType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"code" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"data" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"esdtValues" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"gasLimit" : {
"type" : "long"
},
... (more fields omitted)
I'm trying to create a search query in Java that looks like this:
{
"bool" : {
"filter" : [
{
"term" : {
"sender" : {
"value" : "sendervalue",
"boost" : 1.0
}
}
},
{
"term" : {
"data" : {
"value" : "YWRkTGlxdWlkaXR5UHJveHlAMDAwMDAwMDAwMDAwMDAwMDA1MDBlYmQzMDRjMmYzNGE2YjNmNmE1N2MxMzNhYjdiOGM2ZjgxZGM0MDE1NTQ4M0A3ZjE1YjEwODdmMjUwNzQ4QDBjMDU0YjcwNDhlMmY5NTE1ZWE3YWU=",
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
If I run this query I get 0 hits. If I replace the field "data" with another field, it works. I don't understand what's different.
This is how I actually create the query in Java with Spring Boot:
QueryBuilder boolQuery = QueryBuilders.boolQuery()
.filter(QueryBuilders.termQuery("sender", "sendervalue"))
.filter(QueryBuilders.termQuery("data",
"YWRkTGlxdWlkaXR5UHJveHlAMDAwMDAwMDAwMDAwMDAwMDA1MDBlYmQzMDRjMmYzNGE2YjNmNmE1N2MxMzNhYjdiOGM2ZjgxZGM0MDE1NTQ4M0A3ZjE1YjEwODdmMjUwNzQ4QDBjMDU0YjcwNDhlMmY5NTE1ZWE3YWU="));
Query searchQuery = new NativeSearchQueryBuilder()
.withFilter(boolQuery)
.build();
SearchHits<ScResults> articles = elasticsearchTemplate.search(searchQuery, ScResults.class);
Since you're trying to do an exact match on a string with a term query, you need to run it against the data.keyword field, which is not analyzed. The data field is a text field and is therefore analyzed by the standard analyzer: all letters are lowercased and the = sign at the end is stripped off, so the term can never match. (A match query on the data field would match, but then you would no longer be doing exact matching.) You can see this with the _analyze API:
POST _analyze
{
"analyzer": "standard",
"text": "YWRkTGlxdWlkaXR5UHJveHlAMDAwMDAwMDAwMDAwMDAwMDA1MDBlYmQzMDRjMmYzNGE2YjNmNmE1N2MxMzNhYjdiOGM2ZjgxZGM0MDE1NTQ4M0A3ZjE1YjEwODdmMjUwNzQ4QDBjMDU0YjcwNDhlMmY5NTE1ZWE3YWU="
}
Results:
{
"tokens" : [
{
"token" : "ywrktglxdwlkaxr5uhjvehlamdawmdawmdawmdawmdawmda1mdblymqzmdrjmmyznge2yjnmnme1n2mxmznhyjdiogm2zjgxzgm0mde1ntq4m0a3zje1yjewoddmmjuwnzq4qdbjmdu0yjcwndhlmmy5nte1zwe3ywu",
"start_offset" : 0,
"end_offset" : 163,
"type" : "<ALPHANUM>",
"position" : 0
}
]
}
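To make the mismatch concrete, here is a toy model of what happens to a value like this one, using only the JDK. It is not the Lucene standard analyzer: the hypothetical helper toyAnalyze only lowercases the input and splits on anything that is not an ASCII letter or digit, which is enough to reproduce the kind of token shown above. A term query compares the search value with the stored terms as-is, so only the untouched keyword value matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model of term matching; NOT the Lucene standard analyzer.
class TermVsText {

    // Hypothetical stand-in for analysis of a text field:
    // lowercase, then split on anything that is not an ASCII letter or digit.
    static List<String> toyAnalyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // A term query looks for the search value unchanged among the stored terms.
    static boolean termMatches(List<String> storedTerms, String searchValue) {
        return storedTerms.contains(searchValue);
    }

    public static void main(String[] args) {
        String value = "YWRkTGlxdWlkaXR5UHJveHk="; // shortened stand-in value

        List<String> textTerms = toyAnalyze(value);   // models the "data" field
        List<String> keywordTerms = new ArrayList<>();
        keywordTerms.add(value);                      // models "data.keyword"

        System.out.println(termMatches(textTerms, value));    // false
        System.out.println(termMatches(keywordTerms, value)); // true
    }
}
```

In the Java code from the question, this corresponds to passing "data.keyword" instead of "data" to QueryBuilders.termQuery.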

Elastic termsQuery not giving expected result

I have an index where each object has a status field that can take one of several predefined values. I want to fetch all objects whose status is INITIATED, UPDATED, or DELETED (any of them). I built the query in Java with QueryBuilder and NativeSearchQuery and execute it with ElasticsearchOperations; this is what it prints to the console:
{
"bool" : {
"must" : [
{
"terms" : {
"status" : [
"INITIATED",
"UPDATED",
"DELETED"
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
I have data in my index with status 'INITIATED', but the query returns no documents for any of the statuses listed. How can I fix this query?
If you need anything, please let me know.
Update: code added
NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
QueryBuilder singleQb = QueryBuilders.boolQuery().must(QueryBuilders.termsQuery("status", statusList));
Pageable pageable = PageRequest.of(0, 1, Sort.by(Defs.START_TIME).ascending());
FieldSortBuilder sort = SortBuilders.fieldSort(Defs.START_TIME).order(SortOrder.ASC);
nativeSearchQueryBuilder.withQuery(singleQb);
nativeSearchQueryBuilder.withSort(sort);
nativeSearchQueryBuilder.withPageable(pageable);
nativeSearchQueryBuilder.withIndices(Defs.SCHEDULED_MEETING_INDEX);
nativeSearchQueryBuilder.withTypes(Defs.SCHEDULED_MEETING_INDEX);
NativeSearchQuery searchQuery = nativeSearchQueryBuilder.build();
List<ScheduledMeetingEntity> scheduledList=elasticsearchTemplate.queryForList(searchQuery, ScheduledMeetingEntity.class);
Update 2: sample data:
I got this from kibana query on this index:
"hits" : [
{
"_index" : "index_name",
"_type" : "type_name",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createTime" : "2021-03-03T13:09:59.198",
"createTimeInMs" : 1614755399198,
"createdBy" : "user1#domain.com",
"editTime" : "2021-03-03T13:09:59.198",
"editTimeInMs" : 1614755399198,
"editedBy" : "user1#domain.com",
"versionId" : 1,
"id" : "1",
"meetingId" : "47",
"userId" : "129",
"username" : "user1#domain.com",
"recipient" : [
"user1#domain.com"
],
"subject" : "subject",
"body" : "hi there",
"startTime" : "2021-03-04T07:26:00.000",
"endTime" : "2021-03-04T07:30:00.000",
"meetingName" : "name123",
"meetingPlace" : "placeName",
"description" : "sfsafsdafsdf",
"projectName" : "",
"status" : "INITIATED",
"failTry" : 0
}
}
]
Confirm your mapping:
GET /yourIndexName/_mapping
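and check that it is valid.
For a terms query to match exact values such as INITIATED, the status field needs a keyword mapping, for example a keyword sub-field that is queried as status.keyword:
{
"status": {
"type": "text",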
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
ES can automatically do the mapping for you (without you having to do it yourself) when you first push a document. However you probably have finer control if you do the mapping yourself.
Either way, you need to have keyword defined for your status field.
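Assuming the mapping looks like the one above (a text field with a keyword sub-field), the terms query would target status.keyword rather than status, because the analyzed text field typically stores lowercased tokens such as initiated. The sketch below only assembles the query body as a string with the JDK, so that it can be compared with the printed QueryBuilder output. The class and method names are hypothetical and nothing here calls Elasticsearch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that renders a terms query body; it does not call Elasticsearch.
class TermsQueryJson {

    static String termsQuery(String field, List<String> values) {
        String joined = values.stream()
            .map(v -> "\"" + v + "\"")   // assumption: values need no JSON escaping
            .collect(Collectors.joining(", "));
        return "{ \"terms\" : { \"" + field + "\" : [ " + joined + " ] } }";
    }

    public static void main(String[] args) {
        List<String> statusList = Arrays.asList("INITIATED", "UPDATED", "DELETED");
        System.out.println(termsQuery("status.keyword", statusList));
    }
}
```

With the Java API used in the question, the equivalent change would be QueryBuilders.termsQuery("status.keyword", statusList).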
=====================
Alternative Solution: (Case Insensitive)
If you have a field named status and the values you want to search for are INITIATED, UPDATED, or DELETED, you can do it like this:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(createStringSearchQuery());
public QueryBuilder createStringSearchQuery(){
QueryStringQueryBuilder queryBuilder = QueryBuilders.queryStringQuery(" INITIATED OR UPDATED OR DELETED ");
queryBuilder.defaultField("status");
return queryBuilder;
}
Printing the QueryBuilder:
{
"query_string" : {
"query" : "INITIATED OR UPDATED OR DELETED",
"default_field" : "status",
"fields" : [ ],
"type" : "best_fields",
"default_operator" : "or",
"max_determinized_states" : 10000,
"enable_position_increments" : true,
"fuzziness" : "AUTO",
"fuzzy_prefix_length" : 0,
"fuzzy_max_expansions" : 50,
"phrase_slop" : 0,
"escape" : false,
"auto_generate_synonyms_phrase_query" : true,
"fuzzy_transpositions" : true,
"boost" : 1.0
}
}

Spring MongoDB Aggregation Group - Can't get the group query right

I have a document in MongoDB that looks as follows.
{
"_id" : ObjectId("5ceb812b3ec6d22cb94c82ca"),
"key" : "KEYCODE001",
"values" : [
{
"classId" : "CLASS_01",
"objects" : [
{
"code" : "DD0001"
},
{
"code" : "DD0010"
}
]
},
{
"classId" : "CLASS_02",
"objects" : [
{
"code" : "AD0001"
}
]
}
]
}
I would like to get a result like the following.
{
"classId" : "CLASS_01",
"objects" : [
{
"code" : "DD0001"
},
{
"code" : "DD0010"
}
]
}
To get this, I came up with the following aggregation pipeline in Robo 3T, and it works as expected.
[
{
$match:{
'key':'KEYCODE001'
}
},
{
"$unwind":{
"path": "$values",
"preserveNullAndEmptyArrays": true
}
},
{
"$unwind":{
"path": "$values.objects",
"preserveNullAndEmptyArrays": true
}
},
{
$match:{
'values.classId':'CLASS_01'
}
},
{
$project:{
'object':'$values.objects',
'classId':'$values.classId'
}
},
{
$group:{
'_id':'$classId',
'objects':{
$push:'$object'
}
}
},
{
$project:{
'_id':0,
'classId':'$_id',
'objects':'$$objects'
}
}
]
Now, when I try to do the same in a SpringBoot application, I can't get it running. I ended up having the error java.lang.IllegalArgumentException: Invalid reference '$complication'!. Following is what I have done in Java so far.
final Aggregation aggregation = newAggregation(
match(Criteria.where("key").is("KEYCODE001")),
unwind("$values", true),
unwind("$values.objects", true),
match(Criteria.where("classId").is("CLASS_01")),
project().and("$values.classId").as("classId").and("$values.objects").as("object"),
group("classId", "objects").push("$object").as("objects").first("$classId").as("_id"),
project().and("$_id").as("classId").and("$objects").as("objects")
);
What am I doing wrong? Upon research, I found that multiple fields in group does not work or something like that (please refer to this question). So, is what I am currently doing even possible in Spring Boot?
After hours of debugging and trial and error, I found that the following solution works.
final Aggregation aggregation = newAggregation(
match(Criteria.where("key").is("KEYCODE001")),
unwind("values", true),
unwind("values.objects", true),
match(Criteria.where("values.classId").is("CLASS_01")),
project().and("values.classId").as("classId").and("values.objects").as("object"),
group(Fields.from(Fields.field("_id", "classId"))).push("object").as("objects"),
project().and("_id").as("classId").and("objects").as("objects")
);
It all boils down to group(Fields.from(Fields.field("_id", "classId"))).push("object").as("objects"), which introduces an org.springframework.data.mongodb.core.aggregation.Fields object that wraps a list of org.springframework.data.mongodb.core.aggregation.Field objects. A Field encapsulates both the name of the field and its target. This resulted in the following pipeline, which matches the expected one.
[
{
"$match" :{
"key" : "KEYCODE001"
}
},
{
"$unwind" :{
"path" : "$values", "preserveNullAndEmptyArrays" : true
}
},
{
"$unwind" :{
"path" : "$values.objects", "preserveNullAndEmptyArrays" : true
}
},
{
"$match" :{
"values.classId" : "CLASS_01"
}
},
{
"$project" :{
"classId" : "$values.classId", "object" : "$values.objects"
}
},
{
"$group" :{
"_id" : "$classId",
"objects" :{
"$push" : "$object"
}
}
},
{
"$project" :{
"classId" : "$_id", "objects" : 1
}
}
]
Additionally, I figured out that there is no need to use the $ sign everywhere.
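For readers less familiar with what the $unwind, $match and $group/$push stages compute, here is an in-memory analogy using Java streams on the sample document. It does not use Spring Data MongoDB, the class and method names are hypothetical, and it does not model preserveNullAndEmptyArrays.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory analogy of the pipeline; it does not use Spring Data MongoDB.
class GroupPushSketch {

    // One element of the "values" array: a classId and its object codes.
    static final class ValueEntry {
        final String classId;
        final List<String> codes;
        ValueEntry(String classId, List<String> codes) {
            this.classId = classId;
            this.codes = codes;
        }
    }

    // One (classId, code) pair, i.e. one document after both $unwind stages.
    static final class Unwound {
        final String classId;
        final String code;
        Unwound(String classId, String code) {
            this.classId = classId;
            this.code = code;
        }
    }

    static Map<String, List<String>> objectsByClass(List<ValueEntry> values, String wantedClassId) {
        return values.stream()
            // $unwind: one row per (classId, code)
            .flatMap(v -> v.codes.stream().map(code -> new Unwound(v.classId, code)))
            // $match on values.classId
            .filter(u -> u.classId.equals(wantedClassId))
            // $group by classId with $push of each object
            .collect(Collectors.groupingBy(
                u -> u.classId,
                LinkedHashMap::new,
                Collectors.mapping(u -> u.code, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<ValueEntry> values = Arrays.asList(
            new ValueEntry("CLASS_01", Arrays.asList("DD0001", "DD0010")),
            new ValueEntry("CLASS_02", Arrays.asList("AD0001")));
        System.out.println(objectsByClass(values, "CLASS_01"));
        // prints {CLASS_01=[DD0001, DD0010]}
    }
}
```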

Why my java elasticsearch request translation is not valid?

I'm currently making an Elasticsearch request to retrieve some data. I have succeeded in writing the right request in JSON format. After that I tried to translate it into Java. But when I print the request that the Java code sends to ES, the two requests are not the same, and I can't work out how to make them match.
Here is the JSON request that returns the correct data:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{ "terms": { "accountId": ["107276450147"] } },
{"range" : {
"date" : {
"lt" : "1480612801000",
"gte" : "1478020801000"
} }
}]
}
}
}
},
"size" : 0,
"aggregations" : {
"field-aggregation" : {
"terms" : {
"field" : "publicationId",
"size" : 2147483647
},
"aggregations" : {
"top-aggregation" : {
"top_hits" : {
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
}
}
}
}
}
}
}
And the Java-generated request, which does not return the correct data:
{
"from" : 0,
"size" : 10,
"aggregations" : {
"field-aggregation" : {
"terms" : {
"field" : "publicationId",
"size" : 2147483647
},
"aggregations" : {
"top-aggregation" : {
"top_hits" : {
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
}
}
}
}
}
}
}
And finally the Java code that generates the wrong JSON request:
TopHitsBuilder top = AggregationBuilders.topHits("top-aggregation")
.setFetchSource(true)
.setSize(1);
TermsBuilder field = AggregationBuilders.terms("field-aggregation")
.field(aggFieldName)
.size(Integer.MAX_VALUE)
.subAggregation(top);
BoolFilterBuilder filterBuilder = FilterBuilders.boolFilter()
.must(FilterBuilders.termsFilter("accountId", Collections.singletonList("107276450147")))
.must(FilterBuilders.rangeFilter("date").gte(1478020801000L).lte(1480612801000L));
NativeSearchQueryBuilder query = new NativeSearchQueryBuilder()
.withQuery(QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), filterBuilder))
.withIndices("metric")
.withTypes(type)
.addAggregation(field);
return template.query(query.build());
First of all, I need to remove the "size": 10 and the "from" that the Java code generates. After that, I have to add the filters. I did this, but they are never added.
Can you tell me what is wrong in my Java code and why the filters do not appear in the final JSON?
Thanks guys.
Thanks guys. I finally solved the problem. The Java code was sending the correct query, but I was looking in the wrong place in the ES Java API. Nevertheless, I set the search type to COUNT so that ES does not send back the non-aggregated data, which is useless to me.

How to generate below query using elasticsearch java api

I want to generate a query similar to the one below using the Elasticsearch Java API. I am trying to apply filters at the aggregation level.
{
"query":{
"filtered":{
"filter":{ "terms":{ "family_name":"Brown" } } //filter_1
}
},
"aggs":{
"young_age":{
"filter":{
"terms" : {
"gender" : "male" //filter_2
}
},
"aggs":{
"age":{
"terms":{
"field":"age"
}
}
}
}
}
}
Please find below the sample code I have so far:
TermFilterBuilder family_filter_1 = FilterBuilders.termFilter("family_name","Brown");
FilteredQueryBuilder qbuilder =QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(),family_filter_1);
SearchRequestBuilder search = client.prepareSearch("test_index")
.setTypes("test_type")
.setSearchType(SearchType.COUNT)
.setQuery(qbuilder);
search.addAggregation(terms("age").field("age")
.size(0)// Size 0 returns all the "group by keys"
.order(Terms.Order.count(true))); // to sort the output
System.out.println(""+search);
and this is the output I am getting. Please suggest how to add filter_2.
{
"query" : {
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"term" : {
"family_name" : "Brown"
}
}
}
},
"aggregations" : {
"age" : {
"terms" : {
"field" : "age",
"size" : 0,
"order" : {
"_count" : "asc"
}
}
}
}
}
Thanks in advance.
You can do it like this:
// same as your code
TermFilterBuilder family_filter_1 = ...;
FilteredQueryBuilder qbuilder = ...;
SearchRequestBuilder search = ...;
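// build the range filter (ES 1.x filter API, matching the FilterBuilders used in the question);
// for filter_2 from the question, FilterBuilders.termFilter("gender", "male") could be used instead
RangeFilterBuilder ageRange = FilterBuilders.rangeFilter("age")
.from(18).to(40).includeLower(false).includeUpper(false);
// build the terms sub-aggregation
TermsBuilder age = AggregationBuilders.terms("age")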
.field("age")
.size(0)
.order(Terms.Order.count(true));
// build the filter top-aggregation
FilterAggregationBuilder youngAge = AggregationBuilders
.filter("young_age")
.filter(ageRange)
.subAggregation(age);
search.addAggregation(youngAge);
