I have a problem with the Elasticsearch Java API, version 5.1.2.
I will describe the code pasted below. I need to optimize the search mechanism by limiting inner_hits to the object id only. I used InnerHitBuilder with .setFetchSourceContext(FetchSourceContext.DO_NOT_FETCH_SOURCE) and .addDocValueField("item.id"). The generated query has an error: there is an "ignore_unmapped" attribute inside the "inner_hits" node.
..."inner_hits": {
"name": "itemTerms",
"ignore_unmapped": false,
"from": 0,
"size": 2147483647,
"version": false,
"explain": false,
"track_scores": false,
"_source": false,
"docvalue_fields": ["item.id"]
}...
Executing such a query results in an error:
{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "[inner_hits] unknown field [ignore_unmapped], parser not found"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "[inner_hits] unknown field [ignore_unmapped], parser not found"
    },
    "status": 400
}
When I manually remove that attribute from the query, everything runs smoothly.
protected BoolQueryBuilder itemTermQuery(FileTerms terms, boolean withInners) {
    BoolQueryBuilder termsQuery = QueryBuilders.boolQuery();
    for (String term : FileTerms.terms()) {
        if (terms.term(term).isEmpty())
            continue;
        Set<String> fns = terms.term(term).stream()
                .map(x -> x.getTerm())
                .filter(y -> !y.isEmpty())
                .collect(Collectors.toSet());
        if (!fns.isEmpty())
            termsQuery = termsQuery.must(
                    QueryBuilders.termsQuery("item.terms." + term + ".term", fns));
    }
    QueryBuilder query = terms.notEmpty() ? termsQuery : QueryBuilders.matchAllQuery();
    TermsQueryBuilder discontinuedQuery = QueryBuilders.termsQuery(
            "item.terms." + FileTerms.Terms.USAGE_IS + ".term",
            new FileTerm("Discontinued", "", "", "", "").getTerm());
    FunctionScoreQueryBuilder.FilterFunctionBuilder[] functionBuilders = {
            new FunctionScoreQueryBuilder.FilterFunctionBuilder(query,
                    ScoreFunctionBuilders.weightFactorFunction(1)),
            new FunctionScoreQueryBuilder.FilterFunctionBuilder(discontinuedQuery,
                    ScoreFunctionBuilders.weightFactorFunction(-1000))
    };
    FunctionScoreQueryBuilder functionScoreQuery = functionScoreQuery(functionBuilders);
    NestedQueryBuilder nested = QueryBuilders.nestedQuery("item", functionScoreQuery.query(), ScoreMode.None);
    if (withInners)
        nested = nested.innerHit(new InnerHitBuilder()
                .setFetchSourceContext(FetchSourceContext.DO_NOT_FETCH_SOURCE)
                .addDocValueField("item.id")
                .setSize(Integer.MAX_VALUE)
                .setName("itemTerms"));
    return QueryBuilders.boolQuery().must(nested);
}
How can I build the query without that unnecessary attribute inside the "inner_hits" node?
EDIT:
I use the 5.1.2 library and a 5.1.2 Elasticsearch server.
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>5.1.2</version>
</dependency>
"version": {
    "number": "5.1.2",
    "build_hash": "c8c4c16",
    "build_date": "2017-01-11T20:18:39.146Z",
    "build_snapshot": false,
    "lucene_version": "6.3.0"
},
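If upgrading past 5.1.2 is not an option, one possible workaround (my assumption, not an officially documented fix) is to serialize the query to JSON, strip the offending attribute with plain string surgery, and submit the cleaned JSON as a raw request body. A minimal JDK-only sketch:

```java
public class IgnoreUnmappedStripper {

    // Removes the `"ignore_unmapped": true|false` attribute (and a trailing
    // comma, if any) from a serialized query. Naive string surgery: it assumes
    // "ignore_unmapped" never appears inside a quoted string value.
    static String strip(String json) {
        return json.replaceAll("\"ignore_unmapped\"\\s*:\\s*(?:true|false)\\s*,?\\s*", "");
    }

    public static void main(String[] args) {
        String innerHits = "{\"name\": \"itemTerms\", \"ignore_unmapped\": false, \"from\": 0}";
        // prints: {"name": "itemTerms", "from": 0}
        System.out.println(strip(innerHits));
    }
}
```

The cleaned string could then be sent through a low-level/raw request rather than the query builders; a real implementation would want a JSON parser instead of a regex.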
Suppose I have a JSON object which needs to be updated in MongoDB, like
{
"_id": 12345,
"Attribute": { "Property1": "X", "Property2": true, "Property3": 123 }
}
Suppose I have a record in MongoDB
{
"_id": 12345,
"Attribute1": "abc",
"Attribute2": "xyz",
"Attribute": { "Property4": "X", "Property2": false, "Property3": 456 }
}
The result should update the Attribute JSON, updating only the fields that changed and keeping the rest of the values intact.
The resultant record in the db should look like this:
{
"_id": 12345,
"Attribute1": "abc",
"Attribute2": "xyz",
"Attribute": { "Property4": "X", "Property1": "X", "Property2": true, "Property3": 123 }
}
I really don't know how to achieve this in a single pass in MongoDB using Java Spring Boot. Can anyone please help? Any help is appreciated.
You can use the Update class from the org.springframework.data.mongodb.core.query package. You can write a code snippet like the one below.
Update updateAttribute = new Update();
// use set() with dot notation to update a single embedded field;
// push() is for appending to arrays and would fail on an embedded document
updateAttribute.set("Attribute.Property1", "X");
mongoOperations.updateFirst(new Query(Criteria.where("_id").is(12345)), updateAttribute, "yourCollection");
Also, you need to inject MongoOperations (from the org.springframework.data.mongodb.core package) via constructor injection.
You can do it in two ways.
For MongoDB 4.4+ you can use pipelined updates, which allow the use of aggregation operators in the update body; specifically we'll want to use $mergeObjects, like so:
db.collection.update(
    { _id: 12345 },
    [
        {
            $set: {
                Attribute: {
                    $mergeObjects: [
                        "$Attribute",
                        {
                            "Property1": "X",
                            "Property2": true,
                            "Property3": 123
                        }
                    ]
                }
            }
        }
    ])
Mongo Playground
For older MongoDB versions you'll have to construct the update body in code. Here is a JavaScript example (it might be slightly more involved in Spring):
const input = {
    '_id': 12345,
    'Attribute': { 'Property1': 'X', 'Property2': true, 'Property3': 123 },
};
const updateBody = {};
Object.keys(input.Attribute).forEach((key) => {
    const updateKey = `Attribute.${key}`;
    updateBody[updateKey] = input.Attribute[key];
});
db.collection.updateOne({ _id: 12345 }, { $set: updateBody });
By using dot notation in the update body we ensure we don't overwrite existing fields in Attribute.
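The same dot-notation construction can be sketched in Java (a hypothetical helper; the resulting map would then be fed into Spring's Update.set calls or the driver's update document):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DotNotationUpdate {

    // Flattens {"Property1": "X", ...} into {"Attribute.Property1": "X", ...}
    // so that $set only touches the listed embedded fields.
    static Map<String, Object> updateBody(String prefix, Map<String, Object> nested) {
        Map<String, Object> body = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : nested.entrySet()) {
            body.put(prefix + "." + e.getKey(), e.getValue());
        }
        return body;
    }

    public static void main(String[] args) {
        Map<String, Object> attr = new LinkedHashMap<>();
        attr.put("Property1", "X");
        attr.put("Property2", true);
        attr.put("Property3", 123);
        // prints: {Attribute.Property1=X, Attribute.Property2=true, Attribute.Property3=123}
        System.out.println(updateBody("Attribute", attr));
    }
}
```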
I achieved this using the following query:
db.collection.update(
    { _id: 12345 },
    {
        $set: {
            "Property1": "X",
            "Property2": true,
            "Property3": 123
        }
    },
    { upsert: true }
)
I am trying out DynamoDB locally and have the following table:
"Table": {
"AttributeDefinitions": [
{
"AttributeName": "hashKey",
"AttributeType": "S"
},
{
"AttributeName": "sortKey",
"AttributeType": "S"
},
{
"AttributeName": "full_json",
"AttributeType": "S"
}
],
"TableName": "local",
"KeySchema": [
{
"AttributeName": "hashKey",
"KeyType": "HASH"
},
{
"AttributeName": "sortKey",
"KeyType": "RANGE"
}
],
"TableStatus": "ACTIVE",
"CreationDateTime": "2021-10-01T15:18:04.413000+02:00",
"ProvisionedThroughput": {
"LastIncreaseDateTime": "1970-01-01T01:00:00+01:00",
"LastDecreaseDateTime": "1970-01-01T01:00:00+01:00",
"NumberOfDecreasesToday": 0,
"ReadCapacityUnits": 5,
"WriteCapacityUnits": 1
},
"TableSizeBytes": 1066813,
"ItemCount": 23,
"TableArn": "arn:aws:dynamodb:ddblocal:000000000000:table/local",
"GlobalSecondaryIndexes": [
{
"IndexName": "sortKeyIndex",
"KeySchema": [
{
"AttributeName": "sortKey",
"KeyType": "HASH"
}
],
"Projection": {
"ProjectionType": "ALL"
},
"IndexStatus": "ACTIVE",
"ProvisionedThroughput": {
"ReadCapacityUnits": 10,
"WriteCapacityUnits": 1
},
"IndexSizeBytes": 1066813,
"ItemCount": 23,
"IndexArn": "arn:aws:dynamodb:ddblocal:000000000000:table/local/index/sortKeyIndex"
}
]
}
I want to query it with Java like this:
Index index = table.getIndex("sortKeyIndex");
ItemCollection<QueryOutcome> items2 = null;
QuerySpec querySpec = new QuerySpec();
querySpec.withKeyConditionExpression("sortKey > :end_date")
.withValueMap(new ValueMap().withString(":end_date","2021-06-30T07:49:22.000Z"));
items2 = index.query(querySpec);
But it throws an exception with "Query Key Condition not supported". I don't understand this, because in the docs the "<" operator is described as a regular operation. Can anybody help me?
DynamoDB's Query() requires a key condition that includes an equality check on the hash/partition key.
You must provide the name of the partition key attribute and a single
value for that attribute. Query returns all items with that partition
key value. Optionally, you can provide a sort key attribute and use a
comparison operator to refine the search results.
In other words, the only time you can really use Query() is when you have a composite primary key (hash + sort).
Without a sort key specified as part of the key for the table/GSI, Query() acts just like GetItem(), returning a single record with the given hash key.
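One way to picture the constraint: a HASH key behaves like a plain hash-map lookup (equality only), while a RANGE key behaves like a sorted map you can slice with > or <. Since sortKeyIndex declares sortKey as its HASH key, only the equality form is available there. A stdlib-only Java analogy (the data values are hypothetical):

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class KeyConditionAnalogy {

    // HASH-key access: equality lookup only, like Query against a hash-only GSI
    static String byHashKey(Map<String, String> index, String key) {
        return index.get(key);
    }

    // RANGE-key access: a "> value" slice needs a sorted structure,
    // which is what a RANGE (sort) key gives you
    static Collection<String> greaterThan(TreeMap<String, String> index, String key) {
        return index.tailMap(key, false).values();
    }

    public static void main(String[] args) {
        Map<String, String> hashIndex = new HashMap<>();
        hashIndex.put("2021-06-30T07:49:22.000Z", "item-1");
        System.out.println(byHashKey(hashIndex, "2021-06-30T07:49:22.000Z")); // item-1

        TreeMap<String, String> rangeIndex = new TreeMap<>();
        rangeIndex.put("2021-05-01T00:00:00.000Z", "item-0");
        rangeIndex.put("2021-07-01T00:00:00.000Z", "item-2");
        System.out.println(greaterThan(rangeIndex, "2021-06-30T07:49:22.000Z")); // [item-2]
    }
}
```

To range-query by date here you would either add a RANGE key to the index or fall back to a Scan with a filter expression.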
I have indexed sample documents in Elasticsearch and am trying to search using a fuzzy query, but I am not getting any results when I search using the Java fuzzy query API.
Please find my mapping script below:
PUT productcatalog
{
    "settings": {
        "analysis": {
            "analyzer": {
                "attr_analyzer": {
                    "type": "custom",
                    "tokenizer": "letter",
                    "char_filter": [
                        "html_strip"
                    ],
                    "filter": ["lowercase", "asciifolding", "stemmer_minimal_english"]
                }
            },
            "filter": {
                "stemmer_minimal_english": {
                    "type": "stemmer",
                    "name": "minimal_english"
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "values": {
                    "type": "text",
                    "analyzer": "attr_analyzer"
                },
                "catalog_type": {
                    "type": "text"
                },
                "catalog_id": {
                    "type": "long"
                }
            }
        }
    }
}
Please find my sample data.
PUT productcatalog/doc/1
{
"catalog_id" : "343",
"catalog_type" : "series",
"values" : "Activa Rooftop, valves, VG3000, VG3000FS, butterfly, ball"
}
PUT productcatalog/doc/2
{
"catalog_id" : "12717",
"catalog_type" : "product",
"values" : "Activa Rooftop, valves"
}
Please find my search script :
GET productcatalog/_search
{
    "query": {
        "match": {
            "values": {
                "query": " activa rooftop VG3000",
                "operator": "and",
                "boost": 1.0,
                "fuzziness": 2,
                "prefix_length": 0,
                "max_expansions": 100
            }
        }
    }
}
I am getting the below results for the above query:
{
    "took": 239,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.970927,
        "hits": [
            {
                "_index": "productcatalog",
                "_type": "doc",
                "_id": "1",
                "_score": 0.970927,
                "_source": {
                    "catalog_id": "343",
                    "catalog_type": "series",
                    "values": "Activa Rooftop, valves, VG3000, VG3000FS, butterfly, ball"
                }
            }
        ]
    }
}
But if I use the Java API below for the same fuzzy search, I am not getting any results out of it.
Please find my Java API query for the fuzzy search below:
QueryBuilder qb = QueryBuilders.boolQuery()
.must(QueryBuilders.fuzzyQuery("values", keyword).boost(1.0f).prefixLength(0).maxExpansions(100));
Update 1
I have tried with the below query
QueryBuilder qb = QueryBuilders.matchQuery(QueryBuilders.fuzzyQuery("values", keyword).boost(1.0f).prefixLength(0).maxExpansions(100));
But I am not able to pass QueryBuilders inside matchQuery. I get this suggestion while writing the query: The method matchQuery(String, Object) in the type QueryBuilders is not applicable for the arguments (FuzzyQueryBuilder).
The mentioned Java query is not a match query; it's a bool query with a must clause. You should use matchQuery instead of boolQuery().must(QueryBuilders.fuzzyQuery()).
Update 1:
fuzzy query is a term-level query, while match query is a full-text query.
Also don't forget that the match query's default operator is or; you should change it to and, like in your DSL query.
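For intuition about what "fuzziness": 2 tolerates: it allows up to two single-character edits (Levenshtein distance) per term. A quick stdlib sketch of plain edit distance (the sample terms echo the data above; this is an illustration, not the exact Lucene automaton):

```java
public class EditDistance {

    // Classic dynamic-programming Levenshtein distance:
    // d[i][j] = edits needed to turn a[0..i) into b[0..j)
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // "actva" is one deletion away from "activa": within fuzziness 2
        System.out.println(distance("activa", "actva"));   // 1
        // "vg3000fs" is two insertions away from "vg3000": still within 2
        System.out.println(distance("vg3000", "vg3000fs")); // 2
    }
}
```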
I am new to programming and I want to get the sum of power used in a month from data stored in Elasticsearch. I've used Sense and got the value, but I am still finding it hard with the Java API in Scala. This is what I did:
POST /myIndext/myType/_search?search_type=dfs_query_then_fetch
{
    "aggs": {
        "duration": {
            "date_histogram": {
                "field": "Day",
                "interval": "month",
                "format": "yyyy-MM-dd"
            },
            "aggs": {
                "Power_total": {
                    "sum": {
                        "field": "myField"
                    }
                }
            }
        }
    }
}
The result was:
{
    "aggregations": {
        "duration": {
            "buckets": [
                {
                    "key_as_string": "2017-01-01",
                    "key": 1480550400000,
                    "doc_count": 619,
                    "myField": {
                        "value": 5218.066633789334
                    }
                }
...
Then the Scala code is this:
val matchquery = QueryBuilders.matchQuery("ID", configurate)
val queryK = QueryBuilders.matchQuery("ID", configurate)
val filterA = QueryBuilders.rangeQuery("Day").gte("2017-01-02T00:00:05.383+0100").lte("2017-01-13T00:00:05.383+0100")
val query = QueryBuilders.filteredQuery(queryK, filterA)
val agg = AggregationBuilders.dateHistogram("duration")
  .field("Day")
  .interval(DateHistogramInterval.MONTH)
  .minDocCount(0)
  .extendedBounds(new DateTime("2017-01-01T00:00:05.383+0100"), new DateTime("2017-01-13T00:00:05.383+0100"))
  .subAggregation(AggregationBuilders.sum("power_total").field("myField"))
val result: SearchResponse = client
  .prepareSearch("myIndex")
  .setTypes("myType")
  .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
  .setQuery(query)
  .addAggregation(agg)
  .addSort("Day", SortOrder.DESC)
  .setSize(815)
  .addField("myField")
  .execute()
  .actionGet()
val results = result.getHits.getHits
println("Current results: " + results.length)
for (hit <- results) {
  println("------------------------------")
  val response = hit.getSource
  println(response)
}
client.close()
The result was:
current result = 0
Please let me know why I am not getting a value for "myField" like I did using Sense.
I have tried it several times and still get the same errors; could it be that I am not parsing the query response the right way?
Everything was correct; the only pitfall was that I was querying a date-time not stored in my database. So instead of "2017-01-01", I was inserting "2017-01-02".
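Under the hood, the date_histogram with a sum sub-aggregation is just a group-by-month sum. A stdlib-only Java sketch of the same computation (the sample readings below are invented for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MonthlyPowerSum {

    // Groups ISO dates (yyyy-MM-dd) by their yyyy-MM prefix and sums the values,
    // mirroring a date_histogram (interval: month) with a sum sub-aggregation.
    static Map<String, Double> sumByMonth(String[] days, double[] power) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (int i = 0; i < days.length; i++) {
            totals.merge(days[i].substring(0, 7), power[i], Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] days = {"2017-01-02", "2017-01-13", "2017-02-01"};
        double[] power = {2000.5, 3217.5, 100.0};
        // prints: {2017-01=5218.0, 2017-02=100.0}
        System.out.println(sumByMonth(days, power));
    }
}
```

In the real response these per-month totals live under result.getAggregations() rather than in the hits, which is why iterating over getHits alone never shows them.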
Intention:
An Elasticsearch Java MoreLikeThis query that does exactly what the raw more_like_this filtered query below does via the /_search REST endpoint.
GET /index/type/_search
{
    "query": {
        "filtered": {
            "query": {
                "more_like_this": {
                    "fields": [
                        "title",
                        "body",
                        "description",
                        "organisations",
                        "locations"
                    ],
                    "min_term_freq": 2,
                    "max_query_terms": 25,
                    "ids": [
                        "http://xxx/doc/doc"
                    ]
                }
            },
            "filter": {
                "range": {
                    "datePublished": {
                        "gte": "2016-01-01T12:30:00+01:00"
                    }
                }
            }
        }
    },
    "fields": [
        "title",
        "description",
        "datePublished"
    ]
}
And this is my Java implementation of the above:
FilteredQueryBuilder queryBuilder = new FilteredQueryBuilder(
        QueryBuilders.matchAllQuery(),
        FilterBuilders.rangeFilter("datePublished").gte("2016-01-01T12:30:00+01:00"));
SearchSourceBuilder query = SearchSourceBuilder.searchSource().query(queryBuilder);
return client.prepareMoreLikeThis("index", "type", "http://xxx/doc/doc")
        .setField("title", "description", "body", "organisations", "locations")
        .setMinTermFreq(2)
        .maxQueryTerms(25)
        .setSearchSource(query);
However, the results differ greatly from what the more_like_this REST endpoint returns. I am getting matches on about 4/5ths of all the documents in the index, as if none of the filters were being applied.
Targeting ES v1.4.2 and v1.6.2.
Any advice please? Thanks.
I got the desired results with QueryBuilders.moreLikeThisQuery(). Inspiration came from this post here.
FilterBuilder filterBuilder = FilterBuilders.rangeFilter("datePublished")
        .gte("2016-01-01T12:30:00+01:00")
        .includeLower(false).includeUpper(false);
MoreLikeThisQueryBuilder mltQueryBuilder = QueryBuilders
        .moreLikeThisQuery("title", "description", "body", "organisations", "locations")
        .minTermFreq(2)
        .maxQueryTerms(25)
        .ids("http://xxx/doc/doc");
SearchRequestBuilder searchRequestBuilder = client.prepareSearch("index");
searchRequestBuilder.setTypes("type");
searchRequestBuilder.addFields("title", "description", "datePublished");
searchRequestBuilder.setQuery(mltQueryBuilder).setPostFilter(filterBuilder);
searchRequestBuilder.execute().actionGet();
Notes:
QueryBuilders seems to be the way forward in terms of compatibility with ES v2.0 and beyond.
MoreLikeThisRequestBuilder is deprecated in ES v1.6+ and removed in 2.0.