How to stop Elasticsearch from highlighting synonyms? - java

I only want to highlight the words I actually search for in the query, not their synonyms. However, I still want Elasticsearch to return documents that match through a synonym. Here is an example.
PUT /my_test_index/
{
"settings": {
"analysis": {
"filter": {
"native_synonym": {
"type": "synonym",
"ignore_case": true,
"expand": true,
"synonyms": [
"apple,fruit"
]
}
},
"analyzer": {
"test_analyzer": {
"tokenizer": "whitespace",
"filter": [
"native_synonym"
]
}
}
}
},
"mappings": {
"properties": {
"desc": {
"type": "text",
"analyzer": "test_analyzer"
}
}
}
}
POST /my_test_index/_doc
{
"desc": "apple"
}
POST /my_test_index/_doc
{
"desc": "fruit"
}
GET /my_test_index/_search
{
"query": {
"match": {
"desc": "apple"
}
},
"highlight": {
"fields": {
"desc": {}
}
}
}
However, Elasticsearch highlights both fruit and apple, while I only want apple to be highlighted.
Does anyone know how to solve this? Thanks in advance :)
"hits": [
{
"_index": "my_test_index",
"_type": "_doc",
"_id": "RMyZrXAB7JsJEwsbVF33",
"_score": 0.29171452,
"_source": {
"desc": "apple"
},
"highlight": {
"desc": [
"<em>apple</em>"
]
}
},
{
"_index": "my_test_index",
"_type": "_doc",
"_id": "RcyarXAB7JsJEwsboF2V",
"_score": 0.29171452,
"_source": {
"desc": "fruit"
},
"highlight": {
"desc": [
"<em>fruit</em>"
]
}
}
]

You can add a highlight query that behaves differently from your actual search query. All you need then is a sub-field indexed without the synonyms (desc.raw below, analyzed with the plain whitespace analyzer), and you should be able to get what you want:
PUT /my_test_index/
{
"settings": {
"analysis": {
"filter": {
"native_synonym": {
"type": "synonym",
"ignore_case": true,
"expand": true,
"synonyms": [
"apple,fruit"
]
}
},
"analyzer": {
"test_analyzer": {
"tokenizer": "whitespace",
"filter": [
"native_synonym"
]
}
}
}
},
"mappings": {
"properties": {
"desc": {
"type": "text",
"analyzer": "test_analyzer",
"fields": {
"raw": {
"type": "text",
"analyzer": "whitespace"
}
}
}
}
}
}
GET /my_test_index/_search
{
"query": {
"match": {
"desc": "apple"
}
},
"highlight": {
"fields": {
"desc.raw": {
"highlight_query": {
"match": {
"desc.raw": "apple"
}
}
}
}
}
}
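
A minimal sketch of how the request body above could be assembled in code, written in Python for brevity. The helper name build_search_body and its parameters are hypothetical; only the JSON structure comes from the answer, and actually sending the request is left out.

```python
import json


def build_search_body(field, raw_subfield, text):
    """Search on the synonym-expanded field, highlight on the raw sub-field."""
    raw_field = f"{field}.{raw_subfield}"
    return {
        # The main query still runs against the synonym-aware field,
        # so documents that only contain a synonym are returned as hits.
        "query": {"match": {field: text}},
        "highlight": {
            "fields": {
                raw_field: {
                    # The highlighter uses this query instead of the main one,
                    # so only the literal search terms get marked up.
                    "highlight_query": {"match": {raw_field: text}}
                }
            }
        },
    }


body = build_search_body("desc", "raw", "apple")
print(json.dumps(body, indent=2))
```

With this body the highlight snippets come back under the desc.raw key, and a hit that only contains "fruit" should carry no highlight entry for it.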

Related

Search documents with property (array) does not contain any elements from other array

I have a problem. I have a document:
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.10536051,
"hits": [
{
"_index": ...,
"_type": "_doc",
"_id": ...,
"_score": 0.10536051,
"_source": {
...
"testProperty": ["asd-asd", "sdf-sdf"]
}
}
]
}
}
I need to build a query to find documents where testProperty doesn't contain any element from the array I give.
I tried something like
{
"query":{
"bool":{
"must": {
...
},
"must_not":[
...
{
"terms": {
"testProperty": [
"qwe-qwe",
"asd-asd"
]
}
}
]
}
}
}
and it doesn't work. Do you have any idea how to do this?
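Adding a working example. A must_not clause with a terms query excludes every document that holds at least one of the listed values. The terms query is not analyzed, so the field has to be indexed as exact values (keyword, as below). If testProperty is an analyzed text field, a value such as "asd-asd" is most likely split into separate tokens at index time, so the terms lookup never matches, which could be why the original query appeared not to work.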
Index Mapping:
{
"mappings": {
"properties": {
"testProperty": {
"type": "keyword"
}
}
}
}
Index Data:
{
"testProperty": "sdf-sdf"
}
{
"testProperty": "asd-asd"
}
Search Query:
{
"query": {
"bool": {
"must_not": {
"terms": {
"testProperty": [
"qwe-qwe",
"asd-asd"
]
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "66195355",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"testProperty": "sdf-sdf"
}
}
]
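
The exclusion logic can be mimicked in plain Python, assuming exact, unanalyzed values as with a keyword field. This only illustrates the semantics of must_not with terms; it is not how Elasticsearch evaluates the query, and must_not_terms is a hypothetical helper.

```python
def must_not_terms(docs, field, excluded):
    """Keep documents whose field shares no value with `excluded`."""
    excluded = set(excluded)
    kept = []
    for doc in docs:
        value = doc.get(field, [])
        # A field may hold a single value or an array of values.
        values = value if isinstance(value, list) else [value]
        if excluded.isdisjoint(values):
            kept.append(doc)
    return kept


docs = [
    {"testProperty": "sdf-sdf"},
    {"testProperty": "asd-asd"},
    {"testProperty": ["asd-asd", "sdf-sdf"]},
]
result = must_not_terms(docs, "testProperty", ["qwe-qwe", "asd-asd"])
print(result)
```

Only the first document survives: the other two each contain "asd-asd", and one excluded value is enough to drop a document, even when it sits in an array next to allowed values.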

How to use the terms query in Elasticsearch 7.x in this case

The Elasticsearch version is 7.x.
Here is some nested data:
data1:
[{name:"tom"},{name:"jack"}]
data2:
[{name:"tom"},{name:"rose"}]
data3:
[{name:"tom"},{name:"rose3"}]
...
dataN:
[{name:"tom"},{name:"roseN"}]
When I use the terms query, I only want to match tom and jack; documents containing rose ... roseN should not be included.
query:{
terms:{["tom","jack"]}
}
This query does not work.
Adding a working example
Index Data:
PUT /65838516/_doc/1
{
"names": [
{
"name": "tom"
},
{
"name": "jack"
}
]
}
PUT /65838516/_doc/2
{
"names": [
{
"name": "tom"
},
{
"name": "rose"
}
]
}
Search Query:
{
"query": {
"bool": {
"must": {
"terms": {
"names.name": [
"tom",
"jack"
]
}
},
"must_not": {
"match": {
"names.name": "rose"
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65838516",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"names": [
{
"name": "tom"
},
{
"name": "jack"
}
]
}
}
]

Can we sort the objects inside a single document when it holds a list of nested objects in Elasticsearch, in addition to sorting the documents by a nested field?

This is my query. It does the nested sort, but I also want it to sort the entries of the item_numbers array inside each document, all in a single Elasticsearch query.
{
"query": {
"nested": {
"query": {
"bool": {
"must": [{
"match": {
"item_numbers.type": "catalog"
}
}]
}
},
"path": "item_numbers"
}
},
"sort": [{
"item_numbers.value.keyword": {
"order": "asc",
"nested": {
"path": "item_numbers"
}
}
}]
}
The output of the above query is:
{
"data": [
{
"item_numbers": [
{
"value": "Ball",
"value_phonetic": "",
"type": "catalog"
},
{
"value": "Apple",
"value_phonetic": "",
"type": "catalog"
},
{
"value": "Cat",
"value_phonetic": "",
"type": "catalog"
}
]
},
{
"item_numbers": [
{
"value": "Cococola",
"value_phonetic": "",
"type": "catalog"
},
{
"value": "Appy",
"value_phonetic": "",
"type": "catalog"
}
]
}
]
}
But I also want the entries of the array inside each document to be sorted.
Expected output:
{
"data": [
{
"item_numbers": [
{
"value": "Apple",
"value_phonetic": "",
"type": "catalog"
},
{
"value": "Ball",
"value_phonetic": "",
"type": "catalog"
},
{
"value": "Cat",
"value_phonetic": "",
"type": "catalog"
}
]
},
{
"item_numbers": [
{
"value": "Appy",
"value_phonetic": "",
"type": "catalog"
},
{
"value": "Cococola",
"value_phonetic": "",
"type": "catalog"
}
]
}
]
}
Does anyone know what changes I need to make to the query to get this output?
The global sort, even though it's nested, is only applied on the top level -- meaning the inner docs don't get sorted.
What you're looking for is sorted inner_hits:
{
"_source": "sorted_item_numbers", <--
"query": {
"nested": {
"query": {
"bool": {
"must": [
{
"match": {
"item_numbers.type": "catalog"
}
}
]
}
},
"inner_hits": { <--
"name": "sorted_item_numbers",
"sort": {
"item_numbers.value.keyword": "asc"
}
},
"path": "item_numbers"
}
},
"sort": [
{
"item_numbers.value.keyword": {
"order": "asc",
"nested": {
"path": "item_numbers"
}
}
}
]
}
Note that the response is shaped slightly differently from the standard hits: the sorted nested objects come back under inner_hits.sorted_item_numbers of each hit. The top-level docs are still sorted (the doc with the best item_numbers.value taking precedence), and so are the actual contents of item_numbers.
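
A rough sketch of pulling the sorted values back out of such a response, in Python. The sample dictionary is hand-written to resemble a nested inner_hits response and is not real output; sorted_values is a hypothetical helper, and it assumes each inner hit's _source holds the nested object itself.

```python
def sorted_values(response, inner_hits_name="sorted_item_numbers"):
    """Return, per top-level hit, the nested `value` fields in inner-hit order."""
    rows = []
    for hit in response["hits"]["hits"]:
        inner = hit["inner_hits"][inner_hits_name]["hits"]["hits"]
        rows.append([nested["_source"]["value"] for nested in inner])
    return rows


def _inner(values):
    # Build the inner_hits part of one hand-written sample hit.
    hits = [{"_source": {"value": v, "type": "catalog"}} for v in values]
    return {"inner_hits": {"sorted_item_numbers": {"hits": {"hits": hits}}}}


# Hand-written sample, already in the order the sorted inner_hits would return.
sample = {"hits": {"hits": [_inner(["Apple", "Ball", "Cat"]),
                            _inner(["Appy", "Cococola"])]}}

for row in sorted_values(sample):
    print(row)
```

Keep in mind that inner_hits returns only 3 hits per parent by default, so a larger size inside the inner_hits block is needed when a document holds more nested entries than that.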

Elasticsearch: creating an index with source and settings via the REST Java client

I'm trying to create an index following this guide: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-create-index.html#_providing_the_whole_source
The problem is that the index is not created properly. It looks like the whole settings section, as well as the completion type, is ignored.
My JSON file:
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"analysis": {
"filter": {},
"analyzer": {
"keyword_analyzer": {
"filter": [
"lowercase",
"asciifolding",
"trim"
],
"char_filter": [],
"type": "custom",
"tokenizer": "keyword"
}
}
}
},
"mappings": {
"my_type": {
"properties": {
"first": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"completion": {
"type": "completion"
}
},
"analyzer": "standard"
},
"second": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"completion": {
"type": "completion"
}
},
"analyzer": "standard"
},
"third": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"completion": {
"type": "completion"
}
},
"analyzer": "standard"
},
"fourth": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"completion": {
"type": "completion"
}
},
"analyzer": "standard"
}
}
}
}
}
Java code:
CreateIndexRequest indexRequest = new CreateIndexRequest(ESClientConfiguration.INDEX_NAME);
URL url = Resources.getResource(TERYT_INDEX_CONFIGURATION_FILE_NAME);
return Try.of(() -> Resources.toString(url, Charsets.UTF_8))
.map(jsonIndexConfiguration -> indexRequest.source(jsonIndexConfiguration, XContentType.JSON))
.get();
createIndexRequest.setTimeout(TimeValue.timeValueMinutes(2));
Try.of(() -> client.indices().create(createIndexRequest, RequestOptions.DEFAULT))...
The index is created, but when I look at the index metadata, it looks completely wrong:
{
"state": "open",
"settings": {
"index": {
"creation_date": "1556379012380",
"number_of_shards": "1",
"number_of_replicas": "1",
"uuid": "L5fmkrjeQ6eKmuDyZ3MP3g",
"version": {
"created": "7000099"
},
"provided_name": "my_index"
}
},
"mappings": {
"my_type": {
"properties": {
"first": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"second": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"third": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"fourth": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
},
"aliases": [],
"primary_terms": {
"0": 1
},
"in_sync_allocations": {
"0": [
"Cx6tBeohR8mzbTO74dwsCw",
"FcTUhpb_SL2LiaEyy_uwkg"
]
}
}
There is only 1 shard instead of the 3 I configured, and I don't see any information about the completion type. Could someone tell me what I'm doing wrong here?
I think that this line:
Try.of(() -> client.indices().create(createIndexRequest, RequestOptions.DEFAULT))...
is hiding an important exception.
You are using Elasticsearch 7.0.0, which no longer allows a "type" name in the mapping.
Instead of
"mappings": {
"my_type": {
"properties": {
You should write:
"mappings": {
"properties": {
Because of that exception, the index is presumably never created from your JSON. If you then index some documents, the index is probably created automatically with default settings and dynamic mappings, which explains what you are seeing.
You need to fix your index definition first.
I'd recommend testing it in the Kibana Dev Tools console.
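
If the JSON file has to stay usable while being migrated, the type level can also be removed programmatically before the body is sent. The following is only a sketch in Python (not the Java client used in the question); strip_mapping_type is a hypothetical helper that assumes there is at most one custom type under "mappings".

```python
import copy


def strip_mapping_type(index_body):
    """Return a copy of the index body with a single custom type level removed."""
    body = copy.deepcopy(index_body)
    mappings = body.get("mappings", {})
    # Typeless mappings already have "properties" directly under "mappings".
    if "properties" in mappings or len(mappings) != 1:
        return body
    type_mapping = next(iter(mappings.values()))
    if isinstance(type_mapping, dict) and "properties" in type_mapping:
        body["mappings"] = type_mapping
    return body


legacy = {
    "settings": {"number_of_shards": 3, "number_of_replicas": 1},
    "mappings": {
        "my_type": {
            "properties": {"first": {"type": "text", "analyzer": "standard"}}
        }
    },
}
fixed = strip_mapping_type(legacy)
print(fixed["mappings"])
```

The settings section is left untouched, and running the helper on an already typeless body returns it unchanged.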

Elasticsearch error: Custom Analyzer [custom_analyzer] failed to find tokenizer under name [my_tokenizer]

I am trying to create a field mapping that uses my custom analyzer and tokenizer, but I am getting an error.
This is the error that Kibana returns when I create the mapping:
Custom Analyzer [custom_analyzer] failed to find tokenizer under name [my_tokenizer]
Here are my mapping details.
PUT attach_local
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "my_tokenizer",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
}
},
"mappings" : {
"doc" : {
"properties" : {
"attachment" : {
"properties" : {
"content" : {
"type" : "text",
"analyzer": "custom_analyzer"
},
"content_length" : {
"type" : "long"
},
"content_type" : {
"type" : "text"
},
"language" : {
"type" : "text"
}
}
},
"resume" : {
"type" : "text"
}
}
}
}
}
It is very important to indent your JSON properly. If you do, you'll see that your tokenizer is defined at the top level of the request instead of inside the analysis section, next to analyzer. Here is the right definition:
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "my_tokenizer",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
},
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
}
}
}
},
"mappings": {
"doc": {
"properties": {
"attachment": {
"properties": {
"content": {
"type": "text",
"analyzer": "custom_analyzer"
},
"content_length": {
"type": "long"
},
"content_type": {
"type": "text"
},
"language": {
"type": "text"
}
}
},
"resume": {
"type": "text"
}
}
}
}
}
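
This kind of misplacement can also be caught before sending the request. Below is a small sketch in Python, assuming the index body has been loaded into a dictionary; misplaced_analysis_keys and ANALYSIS_KEYS are hypothetical names introduced for illustration.

```python
ANALYSIS_KEYS = {"analyzer", "tokenizer", "filter", "char_filter", "normalizer"}


def misplaced_analysis_keys(index_body):
    """List analysis sections that sit outside settings.analysis."""
    found = []
    # Sections placed at the top level of the request body.
    for key in index_body:
        if key in ANALYSIS_KEYS:
            found.append(key)
    # Sections placed under "settings" but not under "settings.analysis".
    for key in index_body.get("settings", {}):
        if key in ANALYSIS_KEYS:
            found.append(f"settings.{key}")
    return found


broken = {
    "settings": {"analysis": {"analyzer": {"custom_analyzer": {
        "type": "custom", "tokenizer": "my_tokenizer"}}}},
    "tokenizer": {"my_tokenizer": {"type": "ngram", "min_gram": 3, "max_gram": 3}},
    "mappings": {},
}
print(misplaced_analysis_keys(broken))
```

For the body from the question this reports the top-level tokenizer section, which is exactly the part that has to move under settings.analysis.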
