Elastic Search multi match gets wrong result - java

I am sending a query to Elasticsearch to find all segments that have a field matching the query.
We are implementing a "free search" in which the user can type any text, and we build a query that searches for this text across all of the segment's fields.
Every segment in which one (or more) of its fields contains this text should be returned.
For example:
I would like to get all the segments with the name "tony lopez".
Each segment has a field of "first_name" and a field of "last_name".
The query our service builds:
"multi_match" : {
"query": "tony lopez",
"type": "best_fields",
"fields": [],
"operator": "OR"
}
The result from Elasticsearch for this query includes a segment whose "first_name" field is "tony" and whose "last_name" field is "lopez", but also a segment whose "first_name" is "joe" and whose "last_name" is "tony".
For this type of query, I would like to receive only the segments whose name is "tony" (first_name) "lopez" (last_name).
How can I fix this?

I hope I'm not jumping to conclusions, but if you want only documents with tony as the first name and lopez as the last name, use this:
GET my_index/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"first": "tony"
}
},
{
"match": {
"last": "lopez"
}
}
]
}
}
}
But if one of your indexed documents contains, for example, tony s as the first name, the query above will return it too.
Why? The first-name field is a text datatype:
A field to index full-text values, such as the body of an email or the description of a product. These fields are analyzed, that is they are passed through an analyzer to convert the string into a list of individual terms before being indexed.
More Details
If you run this query via kibana:
POST my_index/_analyze
{
"field": "first",
"text": ["tony s"]
}
You will see that tony s is analyzed into two tokens, tony and s: the string is passed through an analyzer that converts it into a list of individual terms (tony as one term and s as another).
That is why the query above returns tony s in the results: it matches the term tony.
If you want only exact matches for tony and lopez, use this query instead:
GET my_index/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"first.keyword": {
"value": "tony"
}
}
},
{
"term": {
"last.keyword": {
"value": "lopez"
}
}
}
]
}
}
}
Read about keyword datatype
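To make the difference between the two queries concrete, here is a minimal Java sketch that uses only the standard library. It is not Elasticsearch code: the analyze helper is a hypothetical stand-in that lowercases the text and splits on non-alphanumeric characters, which only approximates the standard analyzer for simple ASCII input. Under that assumption it shows why a match on tony finds tony s, while a term lookup on the unanalyzed keyword value does not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class AnalyzerSketch {

    // Hypothetical stand-in for an analyzer: lowercase, then split on anything
    // that is not a letter or digit. Only an approximation for plain ASCII text.
    static List<String> analyze(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Roughly what a match query does on a text field: the query string is
    // analyzed too, and the document matches if any query term was indexed.
    static boolean matchOnTextField(String indexedValue, String query) {
        List<String> indexedTerms = analyze(indexedValue);
        return analyze(query).stream().anyMatch(indexedTerms::contains);
    }

    // Roughly what a term query does on a keyword field: the whole value is
    // a single term and must be equal to the value in the query.
    static boolean termOnKeywordField(String indexedValue, String value) {
        return indexedValue.equals(value);
    }

    public static void main(String[] args) {
        System.out.println(analyze("tony s"));                    // [tony, s]
        System.out.println(matchOnTextField("tony s", "tony"));   // true
        System.out.println(termOnKeywordField("tony s", "tony")); // false
    }
}
```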
UPDATE
Try this query. It is not perfect: it has the same issue as my tony s example, and it will also find a document with first name lopez and last name tony.
GET my_index/_search
{
"query": {
"multi_match": {
"query": "tony lopez",
"fields": [],
"type": "cross_fields",
"operator":"AND",
"analyzer": "standard"
}
}
}
The cross_fields type is particularly useful with structured documents where multiple fields should match. For instance, when querying the first_name and last_name fields for “Will Smith”, the best match is likely to have “Will” in one field and “Smith” in the other
cross fields
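The caveat about swapped names can be illustrated with a small Java sketch. It is a simplified model, not the Elasticsearch implementation: the hypothetical matchesAllTerms helper splits on whitespace instead of running a real analyzer, ignores scoring, and only encodes the rule that with operator AND every query term must be found in at least one of the fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CrossFieldsSketch {

    // Simplified model of cross_fields with operator AND: each query term
    // must occur in at least one field. Which field it occurs in is ignored.
    static boolean matchesAllTerms(String query, String... fieldValues) {
        for (String term : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            boolean found = false;
            for (String value : fieldValues) {
                List<String> tokens =
                        Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
                if (tokens.contains(term)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false; // one missing term is enough to reject the document
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Arguments after the query are the first-name and last-name values.
        System.out.println(matchesAllTerms("tony lopez", "tony", "lopez")); // true
        System.out.println(matchesAllTerms("tony lopez", "joe", "tony"));   // false
        System.out.println(matchesAllTerms("tony lopez", "lopez", "tony")); // true
    }
}
```

The second call shows that the joe/tony document from the question is no longer returned, and the third shows the remaining limitation: the swapped name still matches.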
Hope it helps

Related

Search, sort and paginate the data inside _source of a single document in Elasticsearch through an ES query

In an Elasticsearch index we have multiple documents, but one document is different from the others. It contains a vendor-to-product mapping that no other document has, so it is one of a kind in the index. It looks like this:
{
"_index": "portal_support_20200911",
"_type": "_doc",
"_id": "techno_products",
"_version": 20220829,
"_seq_no": 39,
"_primary_term": 1,
"found": true,
"_source": {
"doc_name": "techno_products",
"updated_on": "20220829",
"products": [
{
"vendor": "Apple",
"product": "Iphone"
},
{
"vendor": "Samsung",
"product": "Galaxy Z"
},
{
"vendor": "Volkswagen",
"product": "Passat"
},
{
"vendor": "Volkswagen",
"product": "Tiguan"
}
]
}
}
It has thousands of vendor-to-product mappings in the "products" array.
Our requirement is to write an Elasticsearch query that gets the data of this document, but we also want to search, sort and paginate the elements of the "products" array through the query. In other words, we want to search, sort and paginate the _source of this document through an ES query.
For example, sort the "products" array by "vendor" or by "product" in descending order.
Since this array has thousands of entries, we want to paginate and get 100 elements at a time, then the next 100 on the next page, and so on.
We also want to offer a search option such as vendor=Volkswagen, so that only matching elements are returned by the Elasticsearch query.
I am new to ES, but as far as I know we can search, sort and paginate documents in ES by their fields.
But can we also search, sort and paginate data inside the _source of a single document? How can I achieve this using an Elasticsearch query?
Please help me with this, thanks.
I think you're looking for inner hits?
If you have a nested query (which requires products to be mapped as a nested field), you can add inner_hits as an option:
{
"nested": {
"path": "products",
"query": {
"term": {
"field": "term"
}
},
"inner_hits": {
"size": 10000,
"sort": [
{
"products.product": "desc"
}
]
}
}
}
If you never use a nested query to retrieve your documents, I think you can add the nested query anyway and use match_all: {} inside the nested query:
{
"nested": {
"path": "products",
"query": {
"match_all": {}
},
"inner_hits": {
"size": 10000,
"sort": [
{
"products.product": "desc"
}
]
}
}
}
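You can use from in inner_hits to page through the results. Note that from + size must not exceed the index.max_inner_result_window index setting (100 by default), so raise that setting if you need a size as large as the one above.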
I hope this helps.
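As a rough illustration of paging with from, the Java sketch below builds the request body for one page as a plain string. The class name, the nestedPageQuery helper and its parameters are all hypothetical; the body follows the match_all variant above and assumes products is mapped as nested. No Elasticsearch client is used, so sending the body is left out.

```java
class InnerHitsPaging {

    // Builds a search body whose inner_hits return one page of the nested
    // "products" entries. page is zero-based. Note that from + size is
    // bounded by the index.max_inner_result_window index setting.
    static String nestedPageQuery(String sortField, String order, int page, int pageSize) {
        int from = page * pageSize;
        return "{\n"
                + "  \"query\": {\n"
                + "    \"nested\": {\n"
                + "      \"path\": \"products\",\n"
                + "      \"query\": { \"match_all\": {} },\n"
                + "      \"inner_hits\": {\n"
                + "        \"from\": " + from + ",\n"
                + "        \"size\": " + pageSize + ",\n"
                + "        \"sort\": [ { \"" + sortField + "\": \"" + order + "\" } ]\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}";
    }

    public static void main(String[] args) {
        // Third page (zero-based page 2) of 100 entries, sorted by product descending.
        System.out.println(nestedPageQuery("products.product", "desc", 2, 100));
    }
}
```

To add the vendor=Volkswagen filter from the question, the match_all clause would be swapped for a query on products.vendor; the paging part stays the same.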

DynamoDB Nested Query for Set of Object

I have a table called Group and it will have records like:
{
"id": "UniqueID1",
"name": "Ranjeeth",
"emailIdMappings": [
{
"emailId": "r.pt#r.com",
"userId": 324
},
{
"emailId": "r1.pt#r.com",
"userId": 325
}
]
},
{
"id": "UniqueID2",
"name": "Ranjeeth",
"emailIdMappings": [
{
"emailId": "r1.pt#r.com",
"userId": 325
},
{
"emailId": "r2.pt#r.com",
"userId": 326
}
]
}
I need to query and get a result if emailId contains the input string.
This is how far I have got, but I am not able to get the result:
AttributeValue attributeValue = new AttributeValue("r.pt#r.com");
Condition containsCondition = new Condition()
.withComparisonOperator(ComparisonOperator.CONTAINS)
.withAttributeValueList(attributeValue);
Map<String, Condition> conditions = newHashMap();
conditions.put("emailIdMappings.emailId", containsCondition);
ScanRequest scanRequest = new ScanRequest()
.withTableName("Group")
.withScanFilter(conditions);
ScanResult scanResult = amazonDynamoDB.scan(scanRequest);
dynamoDBMapper.marshallIntoObjects(Group.class, scanResult.getItems());
For the above code I am expecting the record with id UniqueID1, but the result is empty. If you pass "r1.pt#r.com", you should get both records.
The SDK used is com.amazonaws:aws-java-sdk-dynamodb:1.11.155.
I tried posting the question in the AWS forum, which didn't help much.
As you have a List of Objects with two attributes in each object (i.e. emailId and userId), you need to provide both values in order to match the item.
The CONTAINS function will not be able to match the item if the object has two attributes and only one attribute value is mentioned in the filter expression.
Otherwise, you need to provide the position (i.e. index) in the list to match the item.
Example:
emailIdMappings[0].emailId = :emailIdVal
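The matching rule described above can be mimicked with plain Java collections. This is only an analogy and does not call the DynamoDB SDK: the list below mirrors emailIdMappings of the first record, and the two hypothetical helpers show whole-element comparison (what the answer says CONTAINS does on a list) next to comparison at a fixed index.

```java
import java.util.List;
import java.util.Map;

class ListContainsSketch {

    // Plain-Java copy of the emailIdMappings list of record UniqueID1.
    static final List<Map<String, Object>> EMAIL_ID_MAPPINGS = List.of(
            Map.of("emailId", "r.pt#r.com", "userId", 324),
            Map.of("emailId", "r1.pt#r.com", "userId", 325));

    // Whole-element comparison: the candidate must equal a complete list
    // element, so a bare email string never equals a two-attribute object.
    static boolean containsElement(List<Map<String, Object>> list, Object candidate) {
        return list.contains(candidate);
    }

    // Index-based comparison, analogous to emailIdMappings[0].emailId = :emailIdVal.
    static boolean emailAtIndexEquals(List<Map<String, Object>> list, int index, String emailId) {
        return index >= 0 && index < list.size()
                && emailId.equals(list.get(index).get("emailId"));
    }

    public static void main(String[] args) {
        System.out.println(containsElement(EMAIL_ID_MAPPINGS, "r.pt#r.com"));       // false
        System.out.println(containsElement(EMAIL_ID_MAPPINGS,
                Map.of("emailId", "r.pt#r.com", "userId", 324)));                    // true
        System.out.println(emailAtIndexEquals(EMAIL_ID_MAPPINGS, 0, "r.pt#r.com")); // true
    }
}
```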

Elasticsearch lowercase tokenizer quirk?

I am testing a mapping for URLs in Elasticsearch.
I want to be able to search for an entry both by domain name with TLD (e.g. example.com)
and without TLD (e.g. example), and for the full domain document to be returned
(like http://example.com, www.example.com and similar).
I PUT this mapping to ES in Sense:
PUT /en_docs
{
"mappings": {
"url": {
"properties": {
"content": {
"type": "string",
"analyzer" : "urlzer"
}
}
}
},
"settings": {
"analysis": {
"analyzer": {
"urlzer": {
"type": "custom",
"tokenizer": "lowercase",
"filter": [ "stopwords_filter" ]
}
},
"filter" : {
"stopwords_filter" : {
"type" : "stop",
"stopwords" : ["http", "https", "ftp", "www"]
}
}
}
}
}
Now, when I index a URL document, e.g.
POST /en_docs/url
{
"content": "http://example.com"
}
I can get it by searching example.com, but just example doesn't return anything.
The lowercase tokenizer I'm using in my analyzer, as the docs say and as direct testing of my analyzer shows, gives the tokens example and com, but when I search for the indexed document, example returns nothing:
GET /en_docs/url/_search?q=example
gets no results, but if the query is example.com, the result is returned.
What am I missing?

How to use queryString() in elasticsearch (java API)?

I am working with Elasticsearch v1.1.1.
I have a problem with search queries and would like to know how to solve it.
Here is my mapping:
{
"token" : {
"type" : "string"
}
}
The data in the indexed record is:
{
token : "4r5etgg-kogignjj-jdjuty687-ofijfjfhf-kdjudyhd"
}
My search is:
4r5etgg-kogignjj-jdjuty687-ofijfjfhf-kdjudyhd
I want an exact match of the record. Which query do I need to use to get an exact match?
Can it be done with
QueryBuilders.queryString() ?
I tried queryString() and concluded that it is not useful for an exact match.
Please advise.
You can put quotes around the string to do an exact match:
QueryBuilders.queryString("\"4r5etgg-kogignjj-jdjuty687-ofijfjfhf-kdjudyhd\"");
If you don't want partial matches on the above string, index an untokenized version of the value and search on that. In your mapping add:
"token": {
"type": "multi_field",
"fields": {
"untouched": {
"type": "string",
"index": "not_analyzed"
}
}
}
Then search:
{
"query": {
"match": {
"token.untouched": "4r5etgg-kogignjj-jdjuty687-ofijfjfhf-kdjudyhd"
}
}
}
Change the mapping so Elasticsearch doesn't touch your data while indexing, like so:
{
"token" : {
"type" : "string",
"index": "not_analyzed"
}
}
And then run a TermQuery from Java like this:
QueryBuilders.termQuery("token", "4r5etgg-kogignjj-jdjuty687-ofijfjfhf-kdjudyhd");
That should give you your exact match.

elasticsearch select which field use for boost

Given an elasticsearch document like this:
{
"name": "bob",
"title": "long text",
"text": "long text bla bla...",
"val_a1": 0.3,
"val_a2": 0.7,
"val_a3": 1.1,
...
"val_az": 0.65
}
I need to run a search on Elasticsearch with a given boost value on the text field, plus a boost value for the document taken from a named field val_xy.
For example, a search could be:
"long" with boost value on text: 2.0 and general boost val_a6
So if "long" is found in the text field I use a boost of 2.0, combined with a boost value taken from the field val_a6.
How can I do this search with a Java Elasticsearch client? Is it possible?
What you want is a function_score query. The documentation isn't the best and can be highly confusing. But using your example above you'd do something like the following:
"function_score": {
"query": {
"term": {
"title": "long"
}
},
"functions": [
{
"filter": {
"term": {
"title": "long"
}
},
"script_score": {
"script": "_score*2.0*doc['val_a6'].value"
}
}
],
"score_mode": "max",
"boost_mode": "replace"
}
My eureka moment with function_score queries was figuring out you could do filters, including bool filters, within the "functions" part.
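Since the question asks how to pick the boost field at search time, here is a small Java sketch of that part only. It does not use the Elasticsearch client: buildScript is a hypothetical helper that produces the script string for a chosen val_xy field, and finalScore just spells out the arithmetic that the script performs. With boost_mode set to replace, that result is what ends up as the document score.

```java
class FunctionScoreSketch {

    // Builds the script_score expression for the chosen boost field,
    // e.g. "_score*2.0*doc['val_a6'].value".
    static String buildScript(double textBoost, String boostField) {
        return "_score*" + textBoost + "*doc['" + boostField + "'].value";
    }

    // The arithmetic the script performs: query score times the fixed boost
    // times the value stored in the chosen field.
    static double finalScore(double queryScore, double textBoost, double fieldValue) {
        return queryScore * textBoost * fieldValue;
    }

    public static void main(String[] args) {
        System.out.println(buildScript(2.0, "val_a6")); // _score*2.0*doc['val_a6'].value
        System.out.println(finalScore(1.5, 2.0, 0.5));  // 1.5
    }
}
```

The generated string would go into the "script" entry of the function_score body above, so switching from val_a6 to another field only changes the helper's argument.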
