Why do we create mapping in elasticsearch while setting up repository?


Okay, I understand from this question why a mapping is needed.
Now I am going through a piece of code that generates the mapping while creating the Elasticsearch repository by pushing a dummy object and then deleting it.
I understand that Elasticsearch can generate mappings, but what is the point of doing so? It does not help with search queries (at least not the regex one I tried, unless you explicitly declare in your mapping that the field is of type keyword).
I would be thankful if someone could explain this.

Elasticsearch does generate a mapping when you index a document without defining one, but it infers that mapping from the data in the first document. For example, suppose your index has a product-id field. If you index it without an explicit mapping, Elasticsearch maps the field as both text and keyword (a text field with a keyword sub-field) when you index product-id as below.
{
"product-id" : "1"
}
What to do next depends on your use case. Suppose product-id is a fixed keyword, and you only want exact search or aggregations on it, not full-text search. Then you are better off with an explicit mapping that defines it as a keyword field only, so that Elasticsearch storage and queries are optimal. You can refer to this Stack Overflow comment for more information.
Bottom line: when you want greater control over how your data is indexed, it is always better to define an explicit mapping than to rely on the default mapping generated by Elasticsearch.
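As a minimal sketch of the explicit mapping described above, the following Java snippet builds the JSON body that could be sent when creating the index. It assumes a recent Elasticsearch version (7.x or later, where mappings carry no type name). The class and method names are hypothetical, and no Elasticsearch client is used; sending the request is left out.

```java
public class ProductMapping {

    // Builds an index-creation body that maps "product-id" as keyword only,
    // so no analyzed text field is created for it.
    // Hypothetical helper for illustration; not part of any Elasticsearch API.
    static String explicitMapping() {
        return "{\n"
             + "  \"mappings\": {\n"
             + "    \"properties\": {\n"
             + "      \"product-id\": { \"type\": \"keyword\" }\n"
             + "    }\n"
             + "  }\n"
             + "}";
    }

    public static void main(String[] args) {
        System.out.println(explicitMapping());
    }
}
```

The resulting string is meant as the body of an index-creation request, sent before any document is indexed, so that dynamic mapping never gets a chance to guess the field type.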

Related

I am not able to get the difference between inFilter and TermFilter, can anyone please guide me on this?

As per the termFilter and inFilter definitions, both seem to do the same job. What is the difference?

I think you are using a very old Elasticsearch version (inFilter isn't present in the latest versions, and your link points to an Elasticsearch version below 1). As mentioned in the documentation link you provided, termFilter is useful to filter on one value, while inFilter accepts many terms and matches a document if it matches any one of them.
For example, suppose you have a name field and three documents containing foo, bar and baz respectively. With termFilter you can filter by only one value, e.g. foo or bar, and you will get only one document. With inFilter you can pass all the possible values and get all the matching documents.
Consider them equivalent to these SQL conditions.
name = 'foo' (example of term filter)
name in ('foo', 'bar', 'baz') (example of in filter)
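The same distinction can be sketched in plain Java. This is only an illustration of the matching semantics on an in-memory list; the class and method names are hypothetical and nothing here is the Elasticsearch filter API.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class FilterSemantics {

    // Single-value match, like: name = 'foo'
    static List<String> termFilter(List<String> names, String value) {
        return names.stream()
                .filter(value::equals)
                .collect(Collectors.toList());
    }

    // Any-of match, like: name in ('foo', 'bar', 'baz')
    static List<String> inFilter(List<String> names, Set<String> values) {
        return names.stream()
                .filter(values::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("foo", "bar", "baz");
        System.out.println(termFilter(names, "foo"));               // one match
        System.out.println(inFilter(names, Set.of("foo", "bar")));  // two matches
    }
}
```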

Change Field name in elasticsearch response

I need to change field names in the Elasticsearch response (e.g. change "title" to "header"). I want to avoid parsing the JSON response, which takes a lot of time.
Is there any way to do that?

I'm afraid this might not be available in Elasticsearch; you might have to parse the response. Consider this passage:
Aliasing
One of the things introduced in Apache Solr 4.0 and not available in ElasticSearch right now is the ability to transform result documents. First of all Solr allows you to alias returned fields, so for example you can return field price_usd or price_eur as price depending on your needs. The second thing is the ability to return values returned by functions as a (pseudo) field in the result (or fields). Solr also has the ability to return fields which start with a given prefix (for example all fields starting with price). Apart from the ability to get a function value as a field added to matched documents on the fly other functionalities are not ground breaking, though they can be handy in some cases.
from http://blog.sematext.com/2012/10/01/solr-vs-elasticsearch-part-3-searching/

how to search similar entities in database using Example class from hibernate

I know that there is a Hibernate class called Example that we can use to find similar entities when searching, but does this class allow searching for entities in a more generic, approximate way?
To explain: if I build an example entity with a property called name whose value is "myname", is Hibernate capable of returning an entity whose property has the value "mname"?

Yes, that's possible, but text-level similarity needs a Lucene index to speed up the query; otherwise it would be extremely inefficient to run on a relational database.
This is provided by Hibernate Search, the extension of Hibernate to integrate with Lucene and manage the indexes transparently.
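To make "text-level similarity" concrete, here is a small sketch of edit distance (Levenshtein distance), a common measure behind fuzzy text matching. It is plain Java for illustration only, not Hibernate Search or Lucene API, and the class name is hypothetical. Computing this against every row is exactly the kind of work that is inefficient without an index.

```java
public class EditDistance {

    // Classic dynamic-programming Levenshtein distance using two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                     // substitute or keep
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // "myname" and "mname" differ by one deleted character.
        System.out.println(levenshtein("myname", "mname")); // prints 1
    }
}
```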

Best practice design pattern for defining "types" in a database with potential multi language requirement?

More specifically, my question is this:
I want users on multiple front ends to see the "type" of a database row. Let's say for ease that I have a person table and the types can be Student, Teacher, Parent etc.
The specific program would be Java with Hibernate, though I doubt that's important for the question. Let's say my data is modelled as entity beans and a Person "type" field is an enum that contains my 3 options. Ideally I want my Person object to have a getType() method that my front end can use to display the type, and I also need a way for my front end to know the potential types.
With the enum method I have this functionality, but what I don't have is the ability to easily add new types without recompiling.
So my next thought is to put my types into a config file and simply store them in the database as strings. My getType() method works, but now my front end has to load a config file to get the potential types, and there is nothing to keep them in sync: I could remove a type from my config file and the type in the database would point to nothing. I don't like this either.
My final thought is to create a PersonTypes database table with a numeric type_id and a string defining the type. This is OK, and if the foreign key is set up I can't delete types that are in use. My front end will still need sight of the potential types; I guess the best way is to provide a service that uses the Hibernate layer to do this.
The problem with this method is that my types are all in English in the database, and I want my application to support multiple languages (eventually), so I need some sort of properties file to store the labels for the types. So do I have a PersonType table that contains purely integers and then a properties file that describes the label per integer? That seems backwards.
Is there a common design pattern to achieve this kind of behaviour? Or can anyone suggest a good way to do this?
Regards,
Glen x

I would go with the last approach that you have described. Having the type information in a separate table should be good enough, and it lets you use all the benefits of SQL for managing additional constraints (types will probably be unique, and foreign key checks will ensure that you don't introduce any inconsistency when you delete records).
As long as each type has an i18n value defined in the properties files, you are safe. If a type is removed, its value simply will not be used. If you want, you can change the properties files at runtime.
The last approach I can think of would be to store the i18n strings along with the type information in PersonType. This is acceptable for a small number of languages, although it might be considered an antipattern. It would allow you to have a method like this:
// Needs: import java.util.Locale;
public String getName(PersonType type, Locale loc) {
    // Compare by language so that regional variants (en_US, de_AT, ...) also match.
    String language = loc.getLanguage();
    if (language.equals(Locale.ENGLISH.getLanguage())) {
        return type.getEnglishName();
    } else if (language.equals(Locale.GERMAN.getLanguage())) {
        return type.getGermanName();
    } else {
        return type.getDefaultName();
    }
}

Internationalizing dynamic values is always difficult. Your last method for storing the types is the right one.
If you want to be able to i18n them, you can use resource bundles as properties files in your app. This forces you to modify the properties files and redeploy and restart the app each time a new type is added. You can also fall back to the English string stored in database if the type is not found in the resource bundle.
Or you can implement a custom ResourceBundle class that fetches its keys and values from the database directly, and have an additional PersonTypeI18n table which contains the translations for all the locales you want to support.
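A minimal sketch of the custom ResourceBundle idea follows. To stay self-contained it reads from an in-memory Map that stands in for the rows of a PersonTypeI18n table for one locale; the class name, key format and the label helper are all hypothetical, and a real version would load the map through a query. The helper also shows the fallback to the English string stored in the database.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import java.util.ResourceBundle;

public class PersonTypeBundle extends ResourceBundle {

    // Stand-in for translations loaded from the database for one locale.
    private final Map<String, String> translations;

    public PersonTypeBundle(Map<String, String> translations) {
        this.translations = new HashMap<>(translations);
    }

    @Override
    protected Object handleGetObject(String key) {
        // Returning null tells ResourceBundle the key is not in this bundle.
        return translations.get(key);
    }

    @Override
    public Enumeration<String> getKeys() {
        return Collections.enumeration(translations.keySet());
    }

    // Returns the translated label, or the English name when no translation exists.
    public static String label(ResourceBundle bundle, String typeKey, String englishName) {
        return bundle.containsKey(typeKey) ? bundle.getString(typeKey) : englishName;
    }

    public static void main(String[] args) {
        ResourceBundle german = new PersonTypeBundle(Map.of("person.type.teacher", "Lehrer"));
        System.out.println(label(german, "person.type.teacher", "Teacher")); // Lehrer
        System.out.println(label(german, "person.type.parent", "Parent"));   // Parent
    }
}
```

With this shape, adding a new type only requires new rows in the database; no properties file has to be edited and no redeploy is needed.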

You can use the following practices:
Use the singleton design pattern.
Use a caching framework such as Ehcache to cache the person types and reload them when needed.

How to search across multiple fields in Lucene using Query Syntax?

I'm searching a Lucene index and I'm building search queries like
field1:"hello" AND field2:"world"
but I'd like to search for a value in any field as well as the values in specific fields in the same query i.e.
field1:"hello" AND anyField:"world"
Can anyone tell me how I can search across all indexed fields in this way?

Based on the answers I got for this question: Impact of repeat value across multiple fields in Lucene...
I can put the same search term into multiple fields and therefore create an "all" field which I put everything in. This way I can create a query like...
field1:"hello" AND all:"world"
This seems to work very nicely, prevents the need for huge search queries, and apparently the performance impact is minimal.

Boolean (OR) queries with a clause for each field are used to search multiple fields. The MultiFieldQueryParser will do that as well, but the fields still need to be enumerated. There's no implicit "all" fields; but IndexReader.getFieldNames can acquire them.
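The "one clause per field" approach can be sketched by building the query string itself. This is plain Java string handling, not Lucene API; the class and method names are hypothetical, the field list is assumed to be known in advance, and the phrase is not escaped for special query-syntax characters.

```java
import java.util.List;
import java.util.stream.Collectors;

public class AnyFieldQuery {

    // Expands one phrase into an OR group with a clause for each field,
    // e.g. (field2:"world" OR field3:"world").
    static String anyField(List<String> fields, String phrase) {
        return fields.stream()
                .map(field -> field + ":\"" + phrase + "\"")
                .collect(Collectors.joining(" OR ", "(", ")"));
    }

    public static void main(String[] args) {
        String query = "field1:\"hello\" AND "
                + anyField(List.of("field2", "field3"), "world");
        // prints: field1:"hello" AND (field2:"world" OR field3:"world")
        System.out.println(query);
    }
}
```

Compared with a dedicated "all" field, this keeps the index unchanged but makes the query grow with the number of fields.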

This might not apply to you, but in Azure Search, which is based on Lucene, using Lucene syntax, I use this:
name:plywood^100 OR plywood
Results with "plywood" in the "name" field are boosted.
