Change Field name in elasticsearch response - java

I need to change field names in the Elasticsearch response (e.g. change "title" to "header"). I want to avoid parsing the JSON response, which takes a lot of time.
Is there any way to do that?

I'm afraid this might not be available in Elasticsearch; you might have to parse the response. For comparison, consider:
Aliasing
One of the things introduced in Apache Solr 4.0 and not available in ElasticSearch right now is the ability to transform result documents. First of all Solr allows you to alias returned fields, so for example you can return field price_usd or price_eur as price depending on your needs. The second thing is the ability to return values returned by functions as a (pseudo) field in the result (or fields). Solr also has the ability to return fields which start with a given prefix (for example all fields starting with price). Apart from the ability to get a function value as a field added to matched documents on the fly other functionalities are not ground breaking, though they can be handy in some cases.
from http://blog.sematext.com/2012/10/01/solr-vs-elasticsearch-part-3-searching/
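If you do end up parsing, one option is to rename the field on the parsed JSON tree rather than mapping it onto DTOs. Below is a minimal sketch with Jackson, assuming the standard search response shape (hits.hits[]._source) and the field names from the question; the class name is made up:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class FieldRenamer {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Rename "title" to "header" in each hit's _source of a search response.
    public static String renameTitleToHeader(String responseJson) throws Exception {
        JsonNode root = MAPPER.readTree(responseJson);
        for (JsonNode hit : root.path("hits").path("hits")) {
            JsonNode sourceNode = hit.path("_source");
            if (sourceNode.isObject()) {
                ObjectNode source = (ObjectNode) sourceNode;
                JsonNode title = source.remove("title");
                if (title != null) {
                    source.set("header", title);
                }
            }
        }
        return MAPPER.writeValueAsString(root);
    }
}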

Related

Pact JVM check for a response that is either an array or null

I am trying to create a contract with a consumer when making a request for item information. The item has a very complex data structure with nested properties and contains fields ranging from strings, lists and arrays of strings to arrays of enums. There is also added complexity in that some of the fields may be returned as null or populated with a value of their respective type.
What I would like to do is create a Pact response matcher that can be re-used for any item. At the moment there are 120 items (each item is a separate API call with the item name in the query param of the request) that I could request information for, and writing a Pact matcher for each item is not feasible / maintainable.
E.g. a sample response, cut down (the actual response is over 60 lines):
return new PactDslJsonBody()
    .object("metadata")
        .stringValue("name", "item-1")
        .integerType("index", 1)
        .eachLike("contents")
            .stringType("param", "content-param")
            .stringType("value", "content-value")
        .closeObject()
        .closeArray()
    .closeObject();
The problem I have is that the contents field could also come back as null from the provider. What I am trying to achieve is, within the same PactDslJsonBody, an OR condition or something similar that can check for the eachLike but can also accept that the field is null.
Contents field populated:
{
  "metadata": {
    "name": "item-1",
    "index": 1,
    "contents": [
      {
        "param": "content-param",
        "value": "content-value"
      }
    ]
  }
}
Contents field null:
{
  "metadata": {
    "name": "item-1",
    "index": 1,
    "contents": null
  }
}
I see there is PactDslJsonBody.or() available but the usage of this is not documented clearly, hence I am unsure if this is the intended usage for the above use case.
Writing separate tests is not feasible, as this is a specific use case of repeating the same PactDslJsonBody across n consumer tests, one for each item. We need to test each item because we have a direct mapping of some field values as enums between the consumer and provider, so we want to make sure that each item has the expected values for such fields. We also don't know in advance which items will have this field as null and which ones have it populated, which is why we wanted the option to use something like orNull etc.
I have looked around quite a lot on the web and couldn't find a definitive answer to the above. Any help on how to move forward with this would be appreciated. We really want to use Pact, since it is the go-to tool for us, and the above is a crucial use case for us to test.
I am using Pact JVM version 4.3.15 and using V3 specification.
Thanks.
See https://docs.pact.io/faq#why-is-there-no-support-for-specifying-optional-attributes. You need to write two separate test cases that cover both scenarios.
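If you take that route, the null case becomes a second interaction with its own provider state and its own body. A minimal sketch of that second body, assuming pact-jvm's consumer DSL (names such as "item-1" are from the question; the class name is made up):

import au.com.dius.pact.consumer.dsl.DslPart;
import au.com.dius.pact.consumer.dsl.PactDslJsonBody;

public class NullContentsBody {
    // Body for the interaction/provider state in which "contents" is explicitly null.
    public static DslPart build() {
        return new PactDslJsonBody()
            .object("metadata")
                .stringValue("name", "item-1")
                .integerType("index", 1)
                .nullValue("contents")
            .closeObject();
    }
}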

Why do we create mapping in elasticsearch while setting up repository?

Okay, so the question I have is: what is the need for mapping?
I am going through a piece of code, and what they are doing is generating the mapping while creating the Elasticsearch repository by pushing a dummy object and then deleting it.
I get that Elasticsearch can generate mappings, but what is the point of doing so? It does not help with search queries (at least not the regex ones I have tried, unless you explicitly state in your mapping that the field is of type keyword).
I would be thankful if someone could explain this.
Elasticsearch does generate a mapping when you don't define one and just index a document, but in that case the mapping is based on the data in the first document. For example, suppose you have a product-id field in your index: if you index it without defining an explicit mapping, Elasticsearch generates two data types for this field, text and keyword, when you index product-id as below.
{
  "product-id": "1"
}
Now, it depends on your use case. Let's suppose that in your case product-id is fixed and a keyword, and you just want exact search or aggregations on the product-id field, with no full-text search. Then you are better off going with an explicit mapping and defining it as a keyword field; that way Elasticsearch storage and queries will be optimal. You can refer to this Stack Overflow comment for more information.
Bottom line: when you want greater control over how your data is indexed, it's always better to define an explicit mapping than to rely on the default mapping generated by Elasticsearch.
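For illustration, a minimal sketch of defining product-id explicitly as keyword at index-creation time, assuming the Java High Level REST Client and an index called "products" (package locations vary by client version; the class name is made up):

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.xcontent.XContentType;

public class ExplicitMappingExample {
    // Create the "products" index with product-id mapped explicitly as keyword,
    // instead of letting dynamic mapping produce both text and keyword.
    public static void createIndex(RestHighLevelClient client) throws Exception {
        CreateIndexRequest request = new CreateIndexRequest("products");
        request.mapping(
            "{ \"properties\": { \"product-id\": { \"type\": \"keyword\" } } }",
            XContentType.JSON);
        client.indices().create(request, RequestOptions.DEFAULT);
    }
}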

Can we ever have Documents with different fields in a single Lucene index?

This question has cropped up in my mind because, when constructing the output from a query's results, I want to make sure that I extract all the Fields (and not try extracting non-existent ones) from the Documents in the TopDocs... in anticipation of finding indices which contain Documents from an older version of my app.
The interesting/curious thing is that you have a method Document.getFields(). I.e. the Document class, rather than, for example, IndexSearcher or DirectoryReader, is responsible for telling us the Fields used. Theoretically, therefore, you could store and later retrieve Documents with a different set of Fields.
At this stage of my TDD I am just going to test that fields are extracted from the first Document in the TopDocs, and assume that all the others have the same fields.
But are there any use cases where one might have Documents with differing fields?
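Lucene itself does not enforce a common schema across documents, so reading each hit's fields defensively is reasonable. A minimal sketch of extracting whatever stored fields a hit actually has via Document.getFields() (everything except the Lucene API names is made up):

import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexableField;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

public class FieldSafeExtractor {
    // Read each hit's stored fields via Document.getFields(), so Documents written
    // by an older version of the app (with fewer or different fields) still work.
    public static void printStoredFields(IndexSearcher searcher, Query query) throws Exception {
        TopDocs topDocs = searcher.search(query, 10);
        for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc);
            Map<String, String> stored = new LinkedHashMap<>();
            for (IndexableField field : doc.getFields()) {
                stored.put(field.name(), field.stringValue()); // may be null for non-string fields
            }
            System.out.println(stored);
        }
    }
}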

Parsing a cypher query result (JSON) to a Java object

I use the jersey/jackson stack to address a neo4j database via the REST api, but I have some issues how to interpret the result.
If I read a node by its ID (/db/data/node/xxx), the result can be mapped to my DTO very easily by calling readEntity(MyDto.class) on the response. However, using internal IDs is not recommended, and various use cases require querying by custom properties. This is where Cypher comes into play (/db/data/cypher).
Assuming a node exists with a property "myid" and a value of "1234", I can fetch it with the Cypher query "MATCH (n {myid: 1234}) RETURN n". The result is a JSON string with a bunch of resources and, eventually, the "data" I want to unmarshal to a Java object. Unmarshalling it directly fails with a ProcessingException (error reading entity from input stream). I see no API that lets me iterate over the result's data.
My idea is to define some kind of generic wrapper class with an attribute "data", give that to the unmarshaller, and unwrap my DTO afterwards. I wonder if there is a more elegant way to do this, like using "RETURN n.data" (which does not work) or something similar. Is there?
You should look into Neo4j 2.0, where RETURN n just returns the property map.
I usually tend to deserialize the result into a nested list/map structure (i.e. have ObjectMapper read into Object.class or Map.class) and grab the data map directly out of that.
There's probably a way to tell Jackson to ignore all the information around that data field.
If you want to have a nicer presentation you can also check out my cypher-rs project which returns only the data in question, nothing more.
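For illustration, a minimal sketch of the nested list/map approach mentioned above, assuming the legacy /db/data/cypher response shape (a "columns" array plus a "data" array of rows); the class name is made up:

import java.util.List;
import java.util.Map;

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

public class CypherResultReader {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Deserialize the whole cypher REST response into a generic map and pull out
    // the node's property map from the first row/column of "data".
    @SuppressWarnings("unchecked")
    public static Map<String, Object> firstNodeData(String responseJson) throws Exception {
        Map<String, Object> result =
            MAPPER.readValue(responseJson, new TypeReference<Map<String, Object>>() {});
        List<List<Map<String, Object>>> data =
            (List<List<Map<String, Object>>>) result.get("data");
        Map<String, Object> node = data.get(0).get(0); // the node resource
        return (Map<String, Object>) node.get("data"); // its property map
    }
}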

How to search across multiple fields in Lucene using Query Syntax?

I'm searching a lucene index and I'm building search queries like
field1:"hello" AND field2:"world"
but I'd like to search for a value in any field as well as the values in specific fields in the same query i.e.
field1:"hello" AND anyField:"world"
Can anyone tell me how I can search across all indexed fields in this way?
Based on the answers I got for this question: Impact of repeat value across multiple fields in Lucene...
I can put the same search term into multiple fields and therefore create an "all" field which I put everything in. This way I can create a query like...
field1:"hello" AND all:"world"
This seems to work very nicely, prevents the need for huge search queries, and apparently the performance impact is minimal.
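For illustration, a minimal sketch of that catch-all approach at index time (field and class names are just examples):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;

public class CatchAllFieldExample {
    // Index values into their own fields and into a combined "all" field,
    // so queries like field1:"hello" AND all:"world" become possible.
    public static Document buildDocument(String field1Value, String field2Value) {
        Document doc = new Document();
        doc.add(new TextField("field1", field1Value, Field.Store.YES));
        doc.add(new TextField("field2", field2Value, Field.Store.YES));
        doc.add(new TextField("all", field1Value + " " + field2Value, Field.Store.NO));
        return doc;
    }
}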
Boolean (OR) queries with a clause for each field are used to search multiple fields. MultiFieldQueryParser will do that as well, but the fields still need to be enumerated. There's no implicit "all" field, but IndexReader.getFieldNames can acquire them.
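And a minimal sketch of the MultiFieldQueryParser route, assuming the classic query-parser module and StandardAnalyzer (field and class names are examples):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

public class MultiFieldSearchExample {
    // Parse the user's term against several fields at once; the parser expands it
    // into an OR of per-field clauses (field1:world field2:world ...).
    public static Query parseAcrossFields(String userQuery) throws Exception {
        String[] fields = {"field1", "field2", "field3"};
        MultiFieldQueryParser parser =
            new MultiFieldQueryParser(fields, new StandardAnalyzer());
        return parser.parse(userQuery);
    }
}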
This might not apply to you, but in Azure Search, which is based on Lucene, I use this (with the Lucene query syntax):
name:plywood^100 OR plywood
Results with "plywood" in the "name" field are boosted.
