Single Token Field in Elasticsearch - Java

I want to create a field in Elasticsearch from my Java code.
The field is getting tokenized, and as a result sorting and searching do not work as expected.
I'm trying to use this analyzer configuration, but it's not working:
[
  {
    "name": "email",
    "type": "string",
    "filters": [
      "lowercase",
      "shingle"
    ],
    "tokenizers": [
      "uax_url_email"
    ]
  }
]
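If the goal is exact search and sorting, the usual fix is to skip analysis entirely rather than tune it: map the field as a keyword type (on pre-5.x clusters, where "type": "string" as above still exists, the equivalent is "index": "not_analyzed" on the string field). A minimal sketch with the high-level REST client, assuming Elasticsearch 7.x and a hypothetical users index:

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.xcontent.XContentType;

public class EmailIndexSetup {
    // Creates the index with "email" mapped as a keyword field, so the whole
    // address is kept as a single token for exact search and sorting.
    static void createIndex(RestHighLevelClient client) throws Exception {
        CreateIndexRequest request = new CreateIndexRequest("users"); // hypothetical index name
        request.mapping(
            "{ \"properties\": { \"email\": { \"type\": \"keyword\" } } }",
            XContentType.JSON);
        client.indices().create(request, RequestOptions.DEFAULT);
    }
}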

Related

fhir.executeBundle replacing resource id... how to prevent this?

I am using this Java code to upload a resource to a FHIR store.
The resource is as follows:
{
  "resourceType": "Bundle",
  "id": "bundle-transaction",
  "meta": {
    "lastUpdated": "2018-03-11T11:22:16Z"
  },
  "type": "transaction",
  "entry": [
    {
      "resource": {
        "resourceType": "Patient",
        "id": "123456",
        "name": [
          {
            "family": "Smith",
            "given": [
              "Darcy"
            ]
          }
        ],
        "gender": "female",
        "address": [
          {
            "line": [
              "123 Main St."
            ],
            "city": "Anycity",
            "state": "CA",
            "postalCode": "12345"
          }
        ]
      },
      "request": {
        "method": "POST",
        "url": "Patient"
      }
    }
  ]
}
But the id I am using (123456) is getting replaced by a hexadecimal number.
This does not happen when using the fhirstores.import method.
Is there any way to stop the executeBundle method from replacing my id? I want to use a custom id in my resource.
Any help would be appreciated.
Thank you
When you're performing a transaction, the effect is the same as if you were POSTing the resources individually. On a POST, the server determines the resource id. On a regular POST, the id in the body is just ignored or raises an error. Within a transaction, the id is used to manage resolution of references across the transaction, but the server still chooses the id of the persisted resources (and updates all references accordingly). If you want to control the resource id values within a transaction, use PUT rather than POST. (Note that not all servers will allow an 'upsert', i.e. a PUT that performs a create at a specific resource location.) For details, see http://hl7.org/fhir/http.html#upsert.
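Concretely, that means changing the request part of the entry so the transaction performs an update at a known location rather than a create. A sketch of the same bundle built with HAPI FHIR structures (an assumption; the question's own code may use a different client, and the same two-line change applies to the raw JSON above):

import org.hl7.fhir.r4.model.Bundle;
import org.hl7.fhir.r4.model.Patient;

public class UpsertBundle {
    // Builds a transaction entry that PUTs the patient at a client-chosen id,
    // instead of POSTing and letting the server assign one.
    static Bundle buildBundle() {
        Patient patient = new Patient();
        patient.setId("123456");
        patient.addName().setFamily("Smith").addGiven("Darcy");

        Bundle bundle = new Bundle();
        bundle.setType(Bundle.BundleType.TRANSACTION);
        bundle.addEntry()
              .setResource(patient)
              .getRequest()
              .setMethod(Bundle.HTTPVerb.PUT)  // PUT, not POST
              .setUrl("Patient/123456");       // id goes in the URL for an upsert
        return bundle;
    }
}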

How to get a JSON response without references by "ids"

In my project there is a part where two one-to-many relationships point to the same entity.
The one-to-many relationship is written the same way for both elements, but the response to a GET request in Postman comes back like this:
{
  "elemnt1withonetomany": {
    "id": 2,
    "name": "something",
    "last_name": "something",
    "email": "something"
  },
  "elemnt2withonetomany": {
    "#id": 4,
    "id": 4,
    "code": "details",
    "email": "details",
    "name": "details",
    "lastname": "details"
  }
},
{
  "elemnt1withonetomany": {
    "id": 2,
    "name": "something",
    "last_name": "something",
    "email": "something"
  },
  "element2withonetomany": 4
}
So is there any way to make the GET request return the same form of information for elemnt2withonetomany as for elemnt1withonetomany?
I eventually found where it comes from, but fixing it that way would require a lot of @JsonBackReference and similar annotations.
It was caused by the @JsonIdentityInfo annotation on top of the entities.
It took me a while to find the source, so I'm posting what I found in case someone needs it.
So my new question is: is there a way to bypass it without deleting this annotation?
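One way to bypass it for a single serialization path, without removing the annotation from the entity, is a Jackson mixin whose @JsonIdentityInfo uses ObjectIdGenerators.None, which switches identity handling off. A sketch with hypothetical names (Element2 stands in for the real entity class):

import com.fasterxml.jackson.annotation.JsonIdentityInfo;
import com.fasterxml.jackson.annotation.ObjectIdGenerators;
import com.fasterxml.jackson.databind.ObjectMapper;

public class FullObjectSerialization {
    // Mixin whose @JsonIdentityInfo disables object identity, overriding the
    // annotation declared on the entity class itself.
    @JsonIdentityInfo(generator = ObjectIdGenerators.None.class)
    abstract static class NoIdentityMixin {}

    static String serialize(Object entities) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        mapper.addMixIn(Element2.class, NoIdentityMixin.class); // Element2: hypothetical entity
        return mapper.writeValueAsString(entities);             // every occurrence is a full object
    }
}

Note that if the relationship is bidirectional, turning identity handling off can reintroduce infinite recursion, in which case @JsonBackReference (or @JsonIgnore on the back side) is still needed for the genuine cycles.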

Converting values between two data domains

I have to write over a hundred integrations between many systems. This integration layer must be able to convert codes. Each system uses codes to represent business types like insurance_type, customer_type, etc. Each of them has a set of valid values. These values are not the same from system to system, and may even vary over time.
I started looking for data domain mapping libraries in Java but didn't find anything suitable. I considered CloverETL, Pentaho ETL and GETL, but they are all far too complex for my need, or no longer maintained.
The goal is to keep the conversion rules out of the code so they can evolve over time without deploying a new executable.
I'm looking for a tool or library that would allow me to represent mappings similar to this:
{
  "domains": [
    {
      "name": "type police host",
      "values": [
        {
          "code": "0001",
          "description": "Habitation",
          "start_date": "2019-06-30",
          "end_date": ""
        },
        {
          "code": "0002",
          "description": "Automobile",
          "start_date": "2019-06-30",
          "end_date": ""
        }
      ]
    },
    {
      "name": "type police web",
      "values": [
        {
          "code": "Habitation",
          "description": "Habitation",
          "start_date": "2019-06-30",
          "end_date": ""
        }
      ]
    }
  ],
  "conversions": [
    {
      "from": "type police host",
      "to": "type police web",
      "rules": [
        {
          "from": ["0001"],
          "to": "Habitation",
          "start_date": "2019-06-30",
          "end_date": ""
        },
        {
          "from": ["0003", "0004"],
          "to": "Deux roues",
          "start_date": "2019-06-30",
          "end_date": ""
        }
      ]
    }
  ]
}
From the configuration file above, I would be able to do things like convertsAsOf("2019-07-10", "type police host", "type police web", "0001") and it would return "Habitation". Any suggestions for a library that would do this?
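I don't know of a library that matches this exactly, but the lookup is small enough to hand-roll on top of the JSON above. A sketch using Jackson, keeping the convertsAsOf signature from the question (error handling and caching omitted):

import java.io.File;
import java.time.LocalDate;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class DomainConverter {
    private final JsonNode config;

    public DomainConverter(File mappingFile) throws Exception {
        this.config = new ObjectMapper().readTree(mappingFile);
    }

    // Returns the target code, or null when no rule is active on that date.
    public String convertsAsOf(String asOf, String fromDomain, String toDomain, String code) {
        LocalDate date = LocalDate.parse(asOf);
        for (JsonNode conv : config.path("conversions")) {
            if (!conv.path("from").asText().equals(fromDomain)
                    || !conv.path("to").asText().equals(toDomain)) {
                continue;
            }
            for (JsonNode rule : conv.path("rules")) {
                for (JsonNode c : rule.path("from")) {
                    if (c.asText().equals(code) && isActive(rule, date)) {
                        return rule.path("to").asText();
                    }
                }
            }
        }
        return null;
    }

    // A rule applies when start_date <= date and end_date is empty or >= date.
    private boolean isActive(JsonNode rule, LocalDate date) {
        String start = rule.path("start_date").asText("");
        String end = rule.path("end_date").asText("");
        return (start.isEmpty() || !date.isBefore(LocalDate.parse(start)))
                && (end.isEmpty() || !date.isAfter(LocalDate.parse(end)));
    }
}

With the configuration above, new DomainConverter(new File("mapping.json")).convertsAsOf("2019-07-10", "type police host", "type police web", "0001") would return "Habitation", and editing the JSON file changes the rules without redeploying the executable.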

JSON parsing vs regex-based string parsing

I need to process a big JSON payload (~1 MB) coming from an API. A portion of the JSON looks like this:
{
  "id": "013dd2a7-fec4-4cc5-b819-f3cf16a1f820",
  //more attributes
  "entry_mode": "LDE",
  "periods": [
    {
      "type": "quarter",
      "id": "fe96dc03-660c-423c-84cc-e6ae535edd2d",
      "number": 1,
      "sequence": 1,
      "scoring": {
        //more attributes
      },
      "events": [
        {
          "id": "e4426708-fadc-4cae-9adc-b7f170f5d607",
          "clock": "12:00",
          "updated": "2013-12-22T03:41:40+00:00",
          "description": "J.J. Hickson vs. DeAndre Jordan (Blake Griffin gains possession)",
          "event_type": "opentip",
          "attribution": {
            "name": "Clippers",
            "market": "Los Angeles",
            "id": "583ecdfb-fb46-11e1-82cb-f4ce4684ea4c",
            "team_basket": "left"
          },
          "location": {
            "coord_x": 572,
            "coord_y": 296
          },
          "possession": {
            "name": "Clippers",
            "market": "Los Angeles",
            "id": "583ecdfb-fb46-11e1-82cb-f4ce4684ea4c"
          }
        },
        //more events
      ]
    }
  ]
}
This is a near-realtime API, and I only need to process the events: identify a set of event UUIDs, look for duplicates in the database, and save the new events.
I could use a JSONObject/JSONArray, or use regex-based string parsing to fetch the events portion. Processing time is critical since this should be near-realtime, and memory efficiency is important since multiple payloads can come in at once.
Which one is more efficient for this use case?
Use a proper streaming JSON parser. You know what you want to pull out of the stream, you know when you can quit parsing it, so read the stream in small, manageable chunks, and quit as soon as you know you are done.
Circa 2017, I'm not aware of any browser/native JSON streaming APIs, so you'll need to find a JavaScript-based streaming library. Fortunately, streaming is not a new concept, so there are a number of options already in existence:
http://oboejs.com/
https://github.com/dominictarr/JSONStream
https://github.com/creationix/jsonparse
https://github.com/dscape/clarinet
http://danieltao.com/lazy.js/demos/json/
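The question itself is about Java, where the equivalent of the libraries above is Jackson's streaming JsonParser. A sketch under that assumption, using the field names from the payload shown in the question:

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

public class EventIdExtractor {
    // Streams the payload and collects the "id" of each object in an
    // "events" array, without building the whole ~1 MB tree in memory.
    public static List<String> extractEventIds(InputStream in) throws Exception {
        List<String> ids = new ArrayList<>();
        try (JsonParser p = new JsonFactory().createParser(in)) {
            JsonToken t;
            while ((t = p.nextToken()) != null) {
                if (t != JsonToken.FIELD_NAME || !"events".equals(p.getCurrentName())) {
                    continue;
                }
                p.nextToken(); // START_ARRAY of the events
                int depth = 0;
                while ((t = p.nextToken()) != null) {
                    if (t == JsonToken.START_OBJECT) {
                        depth++;
                    } else if (t == JsonToken.END_OBJECT) {
                        depth--;
                    } else if (t == JsonToken.END_ARRAY && depth == 0) {
                        break; // end of this events array
                    } else if (t == JsonToken.FIELD_NAME && depth == 1
                            && "id".equals(p.getCurrentName())) {
                        p.nextToken();        // move to the value
                        ids.add(p.getText()); // the event's own id, not nested ones
                    }
                }
            }
        }
        return ids;
    }
}

Regex, by contrast, needs the whole string in memory and is brittle against field reordering and escaped quotes; the streaming parser reads each token once and never materialises the full document.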

How to convert all datatypes to string while creating an avsc file through avro-tools.jar

I am generating avsc schema files while loading data into HDFS.
The file is getting created as below:
"fields" : [
{ "name" : "AGGREGATE",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "AGGREGATE",
"sqlType" : "1"
},
{ "name" : "LSTIME",
"type" : [ "null", "long" ],
"default" : null,
"columnName" : "LSTIME",
"sqlType" : "93"
},
However, as per my requirement, I need all of the datatypes to be converted to the string datatype when creating a Hive table.
I have tried java -jar /path/to/avro-tools-1.7.7.jar compile -string schema, but it is not working.
Can anyone please help?
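For what it's worth, compile -string only affects the generated Java classes (they use java.lang.String instead of Utf8); it does not change the types inside the schema. One option is to rewrite the .avsc itself before creating the Hive table. A sketch with Jackson, assuming every field is a ["null", <type>] union as in the file above:

import java.io.File;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class AvscAllStrings {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode schema = mapper.readTree(new File(args[0]));
        for (JsonNode field : schema.path("fields")) {
            // Keep the field nullable but force the value branch to "string".
            ((ObjectNode) field).set("type",
                mapper.createArrayNode().add("null").add("string"));
        }
        mapper.writerWithDefaultPrettyPrinter().writeValue(new File(args[1]), schema);
    }
}

Run as java AvscAllStrings in.avsc out.avsc; the rewrite keeps every field nullable and leaves extra properties like columnName and sqlType untouched.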
