Is it possible to define a record in an Avro schema that has a field whose type is the record itself?
Example in Java:
public class Example {
    private Integer value;
    private Example example;
}
An Avro schema is not defined in Java but in a JSON file, usually with the .avsc file extension. Here's an example of a recursive Avro schema that represents a tree:
{
    "type": "record",
    "name": "Node",
    "fields": [
        {
            "name": "value",
            "type": "long"
        },
        {
            "name": "children",
            "type": { "type": "array", "items": "Node" }
        }
    ]
}
So yes, it is perfectly possible to create recursive schemas.
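To illustrate, here is a minimal sketch of using that recursive schema with Avro's generic API; the class name RecursiveSchemaDemo and the sample values are just for illustration:
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

import java.util.Collections;

public class RecursiveSchemaDemo {
    public static void main(String[] args) {
        // The recursive "Node" schema from above, inlined as a string.
        String avsc = "{\"type\": \"record\", \"name\": \"Node\", \"fields\": ["
                + "{\"name\": \"value\", \"type\": \"long\"},"
                + "{\"name\": \"children\", \"type\": {\"type\": \"array\", \"items\": \"Node\"}}]}";
        Schema schema = new Schema.Parser().parse(avsc);

        // A leaf node with no children.
        GenericRecord leaf = new GenericData.Record(schema);
        leaf.put("value", 2L);
        leaf.put("children", Collections.emptyList());

        // A root node whose children array refers back to the same record type.
        GenericRecord root = new GenericData.Record(schema);
        root.put("value", 1L);
        root.put("children", Collections.singletonList(leaf));

        System.out.println(root);
    }
}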
See also this issue, where an even shorter recursive schema is defined:
{
"type": "record",
"name": "RecursiveRecord",
"fields": [{"name": "child", "type": "RecursiveRecord"}]
}
I am learning Avro schemas and tried to make a little project, but it seems I am stuck. I tried looking at the documentation as well, but it is quite confusing.
Let's assume I have to write a schema for this class:
class example implements Serializable {
    private Object data;
}
What would be the corresponding Avro schema (.avsc) for it?
I used reflection to generate the schema and got the corresponding .avsc below, but when I run mvn compile it just throws errors:
{
    "type": "record",
    "name": "example",
    "fields": [{
        "name": "x1",
        "type": {
            "type": "map",
            "values": {
                "type": "record",
                "name": "Object",
                "namespace": "java.lang",
                "fields": []
            }
        }
    }, {
        "name": "data",
        "type": "java.lang.Object"
    }]
}
We have a program that will use Elasticsearch. We need to query using joins, which are not supported in Elasticsearch, so we are left with either nested documents or parent-child relationships. I have read that using parent-child can cause significant performance issues, so we are thinking of going with nested documents.
We index/query on products, but we also have customers and vendors. This is my thinking for the product mapping:
{
    "mappings": {
        "products": {
            "dynamic": false,
            "properties": {
                "availability": {
                    "type": "text"
                },
                "customer": {
                    "type": "nested"
                },
                "vendor": {
                    "type": "nested"
                },
                "color": {
                    "type": "text"
                },
                "created_date": {
                    "type": "text"
                }
            }
        }
    }
}
Here, customer and vendor are my mapped fields.
Does this mapping look correct? Since I am setting dynamic to false, do I need to specify the contents of the customer and vendor sub-documents? If so, how would I do that?
My team found parent/child relationships to be incredibly detrimental to our performance, so I think you're probably making a good decision to use nested fields.
If you use dynamic: false, then undefined fields will not be added to the mapping. You can either set it to true, in which case those fields will be added as you index documents, or you can define the properties of the nested documents yourself:
{
    "mappings": {
        "products": {
            "dynamic": false,
            "properties": {
                ...
                "customer": {
                    "type": "nested",
                    "properties": {
                        "prop a": {...},
                        "prop b": {...}
                    }
                },
                "vendor": {
                    "type": "nested",
                    "properties": {
                        "prop a": {...},
                        "prop b": {...}
                    }
                },
                ...
            }
        }
    }
}
For example, I have a JSON schema that looks as follows:
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "billing_address": { "$ref": "#/definitions/address" },
        "shipping_address": { "$ref": "#/definitions/address" }
    },
    "definitions": {
        "address": {
            "type": "object",
            "properties": {
                "street_address": { "type": "string" },
                "city": { "type": "string" },
                "state": { "type": "string" }
            },
            "required": ["street_address", "city", "state"]
        }
    }
}
This schema describes an object with two properties, billing_address and shipping_address, both of type address, which contains three properties: street_address, city and state.
Now I have another "larger" schema:
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "billing_address": { "$ref": "#/definitions/address" },
        "shipping_address": { "$ref": "#/definitions/address" },
        "new_address": { "$ref": "#/definitions/address" }
    },
    "definitions": {
        "address": {
            "type": "object",
            "properties": {
                "street_address": { "type": "string" },
                "city": { "type": "string" },
                "state": { "type": "string" },
                "zip_code": { "type": "string" }
            },
            "required": ["street_address", "city", "state"]
        }
    }
}
As you can see, I added a new property, new_address, to the schema, and in address there is a new property called zip_code, which is not required.
So if I create an object that validates against the old JSON schema, it should also validate against the new JSON schema. In this case, we say the new schema is compatible with the old one. (In other words, the new schema is an extension of the old one, with no modifications.)
The question is: how can I determine in Java whether one schema is compatible with another? Complicated cases should also be handled, for example the "minimum" property of a number field.
Just test it. In my current project, I am writing the following contract tests:
1) Given a Java domain object, I serialize it to JSON and compare it to reference JSON data. I use https://github.com/skyscreamer/JSONassert to compare the two JSON strings.
For the reference JSON data, you need to use the 'smaller schema' object.
2) Given sample JSON data, I deserialize it into my domain object and verify that deserialization was successful. I compare the deserialization result with a model object. For the sample JSON data, you should use your 'larger schema' object.
This test verifies that 'larger schema' JSON data is backward compatible with your 'smaller schema' domain. Both kinds of test are sketched below.
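For illustration, here is a minimal sketch of both tests with JUnit, Jackson and JSONAssert, using a hypothetical Address domain class that matches the address definition from the schemas above (the class and the sample values are just for illustration):
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.junit.Assert;
import org.junit.Test;
import org.skyscreamer.jsonassert.JSONAssert;

public class AddressContractTest {

    // Hypothetical domain class matching the address definition of the smaller schema.
    public static class Address {
        public String street_address;
        public String city;
        public String state;
    }

    private final ObjectMapper mapper = new ObjectMapper()
            // Tolerate fields that only exist in the larger schema, e.g. zip_code.
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

    @Test
    public void serializesToReferenceJson() throws Exception {
        Address address = new Address();
        address.street_address = "1 Main St";
        address.city = "Springfield";
        address.state = "IL";

        String actual = mapper.writeValueAsString(address);
        String reference = "{\"street_address\":\"1 Main St\",\"city\":\"Springfield\",\"state\":\"IL\"}";

        // Non-strict comparison: field order does not matter.
        JSONAssert.assertEquals(reference, actual, false);
    }

    @Test
    public void deserializesLargerSchemaJson() throws Exception {
        String larger = "{\"street_address\":\"1 Main St\",\"city\":\"Springfield\","
                + "\"state\":\"IL\",\"zip_code\":\"62701\"}";

        Address address = mapper.readValue(larger, Address.class);

        // The known fields survive; the extra zip_code field is simply ignored.
        Assert.assertEquals("Springfield", address.city);
    }
}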
I write those tests at each level of my domain model: one for the top-level object, and another for each non-trivial nested object. That requires more test code and more JSON sample data, but it gives much better confidence. If something fails, the error messages will be fine-grained, so you will know exactly which level of the hierarchy is broken (JSONAssert error messages can contain many errors and be non-trivial to read for deeply nested object hierarchies). So it's a trade-off between:
* time spent maintaining test code and data
* quality of error messages
Such tests are fast, since they only need JSON serialization/deserialization.
https://github.com/spring-cloud/spring-cloud-contract will help you write contract tests for REST APIs, messaging, etc., but for simple cases the procedure I gave above may be good enough.
I currently have the following POJO.
@Document(indexName = "ws", type = "vid")
public class Vid {
    @Id
    private String id;

    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private List<String> tags;
}
A JSON that represents this POJO is as follows.
{
"id" : "someId",
"tags" : [ "one", "two", "three" ]
}
What I want is to define the mapping for the tags field so that I can use the values in an auto-complete search box. This is supported by Elasticsearch's Completion Suggester. The documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html seems to suggest that I have to set up the mapping as follows.
{
    "vid": {
        "properties": {
            "id": {
                "type": "string"
            },
            "tags": {
                "type": "completion",
                "index_analyzer": "simple",
                "search_analyzer": "simple",
                "payloads": true
            }
        }
    }
}
However, that would mean that I would have to revise my POJO and JSON representation.
{
"id": "someId",
"tags": {
"input": [ "one", "two", "three" ]
}
}
I found another good page talking about Completion Suggesters at http://blog.qbox.io/quick-and-dirty-autocomplete-with-elasticsearch-completion-suggest. However, that page seems to suggest duplicating the tags.
{
"id": "someId",
"tags": [ "one", "two", "three" ],
"tags_suggest": {
"input": [ "one", "two", "three" ]
}
}
Lastly, I found this javadoc page from spring-data-elasticsearch at http://docs.spring.io/spring-data/elasticsearch/docs/current/api/index.html?org/springframework/data/elasticsearch/core/completion/Completion.html. I am sure this class has something to do with Completion Suggesters but I don't know how to use it.
Is there any way I can just use Spring annotations to define the Elasticsearch mapping for Completion Suggester?
Absolutely, yes.
You can configure your entity like this:
...
import org.springframework.data.elasticsearch.core.completion.Completion;
...
@Document(indexName = "test-completion-index", type = "annotated-completion-type", indexStoreType = "memory", shards = 1, replicas = 0, refreshInterval = "-1")
public class YourEntity {
    @Id
    private String id;

    private String name;

    @CompletionField(payloads = true, maxInputLength = 100)
    private Completion suggest;
    ...
}
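When indexing, you then populate the Completion field with the suggestion inputs. A minimal sketch, assuming YourEntity exposes the usual getters and setters (the sample values come from the question):
YourEntity entity = new YourEntity();
entity.setId("someId");
entity.setName("256 MB Server");
// The Completion value carries the input strings used by the completion suggester.
entity.setSuggest(new Completion(new String[] { "one", "two", "three" }));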
Check this link for an example.
I am not experienced with that, but maybe this annotation can be helpful for you:
Link to Spring Data Elasticsearch documentation
My question is about using the same class to deserialize and then re-serialize two different JSON documents. Let me try to explain.
I have these JSON documents:
//JSON A
{
    "flavors": [
        {
            "id": "52415800-8b69-11e0-9b19-734f1195ff37",
            "name": "256 MB Server",
            "ram": 256,
            "OS-FLV-DISABLED:disabled": true,
            "links": [
                {
                    "rel": "self",
                    "href": "http://www.myexample.com"
                },
                {
                    "rel": "bookmark",
                    "href": "http://www.myexample.com"
                }
            ]
        },
        ...
    ]
}
//JSON B
{
    "flavors": [
        {
            "id": "52415800-8b69-11e0-9b19-734f1195ff37",
            "name": "256 MB Server",
            "links": [
                {
                    "rel": "self",
                    "href": "http://www.myexample.com"
                },
                {
                    "rel": "bookmark",
                    "href": "http://www.myexample.com"
                }
            ]
        },
        ...
    ]
}
As you can see, JSON B has all the fields of JSON A except "ram" and "OS-FLV-DISABLED:disabled". The classes I used are the following:
public class Flavor {
    private String name;
    private List<Link> links;
    private int ram;
    private boolean OS_FLV_DISABLED_disabled;
    //constructor and getter/setter
}

@XmlRootElement
public class GetFlavorsResponse {
    private List<Flavor> flavors;
    //constructor and getter/setter
}
Moreover, just above the getter method isOS_FLV_DISABLED_disabled I've put the annotation @XmlElement(name = "OS-FLV-DISABLED:disabled"), otherwise Jackson doesn't recognize this property.
Here is the situation: when I receive JSON A there are no problems, and the resulting JSON is again JSON A; but when I receive JSON B, the result of the deserialization-serialization round trip is:
//JSON C
{
    "flavors": [
        {
            "id": "52415800-8b69-11e0-9b19-734f1195ff37",
            "name": "256 MB Server",
            "ram": 0,
            "OS-FLV-DISABLED:disabled": false,
            "links": [
                {
                    "rel": "self",
                    "href": "http://www.myexample.com"
                },
                {
                    "rel": "bookmark",
                    "href": "http://www.myexample.com"
                }
            ]
        },
        ...
    ]
}
At first I thought that Jackson sets class properties that were not in the JSON to their default values, that is, 0 and false respectively for "ram" and "OS-FLV-DISABLED:disabled". So I put the annotation
@JsonSerialize(include = JsonSerialize.Inclusion.NON_DEFAULT)
just above the Flavor class. This works, but the problem is that when I receive a JSON A in which "ram" and "OS-FLV-DISABLED:disabled" have the values 0 and false (a possible situation), the result of the process mentioned above is JSON B, since these two fields are ignored.
Having established that this is not the solution to my problem, I read that some people suggest using @JsonView or @JsonFilter, but I don't understand how to apply these Jackson features in this case.
I hope I was clear; thank you in advance for your help.
One thing you can try is to make your ram and OS_FLV_DISABLED_disabled fields of types Integer and Boolean respectively. That way, if no values come in the JSON for these two properties, they will be null. Then use the annotation @JsonInclude(Include.NON_NULL) to avoid serializing null properties.
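A minimal sketch of that idea, keeping the rest of the Flavor class from the question unchanged:
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonInclude.Include;
import java.util.List;

@JsonInclude(Include.NON_NULL) // omit null fields when serializing
public class Flavor {
    private String name;
    private List<Link> links;
    private Integer ram;                      // boxed type: stays null when absent from the JSON
    private Boolean OS_FLV_DISABLED_disabled; // boxed type: stays null when absent from the JSON

    // constructor, getters/setters and the @XmlElement(name = "OS-FLV-DISABLED:disabled")
    // annotation on the getter stay as in the question
}

With this, JSON A round-trips to JSON A (a ram of 0 is not null, so it is still written), while JSON B round-trips to JSON B because the two missing fields stay null and are omitted.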