JSONSchema parsing and processing in Java - java

There is an excellent .NET library, Json.NET Schema. I use it in my C# application to parse schemas and build a Dictionary<string, JSchema> of "name_of_simple_element" - "simple_element" pairs. Then I process each pair, for example to find "string" elements with the pattern "[a-z]" or "string" elements with maxLength > 300.
Now I need to create an application with the same functionality in Java. It is very simple in C#:
JSchema schema = JSchema.Parse(json);
IDictionary<string, JSchema> dict = schema.Properties;
... etc.
But I can't find an equivalent way to do that in Java. I need to convert this
{
"$schema": "http://json-schema.org/draft-04/schema#",
"id": "http://iitrust.ru",
"type": "object",
"properties": {
"regions": {
"id": "...regions",
"type": "array",
"items": {
"id": "http://iitrust.ru/regions/0",
"type": "object",
"properties": {
"id": {
"id": "...id",
"type": "string",
"pattern": "^[0-9]+$",
"description": "Идентификатор региона"
},
"name": {
"id": "...name",
"type": "string",
"maxLength": 255,
"description": "Наименование региона"
},
"code": {
"id": "...code",
"type": "string",
"pattern": "^[0-9]{1,3}$",
"description": "Код региона"
}
},
"additionalProperties": false,
"required": ["id",
"name",
"code"]
}
}
},
"additionalProperties": false,
"required": ["regions"]
}
to a dictionary/map like this (pseudocode)
["...id" : "id": { ... };
"...name" : "name": { ... };
"...code": "code": { ... }]
What is the best way to do that?

OK, I solved the problem with the Jackson library. The code below relies on the assumptions that a JSON Schema "object" node always has a "properties" element, an "array" node always has an "items" element, and every "id" is unique. This is my customer's standard format. Instead of C#'s Dictionary<string, JSchema>, I now have a Java HashMap<String, JsonNode>.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
...
static Map<String, JsonNode> elementsMap = new HashMap<>();
public static void Execute(File file) throws IOException {
    ObjectMapper mapper = new ObjectMapper();
    JsonNode root = mapper.readTree(file);
    JsonNode rootNode = root.path("properties");
    FillTheElementMap(rootNode);
}
private static void FillTheElementMap(JsonNode rootNode) {
    for (JsonNode cNode : rootNode) {
        if (cNode.path("type").toString().toLowerCase().contains("array")) {
            // Array: descend into each child of "items"
            for (JsonNode ccNode : cNode.path("items")) {
                FillTheElementMap(ccNode);
            }
        }
        else if (cNode.path("type").toString().toLowerCase().contains("object")) {
            // Object: descend into its "properties"
            FillTheElementMap(cNode.path("properties"));
        }
        else {
            // Simple element: store it under its unique "id"
            elementsMap.put(cNode.path("id").asText(), cNode);
        }
    }
}
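The traversal and the follow-up filtering from the question (for example, "string" elements whose maxLength exceeds a limit) can be sketched without any JSON library. This is only an illustration under the same assumptions as the answer ("object" nodes have "properties", "array" nodes have "items", ids are unique). The class SchemaFlattener, its helper node(), and the use of plain java.util maps as a stand-in for the parsed tree are hypothetical and not part of Jackson.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class SchemaFlattener {

    /** Builds an insertion-ordered map from alternating key/value arguments (hypothetical helper). */
    static Map<String, Object> node(Object... keyValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    /** Recursively stores every element that is neither "object" nor "array" under its "id". */
    @SuppressWarnings("unchecked")
    static void collect(Map<String, Object> element, Map<String, Map<String, Object>> out) {
        Object type = element.get("type");
        if ("array".equals(type)) {
            // Assumption: "items" is a single schema object.
            collect((Map<String, Object>) element.get("items"), out);
        } else if ("object".equals(type)) {
            // Assumption: every object node has "properties".
            Map<String, Object> properties = (Map<String, Object>) element.get("properties");
            for (Object child : properties.values()) {
                collect((Map<String, Object>) child, out);
            }
        } else {
            out.put((String) element.get("id"), element);
        }
    }

    /** Returns the ids of "string" elements whose maxLength is greater than the limit. */
    static List<String> stringIdsWithMaxLengthAbove(Map<String, Map<String, Object>> elements, int limit) {
        return elements.entrySet().stream()
                .filter(e -> "string".equals(e.getValue().get("type")))
                .filter(e -> e.getValue().get("maxLength") instanceof Number
                        && ((Number) e.getValue().get("maxLength")).intValue() > limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /** A cut-down version of the schema from the question. */
    static Map<String, Object> sampleSchema() {
        Map<String, Object> regionProperties = node(
                "id", node("id", "...id", "type", "string", "pattern", "^[0-9]+$"),
                "name", node("id", "...name", "type", "string", "maxLength", 255),
                "code", node("id", "...code", "type", "string", "pattern", "^[0-9]{1,3}$"));
        Map<String, Object> regions = node("id", "...regions", "type", "array",
                "items", node("id", "http://iitrust.ru/regions/0", "type", "object",
                        "properties", regionProperties));
        return node("type", "object", "properties", node("regions", regions));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> elements = new LinkedHashMap<>();
        collect(sampleSchema(), elements);
        System.out.println(elements.keySet());
        System.out.println(stringIdsWithMaxLengthAbove(elements, 200));
    }
}
```

Starting the recursion at the root node rather than at its "properties" lets one method handle objects, arrays, and simple elements uniformly.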

A good option would be the Jayway Java implementation of JSONPath.
import com.jayway.jsonpath.DocumentContext;
import com.jayway.jsonpath.JsonPath;
import java.util.List;
import java.util.Map;
...
DocumentContext context = JsonPath.parse(jsonSchemaFile);
// "string" elements with maxLength == 255
List<Map<String, Object>> arr2 = context.read(
"$..[?(@.type == 'string' && @.maxLength == 255)]");
If you want to create a JsonSchema from Java code, you could use jackson-module-jsonSchema.
If you want to validate against a JsonSchema, the library from fge is an option: json-schema-validator

You may want to take a look at this library, it's helped me with similar requirements. With a couple lines of code you can traverse a pretty straightforward Java object model that describes a JSON schema.
https://github.com/jimblackler/jsonschemafriend (Apache 2.0 license)
From the README:
jsonschemafriend is a JSON Schema loader and validator, delivered as a Java library...
It is compatible with the following metaschemas
http://json-schema.org/draft-03/schema#
http://json-schema.org/draft-04/schema#
http://json-schema.org/draft-06/schema#
http://json-schema.org/draft-07/schema#
https://json-schema.org/draft/2019-09/schema

Related

OpenApi specification generator - Supply values from multiple Enum classes for a String field

I'm writing a Spring Boot application in Kotlin, and I'm currently struggling to generate a specification for a DTO class that has a backing field of the type String, which I want to then later parse into one of two enum classes in the adapter layer.
I've tried the following approach using the oneOf Annotation value, which seemed like it does what I want:
data class MyDto(
@Schema(
type = "string",
oneOf = [MyFirstEnum::class, MySecondEnum::class]
)
val identifier: String,
val someOtherField: String
) {
fun transform() { ... } // this will use the string identifier to pick the correct enum type later
}
Which results in the following OpenApi Spec:
"MyDto": {
"required": [
"someOtherField",
"identifier"
],
"type": "object",
"properties": {
"identifier": {
"type": "object", // <--- this should be string
"oneOf": [{
"type": "string",
"enum": [
"FirstEnumValue1",
"FirstEnumValue2",
"FirstEnumValue3"
]
}, {
"type": "string",
"enum": [
"SecondEnumValue1",
"SecondEnumValue2",
"SecondEnumValue3"
]
}
]
},
"someOtherField": {
"type": "string"
}
}
}
As you can see, the enum constants are (I think) correctly inlined into the specification, but the type I set on the field (string) is bypassed, resulting in an object type, which I suppose is incorrect in this case.
My questions are:
Are my current code and the resulting spec valid with the object declaration instead of string?
Is there a better way to embed the enum values into the spec?
Edited to add: I'm using Spring Boot v2.7.8 in combination with springdoc-openapi v1.6.13 to automatically generate the OpenApi Spec.
The annotation-based approach that I showed in my question does not seem to generate a valid OpenApi spec with springdoc-openapi:1.6.13. The type of the field identifier needs to be string, as Helen mentioned in the comments.
I was able to solve the issue by creating the Schema for this particular class manually, using a GlobalOpenApiCustomizer Bean:
@Bean
fun myDtoCustomizer(): GlobalOpenApiCustomizer {
val firstEnum = StringSchema()
firstEnum.description = "First Enum"
MyFirstEnum.values().forEach { firstEnum.addEnumItem(it.name) }
val secondEnum = StringSchema()
secondEnum.description = "Second Enum"
MySecondEnum.values().forEach { secondEnum.addEnumItem(it.name) }
return GlobalOpenApiCustomizer {
it.components.schemas[MyDto::class.simpleName] = ObjectSchema()
.addProperty(
MyDto::identifier.name,
StringSchema().oneOf(
listOf(
firstEnum,
secondEnum
)
)
)
.addProperty(MyDto::someOtherField.name, StringSchema())
}
}
Which in turn produces the following Spec:
"MyDto": {
"type": "object",
"properties": {
"identifier": {
"type": "string",
"oneOf": [{
"type": "string",
"description": "First Enum",
"enum": [
"FirstEnumValue1",
"FirstEnumValue2",
"FirstEnumValue3"
]
}, {
"type": "string",
"description": "Second Enum",
"enum": [
"SecondEnumValue1",
"SecondEnumValue2",
"SecondEnumValue3"
]
}
]
},
"someOtherField": {
"type": "string"
}
}
}

Is there a way to extract JSON property labels from a JSON schema using Gson?

What I want to do is have Gson Type Adapters that don't have to have the property labels for a given JSON object hard coded into the adapter, but instead pull the Strings from the JSON Schema.
So if I have a schema like this (borrowing from json-schema.org):
{
"$id": "/schemas/address",
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
}
}
Is there a way to extract the property names "street_address", "city", and "state" from the schema and assign them to variables in a Gson TypeAdapter or Factory such that I don't have to declare a String like
String streetName = "street_address";
But could instead do something in the vein of
String streetName = getSchemaProperty("/schemas/address").getProperty(0);
Where getSchemaProperty() would get the object schema from the "/schemas/address" file and getProperty would return the first property label in the schema. That way, if the schema was updated with new property labels, I would not have to update the Adapters with new Strings.
I can certainly write something that would do this (parse the schema file(s) and extract that information), I'm just wondering if that kind of work has already been done (maybe with annotations or some such?) & I'm just missing it?
Easier than I thought, actually, since a Schema is a JSON file.
Here is an example Schema:
{
"$id": "/path/to/schema/MySchema",
"$schema": "http://json-schema.org/draft/2019-09/schema#",
"title": "My Example JSON Schema",
"description": "JSON Schema for me",
"$comment": "Hey! A Comment!",
"$defs" :
{
"My Class":
{
"$comment": "Another Comment! I love comments, so useful!",
"type": "object",
"required": [ "Value", "Description" ],
"properties":
{
"Value": { "type": "number" },
"Description": { "type": "string" }
}
}
}
}
Here is the default constructor of my Google GSON Type Adapter:
public MyTypeAdapter()
{
// Load the schema from the classpath and parse it as an ordinary JSON document
InputStream is = getClass().getResourceAsStream("/path/to/schema/MySchema.json");
InputStreamReader isr = new InputStreamReader(is, StandardCharsets.UTF_8);
JsonObject schemaJO = JsonParser.parseReader(isr).getAsJsonObject();
JsonObject thisJO = schemaJO.getAsJsonObject("$defs").getAsJsonObject("My Class").getAsJsonObject("properties");
//Get the list of property labels\names\keys
Set<String> keySet = thisJO.keySet();
//loop over the set and distribute the Strings as desired.
}
Easy Peasy Lemon Squeezy!
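The index-based lookup from the question, getProperty(0), can be layered on top of such a key set. The following is a minimal sketch, not an existing Gson feature: SchemaLabels is a hypothetical class that copies the labels into a list, and it assumes the key set iterates in the order the properties appear in the schema file. JSON object members are formally unordered, so looking labels up by position is more fragile than looking them up by name.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SchemaLabels {
    private final List<String> labels;

    /** Copies the property labels, preserving the iteration order of the given collection. */
    SchemaLabels(Collection<String> propertyNames) {
        this.labels = new ArrayList<>(propertyNames);
    }

    /** Returns the label at the given position, mirroring getProperty(0) from the question. */
    String getProperty(int index) {
        return labels.get(index);
    }

    /** Number of property labels found in the schema. */
    int size() {
        return labels.size();
    }

    public static void main(String[] args) {
        // Stand-in for the key set read from the schema's "properties" object.
        Set<String> keySet = new LinkedHashSet<>();
        keySet.add("street_address");
        keySet.add("city");
        keySet.add("state");

        SchemaLabels address = new SchemaLabels(keySet);
        String streetName = address.getProperty(0);
        System.out.println(streetName);
    }
}
```

In the adapter, the key set obtained from the "properties" object would take the place of the hand-built set.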

How can I split and print a section of JSON/String?

I have a JSON Schema fetched from a DB, which is now stored in a String in Java. I want to print only a section of the schema, not all of it. How can I split the JSON/String and print that section?
I have tried converting the String back to JSON, but I'm not sure how to extract the required content. The split method didn't work for me either.
Input:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"employee_id": {
"type": "string"
},
"course_id": {
"type": "string"
},
"college_id": {
"type": "string"
}
},
"required": [
"employee_id",
"course_id",
"college_id"
]
}
Expected Result:
employee_id, course_id, college_id
As your question doesn't provide any details on which library you are using to parse the JSON document, I have put together some approaches using popular JSON parsing libraries for Java.
JsonPath
This is pretty straightforward to achieve with JsonPath:
List<String> required = JsonPath.parse(json).read("$.required");
Jackson
It can also be achieved with Jackson:
ObjectMapper mapper = new ObjectMapper();
List<String> required = mapper.convertValue(mapper.readTree(json).get("required"),
new TypeReference<List<String>>() {});
Gson
In case you prefer Gson:
Gson gson = new Gson();
JsonObject jsonObject = gson.fromJson(json, JsonObject.class);
List<String> required = gson.fromJson(jsonObject.getAsJsonArray("required"),
new TypeToken<List<String>>() {}.getType());
JsonPath with Jackson or Gson
Depending on your needs, you could combine JsonPath with Jackson or Gson. Pick one of the two configurations:
// Option 1: Jackson
Configuration conf = Configuration.builder()
.jsonProvider(new JacksonJsonProvider())
.mappingProvider(new JacksonMappingProvider())
.build();
// Option 2: Gson
Configuration conf = Configuration.builder()
.jsonProvider(new GsonJsonProvider())
.mappingProvider(new GsonMappingProvider())
.build();
List<String> required = JsonPath
.using(conf)
.parse(json)
.read("$.required", new TypeRef<List<String>>() {});
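Whichever approach produces the list, the expected result in the question is a comma-separated line rather than a JSON array. A minimal sketch of that last step, assuming the names are already available as a List<String>; the class name RequiredPrinter and the hard-coded list are only for illustration:

```java
import java.util.Arrays;
import java.util.List;

final class RequiredPrinter {

    /** Joins the names with ", " to match the expected result in the question. */
    static String format(List<String> required) {
        return String.join(", ", required);
    }

    public static void main(String[] args) {
        // Stand-in for the list returned by any of the parsing approaches.
        List<String> required = Arrays.asList("employee_id", "course_id", "college_id");
        System.out.println(format(required));
    }
}
```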
// Text block syntax requires Java 15+; on older versions, escape the inner quotes instead.
String str = """
{
"key":{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"employee_id": {
"type": "string"
},
"course_id": {
"type": "string"
},
"college_id": {
"type": "string"
}
},
"required": [
"employee_id",
"course_id",
"college_id"
]
}
}
""";
JSONObject jsonObj=new JSONObject(str);
JSONObject keyJon= jsonObj.getJSONObject("key");
String strUrl=keyJon.getString("$schema");
System.err.println("str "+strUrl);
I've made a helper library that uses Gson and can search JSON for subtrees/elements:
https://github.com/Enerccio/gson-utilities
In your case you would do
List<JsonElement> contents = JsonHelper.getAll(data, "key.required", x -> x instanceof JsonPrimitive);
System.out.print(contents.stream().map(JsonElement::getAsString).collect(Collectors.joining(", ")));
But your JSON is invalid; the valid version would be:
{
"key":{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"employee_id": {
"type": "string"
},
"course_id": {
"type": "string"
},
"college_id": {
"type": "string"
}
},
"required": [
"employee_id",
"course_id",
"college_id"
]
}
}
The following approach solved my problem.
JSONObject jsonObject = new JSONObject(key);
JSONArray req = jsonObject.getJSONArray("required");
System.out.println("Required Parameters : "+req);

Merge two avro schemas programmatically

I have two similar schemas where only one nested field changes (it is called onefield in schema1 and anotherfield in schema2).
schema1
{
"type": "record",
"name": "event",
"namespace": "foo",
"fields": [
{
"name": "metadata",
"type": {
"type": "record",
"name": "event",
"namespace": "foo.metadata",
"fields": [
{
"name": "onefield",
"type": [
"null",
"string"
],
"default": null
}
]
},
"default": null
}
]
}
schema2
{
"type": "record",
"name": "event",
"namespace": "foo",
"fields": [
{
"name": "metadata",
"type": {
"type": "record",
"name": "event",
"namespace": "foo.metadata",
"fields": [
{
"name": "anotherfield",
"type": [
"null",
"string"
],
"default": null
}
]
},
"default": null
}
]
}
I am able to programmatically merge both schemas using Avro 1.8.0:
Schema s1 = new Schema.Parser().parse(schema1);
Schema s2 = new Schema.Parser().parse(schema2);
Schema[] schemas = {s1, s2};
Schema mergedSchema = null;
for (Schema schema: schemas) {
mergedSchema = AvroStorageUtils.mergeSchema(mergedSchema, schema);
}
and use it to convert an input json into an avro or json representation:
JsonAvroConverter converter = new JsonAvroConverter();
try {
byte[] example = new String("{}").getBytes("UTF-8");
byte[] avro = converter.convertToAvro(example, mergedSchema);
byte[] json = converter.convertToJson(avro, mergedSchema);
System.out.println(new String(json));
} catch (AvroConversionException e) {
e.printStackTrace();
}
That code shows the expected output: {"metadata":{"onefield":null,"anotherfield":null}}. The issue is that I am not able to see the merged schema. If I do a simple System.out.println(mergedSchema) I get the following exception:
Exception in thread "main" org.apache.avro.SchemaParseException: Can't redefine: merged schema (generated by AvroStorage).merged
at org.apache.avro.Schema$Names.put(Schema.java:1127)
at org.apache.avro.Schema$NamedSchema.writeNameRef(Schema.java:561)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:689)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:715)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:700)
at org.apache.avro.Schema.toString(Schema.java:323)
at org.apache.avro.Schema.toString(Schema.java:313)
at java.lang.String.valueOf(String.java:2982)
at java.lang.StringBuilder.append(StringBuilder.java:131)
I call it the avro uncertainty principle :). It looks like avro is able to work with the merged schema, but it fails when it tries to serialize the schema to JSON. The merge works with simpler schemas, so it sounds like a bug in avro 1.8.0 to me.
Do you know what could be happening or how to solve it? Any workaround (ex: alternative Schema serializers) is welcome.
I found the same issue with the Pig utility class. There are actually two bugs here:
Avro allows data to be serialized through GenericDatumWriter using an invalid schema.
The piggybank utility class generates invalid schemas because it uses the same name/namespace for all the merged fields (instead of keeping the original name).
This implementation works properly for more complex scenarios: https://github.com/kite-sdk/kite/blob/master/kite-data/kite-data-core/src/main/java/org/kitesdk/data/spi/SchemaUtil.java#L511
Schema mergedSchema = SchemaUtil.merge(s1, s2);
From your example, I am getting the following output
{
"type": "record",
"name": "event",
"namespace": "foo",
"fields": [
{
"name": "metadata",
"type": {
"type": "record",
"name": "event",
"namespace": "foo.metadata",
"fields": [
{
"name": "onefield",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "anotherfield",
"type": [
"null",
"string"
],
"default": null
}
]
},
"default": null
}
]
}
Hopefully this will help others.
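The general idea behind such a merge, taking the union of fields by name and recursing into nested records, can be sketched without Avro. This is a simplified model and not the Kite or piggybank implementation: a record is represented as an insertion-ordered map from field name to type, where a type is either a String or a nested record map. The class RecordMerger and its helper record() are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RecordMerger {

    /** Builds an insertion-ordered record from alternating field-name/type arguments. */
    static Map<String, Object> record(Object... nameTypePairs) {
        Map<String, Object> fields = new LinkedHashMap<>();
        for (int i = 0; i < nameTypePairs.length; i += 2) {
            fields.put((String) nameTypePairs[i], nameTypePairs[i + 1]);
        }
        return fields;
    }

    /** Returns a new record containing the fields of both inputs; the inputs are not modified. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> left, Map<String, Object> right) {
        Map<String, Object> merged = new LinkedHashMap<>(left);
        for (Map.Entry<String, Object> field : right.entrySet()) {
            Object existing = merged.get(field.getKey());
            Object incoming = field.getValue();
            if (existing == null) {
                // Field only exists on the right side: add it.
                merged.put(field.getKey(), incoming);
            } else if (existing instanceof Map && incoming instanceof Map) {
                // Both sides are nested records: merge them field by field.
                merged.put(field.getKey(),
                        merge((Map<String, Object>) existing, (Map<String, Object>) incoming));
            } else if (!existing.equals(incoming)) {
                throw new IllegalArgumentException("Conflicting types for field " + field.getKey());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> schema1 = record("metadata", record("onefield", "null|string"));
        Map<String, Object> schema2 = record("metadata", record("anotherfield", "null|string"));
        System.out.println(merge(schema1, schema2));
    }
}
```

Because nested records are merged in place of the original field, each record keeps its own position and name instead of being replaced by a generically named one.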
Merging schemas is not supported for Avro files yet.
But suppose you have a directory (e.g. /demo) containing multiple Avro files with different schemas. You can read it with Spark by providing one master schema file (i.e. an .avsc file). Spark will then read all the records from the files, and if a column is missing from any file, it will show a null value for it.
import java.io.File
import org.apache.avro.Schema
import org.apache.spark.sql.SparkSession

object AvroSchemaEvolution {
  def main(args: Array[String]): Unit = {
    // Master schema that covers the columns of all files
    val schema = new Schema.Parser().parse(new File("C:\\Users\\murtazaz\\Documents\\Avro_Schema_Evolution\\schema\\emp_inserted.avsc"))
    val spark = SparkSession.builder().master("local").getOrCreate()
    // Read every Avro file in the directory using the master schema
    val df = spark.read
      .format("com.databricks.spark.avro").option("avroSchema", schema.toString)
      .load("C:\\Users\\murtazaz\\Documents\\Avro_Schema_Evolution\\demo")
    df.show()
  }
}

Having trouble deserializing JSON response with gson

I am using an API where I supply an input string, and it returns some keyword autocompletions and product nodes.
My goal is to deserialize the response and get a list of the autocompletion Strings I can use. I'm trying to implement this in an Android application with the Retrofit library, which uses gson.
First off, I'm not sure the response I have is a typical JSON response. The 'nodes' item has key / value pairs, but the input string and the autocompletions list don't seem to have keys I can use.
["pol",
["polaroid camera",
"polo",
"polo ralph lauren",
"polo ralph lauren men",
"polar heart rate monitor",
"polaroid",
"polo shirt",
"polar watch",
"police scanner",
"polar"],
[{
"nodes": [{
"alias": "electronics",
"name": "Electronics"
},
{
"alias": "electronics-tradein",
"name": "Electronics Trade-In"
}]
},
{
},
{
},
{
},
{
},
{
},
{
},
{
},
{
},
{
}],
[]]
This is my attempt at the Java classes for gson to deserialize to. However, it doesn't work because, as far as I understand, gson needs the class fields to match the JSON keys (true for the Node class but not for the rest).
class Response {
String input;
List<String> keywords;
List<Node> nodes;
}
class Node {
String alias;
String name;
}
The JSON only has a couple of keys in it; it is largely a JSON array.
If you can change the JSON, make it more like this:
{
"input" : "pol",
"keywords" : ["polaroid camera","polo",...],
"nodes": [{
"alias": "electronics",
"name": "Electronics"
},
{
"alias": "electronics-tradein",
"name": "Electronics Trade-In"
}]
}
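If the JSON cannot be changed, the top-level array can instead be mapped by position: element 0 is the input, element 1 the keyword list, element 2 the node groups. The sketch below illustrates only that positional mapping and assumes the response has already been parsed into plain Java lists and maps (for example, Gson's fromJson(json, List.class) should yield such a structure, but that is not verified here). The class AutocompleteResponse and the method fromParsedArray are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AutocompleteResponse {
    static final class Node {
        final String alias;
        final String name;
        Node(String alias, String name) { this.alias = alias; this.name = name; }
    }

    final String input;
    final List<String> keywords;
    final List<Node> nodes;

    AutocompleteResponse(String input, List<String> keywords, List<Node> nodes) {
        this.input = input;
        this.keywords = keywords;
        this.nodes = nodes;
    }

    /** Maps the positional array [input, keywords, groups, ...] onto named fields. */
    @SuppressWarnings("unchecked")
    static AutocompleteResponse fromParsedArray(List<Object> root) {
        String input = (String) root.get(0);
        List<String> keywords = new ArrayList<>((List<String>) root.get(1));
        List<Node> nodes = new ArrayList<>();
        for (Map<String, Object> group : (List<Map<String, Object>>) root.get(2)) {
            Object rawNodes = group.get("nodes");
            if (rawNodes == null) {
                continue; // most groups in the sample response are empty objects
            }
            for (Map<String, Object> node : (List<Map<String, Object>>) rawNodes) {
                nodes.add(new Node((String) node.get("alias"), (String) node.get("name")));
            }
        }
        return new AutocompleteResponse(input, keywords, nodes);
    }

    public static void main(String[] args) {
        Map<String, Object> electronics = new LinkedHashMap<>();
        electronics.put("alias", "electronics");
        electronics.put("name", "Electronics");
        Map<String, Object> group = new LinkedHashMap<>();
        group.put("nodes", Arrays.asList(electronics));
        Map<String, Object> emptyGroup = new LinkedHashMap<>();

        // Shortened stand-in for the parsed response from the question.
        List<Object> root = Arrays.asList(
                "pol",
                Arrays.asList("polaroid camera", "polo"),
                Arrays.asList(group, emptyGroup),
                Collections.emptyList());

        AutocompleteResponse response = fromParsedArray(root);
        System.out.println(response.input + " -> " + response.keywords);
    }
}
```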
