Decoding Json data to Avro classes - java

I have some files in which records are stored as plain-text JSON. A sample record:
{
"datasetID": "Orders",
"recordID": "rid1",
"recordGroupID":"asdf1",
"recordType":"asdf1",
"recordTimestamp": 100,
"recordPartitionTimestamp": 100,
"recordData":{
"customerID": "cid1",
"marketplaceID": "mid1",
"quantity": 10,
"buyingDate": "1481353448",
"orderID" : "oid1"
}
}
For each record, recordData may be null. If recordData is present, orderID may be null.
I wrote the following Avro schema to represent the structure:
[{
"namespace":"model",
"name":"OrderRecordData",
"type":"record",
"fields":[
{"name":"marketplaceID","type":"string"},
{"name":"customerID","type":"string"},
{"name":"quantity","type":"long"},
{"name":"buyingDate","type":"string"},
{"name":"orderID","type":["null", "string"]}
]
},
{
"namespace":"model",
"name":"Order",
"type":"record",
"fields":[
{"name":"datasetID","type":"string"},
{"name":"recordID","type":"string"},
{"name":"recordGroupID","type":"string"},
{"name":"recordType","type":"string"},
{"name":"recordTimestamp","type":"long"},
{"name":"recordPartitionTimestamp","type":"long"},
{"name":"recordData","type": ["null", "model.OrderRecordData"]}
]
}]
And finally, I use the following method to deserialize each String record into my Avro class:
Order jsonDecodeToAvro(String inputString) {
return new SpecificDatumReader<Order>(Order.class)
.read(null, DecoderFactory.get().jsonDecoder(Order.SCHEMA$, inputString));
}
But I keep getting the following exception when trying to read the above record:
org.apache.avro.AvroTypeException: Unknown union branch customerID
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:445)
What am I doing wrong? I am using JDK8 and Avro 1.7.7

The JSON input must be in the form:
{
"datasetID": "Orders",
"recordID": "rid1",
"recordGroupID":"asdf1",
"recordType":"asdf1",
"recordTimestamp": 100,
"recordPartitionTimestamp": 100,
"recordData":{
"model.OrderRecordData" :{
"orderID" : null,
"customerID": "cid1",
"marketplaceID": "mid1",
"quantity": 10,
"buyingDate": "1481353448"
}
}
}
This is because of the way Avro's JSON encoding handles unions and nulls.
Take a look at this:
How to fix Expected start-union. Got VALUE_NUMBER_INT when converting JSON to Avro on the command line?
There is also an open issue regarding this:
https://issues.apache.org/jira/browse/AVRO-1582
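For reference, Avro's JSON encoding writes a null union value as a bare null and any other union value as a single-key object whose key is the branch's type name (the full name for a record, such as model.OrderRecordData, or string for a string branch). The sketch below only illustrates that rule with plain string handling. The class and method names are hypothetical and are not part of the Avro API.

```java
// Hypothetical helper illustrating how Avro's JSON encoding represents union values.
// This is plain string handling for illustration, not part of the Avro library.
class UnionJson {

    // branchName: the union branch's type name, e.g. "string" or "model.OrderRecordData".
    // valueJson:  the already-serialized JSON value, or null for the "null" branch.
    static String wrap(String branchName, String valueJson) {
        if (valueJson == null) {
            return "null"; // the null branch is written as a bare null
        }
        return "{\"" + branchName + "\":" + valueJson + "}";
    }

    public static void main(String[] args) {
        // A present orderID in a ["null", "string"] union:
        System.out.println(wrap("string", "\"oid1\""));
        // An absent orderID:
        System.out.println(wrap("string", null));
        // A present recordData in a ["null", "model.OrderRecordData"] union
        // (record body abbreviated to one field here):
        System.out.println(wrap("model.OrderRecordData", "{\"quantity\":10}"));
    }
}
```

So with the schema from the question, a non-null orderID would be written as "orderID": {"string": "oid1"} rather than "orderID": "oid1".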

Related

Jackson JSON Deserialization - How to assign object members based on JSON values?

I have some ugly JSON that I need to deserialize which looks like the following:
"ContainerValues": [
{
"ParentAttribute": "QuantityContained",
"RowList": [
{
"Values": [
{
"Name": "Code",
"ValuesByLocale": {
"en-US": "GRM"
},
},
{
"Name": "Value",
"ValuesByLocale": {
"en-US": "4.0"
},
}
],
}
],
}
],
This is just a sample of the JSON I have. All I need to do is to get this into a POJO which looks like something like the following:
class POJO {
String grmValue; // This is the "Value" for the GRM "Code" above, i.e. "4.0"
...
}
Any idea how I might be able to assign the value of grmValue based on the JSON above using Jackson? I'm starting to think I'll need to write a custom deserializer.
First you have to deserialize to a class that mirrors your JSON, then transform it into your POJO format :)
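A minimal sketch of the second step, assuming the JSON has already been deserialized into classes that mirror it. The class names (Value, GrmExtractor) and the choice of the en-US locale are assumptions for illustration. The Jackson deserialization step itself is omitted, so the mirror objects are built by hand here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class GrmExtractor {

    // Returns the en-US text of the entry called `name` in one row, or null if absent.
    static String find(List<Value> row, String name) {
        for (Value v : row) {
            if (name.equals(v.name)) {
                return v.valuesByLocale.get("en-US");
            }
        }
        return null;
    }

    // Returns the "Value" of the first row whose "Code" equals `code`, or null.
    static String valueForCode(List<List<Value>> rows, String code) {
        for (List<Value> row : rows) {
            if (code.equals(find(row, "Code"))) {
                return find(row, "Value");
            }
        }
        return null;
    }

    // Builds the single sample row from the question by hand.
    static List<List<Value>> sampleRows() {
        List<Value> row = Arrays.asList(
                new Value("Code", Collections.singletonMap("en-US", "GRM")),
                new Value("Value", Collections.singletonMap("en-US", "4.0")));
        return Collections.singletonList(row);
    }

    public static void main(String[] args) {
        // This result is what would be assigned to POJO.grmValue.
        System.out.println(valueForCode(sampleRows(), "GRM")); // prints 4.0
    }
}

// Hypothetical mirror of one entry in the "Values" array.
class Value {
    final String name;
    final Map<String, String> valuesByLocale;

    Value(String name, Map<String, String> valuesByLocale) {
        this.name = name;
        this.valuesByLocale = valuesByLocale;
    }
}
```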

How to select fields in different levels of a jsonfile with jsonPath?

I want to convert JSON objects into CSV files. My (working) attempt so far is to load the JSON file as a JSONObject (from the googlecode.json-simple library), then convert it with jsonPath into a string array, which is then used to build the CSV rows. However, I am facing a problem with jsonPath. From the given example JSON...
{
"issues": [
{
"key": "abc",
"fields": {
"issuetype": {
"name": "Bug",
"id": "1",
"subtask": false
},
"priority": {
"name": "Major",
"id": "3"
},
"created": "2020-5-11",
"status": {
"name": "OPEN"
}
}
},
{
"key": "def",
"fields": {
"issuetype": {
"name": "Info",
"id": "5",
"subtask": false
},
"priority": {
"name": "Minor",
"id": "2"
},
"created": "2020-5-8",
"status": {
"name": "DONE"
}
}
}
]}
I want to select the following:
[
"abc",
"Bug",
"Major",
"2020-5-11",
"OPEN",
"def",
"Info",
"Minor",
"2020-5-8",
"DONE"
]
The csv should look like that:
abc,Bug,Major,2020-5-11,OPEN
def,Info,Minor,2020-5-8,DONE
I tried $.issues.[*].[key,fields] and I get
"abc",
{
"issuetype": {
"name": "Bug",
"id": "1",
"subtask": false
},
"priority": {
"name": "Major",
"id": "3"
},
"created": "2020-5-11",
"status": {
"name": "OPEN"
}
},
"def",
{
"issuetype": {
"name": "Info",
"id": "5",
"subtask": false
},
"priority": {
"name": "Minor",
"id": "2"
},
"created": "2020-5-8",
"status": {
"name": "DONE"
}
}
]
But when I want to select e.g. only "created" $.issues.[*].[key,fields.[created]
[
"2020-5-11",
"2020-5-8"
]
This is the result.
But I just do not get how to select "key" and e.g. "name" in the field issuetype.
How do I do that with jsonPath or is there a better way to filter a jsonfile and then convert it into a csv?
I recommend what I believe is a better way - which is to create a set of Java classes which represent the structure of your JSON data. When you read the JSON into these classes, you can manipulate the data using standard Java.
I also recommend a different JSON parser - in this case Jackson, but there are others. Why? Mainly, familiarity - see later on for more notes on that.
Starting with the end result: Assuming I have a class called Container which contains all the issues listed in the JSON file, I can then populate it with the following:
//import com.fasterxml.jackson.databind.ObjectMapper;
String jsonString = "{...}"; // your JSON data as a string, for this demo.
ObjectMapper objectMapper = new ObjectMapper();
Container container = objectMapper.readValue(jsonString, Container.class);
Now I can print out all the issues in the CSV format you want as follows:
container.getIssues().forEach((issue) -> {
printCsvRow(issue);
});
Here, the printCsvRow() method looks like this:
private void printCsvRow(Issue issue) {
String key = issue.getKey();
Fields fields = issue.getFields();
String type = fields.getIssuetype().getName();
String priority = fields.getPriority().getName();
String created = fields.getCreated();
String status = fields.getStatus().getName();
System.out.println(String.join(",", key, type, priority, created, status));
}
In reality, I would use a CSV library to ensure records are formatted correctly - the above is just for illustration, to show how the JSON data can be accessed.
The following is printed:
abc,Bug,Major,2020-5-11,OPEN
def,Info,Minor,2020-5-8,DONE
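To illustrate why a CSV library helps, here is a minimal sketch of the quoting such a library would handle for you: a field containing a comma, a double quote, or a line break is wrapped in double quotes, and embedded double quotes are doubled (the RFC 4180 convention). The class and method names are hypothetical, and this is not a substitute for a real CSV library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CsvRow {

    // Quotes a single field only when it contains a character that would break the row.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins already-extracted field values into one CSV line.
    static String join(List<String> fields) {
        return fields.stream().map(CsvRow::escape).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("abc", "Bug", "Major", "2020-5-11", "OPEN")));
        // A value with a comma and quotes would corrupt a plain String.join row:
        System.out.println(join(Arrays.asList("xyz", "Bug, \"urgent\"")));
    }
}
```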
And to filter only OPEN records, I can do something like this:
container.getIssues()
.stream()
.filter(issue -> issue.getFields().getStatus().getName().equals("OPEN"))
.forEach((issue) -> {
printCsvRow(issue);
});
The following is printed:
abc,Bug,Major,2020-5-11,OPEN
To enable Jackson, I use Maven with the following dependency:
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.10.3</version>
</dependency>
In case you don't use Maven, this gives me 3 JARs: jackson-databind, jackson-annotations, and jackson-core.
To create the nested Java classes I need (to mirror the structure of the JSON), I use a tool which generates them for me using your sample JSON.
In my case, I used this tool, but there are others.
I chose "Container" as the name of the root Java class; a source type of JSON; and selected Jackson 2.x annotations. I also requested getters and setters.
I added the generated classes (Fields, Issue, Issuetype, Priority, Status, and Container) to my project.
WARNING: The completeness of these Java classes is only as good as the sample JSON. But you can, of course, enhance these classes to more accurately reflect the actual JSON you need to handle.
The Jackson ObjectMapper takes care of loading the JSON into the class structure.
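For orientation, the generated classes have roughly the following shape. This is a hand-written sketch, not the tool's actual output: the Jackson annotations and setters are left out so that it compiles without any dependency, constructors are added for the demo, and Issuetype, Priority, and Status are reduced to their name field (the sample JSON also carries id and subtask). The ContainerDemo class is hypothetical and builds the first sample issue by hand instead of going through ObjectMapper.

```java
import java.util.Collections;
import java.util.List;

class ContainerDemo {
    // Builds the first issue of the sample JSON by hand.
    static Container sample() {
        Fields fields = new Fields(new Issuetype("Bug"), new Priority("Major"),
                "2020-5-11", new Status("OPEN"));
        return new Container(Collections.singletonList(new Issue("abc", fields)));
    }

    // Same field access as printCsvRow(), but returning the row instead of printing it.
    static String toCsvRow(Issue issue) {
        Fields f = issue.getFields();
        return String.join(",", issue.getKey(), f.getIssuetype().getName(),
                f.getPriority().getName(), f.getCreated(), f.getStatus().getName());
    }

    public static void main(String[] args) {
        sample().getIssues().forEach(issue -> System.out.println(toCsvRow(issue)));
    }
}

class Container {
    private final List<Issue> issues;
    Container(List<Issue> issues) { this.issues = issues; }
    public List<Issue> getIssues() { return issues; }
}

class Issue {
    private final String key;
    private final Fields fields;
    Issue(String key, Fields fields) { this.key = key; this.fields = fields; }
    public String getKey() { return key; }
    public Fields getFields() { return fields; }
}

class Fields {
    private final Issuetype issuetype;
    private final Priority priority;
    private final String created;
    private final Status status;

    Fields(Issuetype issuetype, Priority priority, String created, Status status) {
        this.issuetype = issuetype;
        this.priority = priority;
        this.created = created;
        this.status = status;
    }
    public Issuetype getIssuetype() { return issuetype; }
    public Priority getPriority() { return priority; }
    public String getCreated() { return created; }
    public Status getStatus() { return status; }
}

class Issuetype {
    private final String name;
    Issuetype(String name) { this.name = name; }
    public String getName() { return name; }
}

class Priority {
    private final String name;
    Priority(String name) { this.name = name; }
    public String getName() { return name; }
}

class Status {
    private final String name;
    Status(String name) { this.name = name; }
    public String getName() { return name; }
}
```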
I chose to use Jackson instead of JsonPath, simply because of familiarity. JsonPath appears to have very similar object mapping capabilities - but I have never used those features of JsonPath.
Final note: You can use xpath style predicates in JsonPath to access individual data items and groups of items - as you describe in your question. But (in my experience) it is almost always worth the extra effort to create Java classes, if you want to process all your data in more flexible ways - especially if that involves transforming the JSON input into different output structures.

How to send multiple JSON in single request(Jmeter)

Although this question may look like a duplicate, I couldn't find a solution for the JSON structure below. Please suggest.
I have an Excel sheet where the data in the columns looks like this:
CSV file data
My expected JSON as:
{
"Child ": {
"10"
: { "Post": { "Kid-R":1 },
"Var": [1,1 ],
"Tar": [2,2],
"Fur": [3,3]},
"11":
{"Post": {"Kid-R":2 },
"Var": [1,1 ],
"Tar": [2,2 ],
"Fur": [5,4 ]}
},
"Clone": [],
"Birth": 2,
"TT": 11,
"Clock": ${__time(/1000,)}
}
I have tried incorporating beanshell preprocessor in JMeter & tried below code:
def builder = new groovy.json.JsonBuilder()
@groovy.transform.Immutable
class Child {
String post
String var
String Tar
String Fur
}
def villas = new File("Audit_27.csv")
.readLines()
.collect { line ->
new child (line.split(",")[1],(line.split(",")
[2]+","+line.split(",")[3]),(line.split(",")[4]+","+line.split(",")
[5]),(line.split(",")[6]+","+line.split(",")[7]))}
builder(
Child :villas.collect(),
"Clone": [],
"Birth": 2,
"TT": 11,
"Clock": ${__time(/1000,)}
)
log.info(builder.toPrettyString())
vars.put("payload", builder.toPrettyString())
And I could see below response only:
Note: I don't know how to declare the "Key" value (line.split(",")[0]) in the above solution.
{
"Child": [
{
"post": "\"\"\"Kid-R\"\":1\"",
"var": "\"[2,2]\"",
"Tar": "\"[1,1]\"",
"Fur": "\"[3,3]\""
},
{
"post": "\"\"\"Kid-R\"\":2\"",
"var": "\"[2,2]\"",
"Tar": "\"[1,1]\"",
"Fur": "\"[3,3]\""
}
],
"Clone": [],
"Birth": 2,
"TT": 11,
"CLock": 1585219797
}
Any help would be greatly appreciated
You're copying and pasting the solution from this answer without understanding what you're doing.
If you change the class name from VILLA to your own, you need to use your own class name in the new expression instead of new VILLA.
Also this line won't compile: Clock: <take system current time> you need to use System.currentTimeMillis() or appropriate function of the Date class in order to generate the timestamp.
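As a small sketch of that last point, either of the following yields the current Unix time in seconds, which appears to be what the Clock field holds (1585219797 in the sample response). The class and method names are hypothetical.

```java
import java.time.Instant;

class ClockValue {

    // Epoch seconds derived from the millisecond clock (integer division truncates).
    static long viaMillis() {
        return System.currentTimeMillis() / 1000L;
    }

    // Epoch seconds taken directly from java.time.
    static long viaInstant() {
        return Instant.now().getEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(viaMillis());
        System.out.println(viaInstant());
    }
}
```

If this is moved into a Groovy script, Instant.now().getEpochSecond() is probably the safer of the two, since Groovy's / operator on integers does not truncate the way Java's does.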
If you want a comprehensive answer, you need to provide:
Well-formatted CSV file
Valid JSON payload
In the meantime I would recommend getting familiarized with the following material:
Apache Groovy: Parsing and producing JSON
Apache Groovy - Why and How You Should Use It
Reading a File in Groovy
Actually, I am going to follow DmirtiT's suggestion, mentioned in some other posts, to use a random variable for bulk API requests. The same answer helped me here as well to generate multiple JSON structures with unique data. Thanks.

How to tell jackson to serialize to json without attributes prefix

In my Java IDE, I have configured a prefix for my fields.
The result is that during Java-to-JSON serialization all my attributes are prefixed with an _. Is there a simple way to avoid this?
Actual
{
"_creation": {
"_dateTime": "2016-08-16T11:13:09.000Z",
"_personId": 1
},
"_description": null,
"_firstName": "Jason",
"_id": 700,
"_lastName": "Stateman",
"_modification": {
"_dateTime": "2016-08-16T11:13:24.000Z",
"_personId": null
}
}
Wanted
{
"creation": {
"dateTime": "2016-08-16T11:13:09.000Z",
"personId": 1
},
"description": null,
"firstName": "Jason",
"id": 700,
"lastName": "Stateman",
"modification": {
"dateTime": "2016-08-16T11:13:24.000Z",
"personId": null
}
}
If you are using FasterXML (Jackson) to serialize your objects, you could add the @JsonProperty annotation to your class attributes to control the serialized attribute name. See the documentation at: https://github.com/FasterXML/jackson-annotations/wiki/Jackson-Annotations#property-naming

JSON file parsing procedure for fields that are not present in every array element

For example, I have a JSON file:
"ref": [{
"af": [
1
],
"speaker": true,
"name": "Fahim"
},
{
"aff": [
1
],
"name": "Grewe"
}]
During parsing, if a field is not present in every array element (like speaker here), a NullPointerException is thrown. So what is the procedure for parsing fields that are not present in every element?
A nice JSON parsing library like this one will have different levels of validation:
https://code.google.com/p/quick-json/
You can set custom validation rules, or use a non-validating version which will just parse without checking standards etc.
Have you tried:
var ref = YourObject.ref;
for (var i = 0; i < ref.length; i++) {
// A missing field is undefined, not null; the loose != check covers both.
if (ref[i].speaker != null) {
//do something
}
}
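If the parsing is done in Java (the NullPointerException suggests so), the same idea can be sketched with plain maps standing in for the parsed objects. The map-based representation and the class name are assumptions. The point is only that a missing key yields null, so the value has to be checked before it is unboxed or dereferenced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OptionalFieldDemo {

    // Boolean.TRUE.equals(null) is false, so a missing "speaker" key cannot cause an NPE.
    static boolean isSpeaker(Map<String, Object> entry) {
        return Boolean.TRUE.equals(entry.get("speaker"));
    }

    // Collects the names of all entries that are explicitly marked as speaker.
    static List<String> speakerNames(List<Map<String, Object>> ref) {
        List<String> names = new ArrayList<>();
        for (Map<String, Object> entry : ref) {
            if (isSpeaker(entry)) {
                names.add(String.valueOf(entry.get("name")));
            }
        }
        return names;
    }

    // Hand-built stand-in for the parsed "ref" array from the question.
    static List<Map<String, Object>> sampleRef() {
        Map<String, Object> first = new HashMap<>();
        first.put("speaker", true);
        first.put("name", "Fahim");

        Map<String, Object> second = new HashMap<>(); // no "speaker" key at all
        second.put("name", "Grewe");

        List<Map<String, Object>> ref = new ArrayList<>();
        ref.add(first);
        ref.add(second);
        return ref;
    }

    public static void main(String[] args) {
        System.out.println(speakerNames(sampleRef())); // prints [Fahim]
    }
}
```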
