Json Array Flattening with Jackson / Parsing Performance - java

I have a JSON like this below:
[{
"id": "1",
"teams": [{
"name": "barca"
},
{
"name": "real"
}
]
},
{
"id": "2"
},
{
"id": "3",
"teams": [{
"name": "atletico"
},
{
"name": "cz"
}
]
}
]
My target POJO is:
class Team {
    int id;
    String name;
}
That is, for each entry in "teams" I want to create a new object, like:
new Team(1,barca)
new Team(1,real)
new Team(2,null)
new Team(3,atletico)
...
Which I believe I did with a custom deserializer like the one below:
JsonNode rootArray = jsonParser.readValueAsTree();
for (JsonNode root : rootArray) {
    // asInt()/asText() return the plain value; toString() would keep the JSON quotes
    int id = root.get("id").asInt();
    JsonNode teamsNodeArray = root.get("teams");
    if (teamsNodeArray != null) {
        for (JsonNode teamNode : teamsNodeArray) {
            String nameString = teamNode.get("name").asText();
            teamList.add(new Team(id, nameString));
        }
    } else {
        teamList.add(new Team(id, null));
    }
}
Considering I am getting 750k records, I believe the two nested for loops make the code much slower than it should be. It takes ~7 min.
My question is: is there a better way to do this?
PS: I have checked many stackoverflow threads for this, could not find anything that fits so far.
Thank you in advance.

Do not parse the data yourself; use automatic de/serialization whenever possible.
Using Jackson, it could be as simple as:
MyData myData = new ObjectMapper().readValue(rawData, MyData.class);
For your specific example, we generate a really big instance (10M rows):
$ head big.json
[{"id": 1593, "group": "6141", "teams": [{"id": 10502, "name": "10680"}, {"id": 16435, "name": "18351"}]}
,{"id": 28478, "group": "3142", "teams": [{"id": 30951, "name": "3839"}, {"id": 25310, "name": "19839"}]}
,{"id": 29810, "group": "8889", "teams": [{"id": 5586, "name": "8825"}, {"id": 27202, "name": "7335"}]}
...
$ wc -l big.json
10000000 big.json
Then, define classes matching your data model (e.g.):
public static class Team {
    public int id;
    public String name;
}
public static class Group {
    public int id;
    public String group;
    public List<Team> teams;
}
Now you can read the data directly with:
List<Group> xs = new ObjectMapper()
    .readValue(
        new File(".../big.json"),
        new TypeReference<List<Group>>() {});
A complete program could be:
public static void main(String... args) throws IOException {
    long t0 = System.currentTimeMillis();
    List<Group> xs = new ObjectMapper().readValue(
        new File("/home/josejuan/tmp/1/big.json"),
        new TypeReference<List<Group>>() {});
    long t1 = System.currentTimeMillis();
    // test: add all group ids
    long groupIds = xs.stream().mapToLong(x -> x.id).sum();
    long t2 = System.currentTimeMillis();
    System.out.printf("Group id sum := %d, Read time := %d mS, Sum time = %d mS%n", groupIds, t1 - t0, t2 - t1);
}
With output:
Group id sum := 163827035542, Read time := 10710 mS, Sum time = 74 mS
Only 11 seconds to parse 10M rows.
To check data and compare performance, we can read directly from disk:
$ perl -n -e 'print "$1\n" if /"id": ([0-9]+), "group/' big.json | time awk '{s+=$1}END{print s}'
163827035542
4.96user
That takes about 5 seconds, so the Java code is only about twice as slow.
The non-performance part of the problem, processing the data, can be solved in many ways depending on how you want to use the information. For example, collecting all the teams, or indexing the unique teams by id, can be done like this:
// assumes static imports of Collectors.toList and Collectors.toMap
List<Team> teams = xs.stream()
    .flatMap(x -> x.teams.stream())
    .collect(toList());
Map<Integer, Team> uniqTeams = xs.stream()
    .flatMap(x -> x.teams.stream())
    .collect(toMap(
        x -> x.id,
        x -> x,
        (a, b) -> a));
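To get from deserialized groups to the flat rows the question asks for (one row per team, plus a row with a null name for a group without teams), a single pass is enough. The sketch below uses only the JDK; `Group`, `Team` and `TeamRow` are hypothetical stand-ins for the real model, and treating an empty "teams" list like a missing one is an assumption. Note that the nested loop visits each team exactly once, so its cost is linear in the total number of teams.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenTeams {
    // Hypothetical stand-ins for the deserialized model (names follow the question's JSON).
    record Team(String name) {}
    record Group(int id, List<Team> teams) {}
    // One output row per (group id, team name) pair.
    record TeamRow(int id, String name) {}

    static List<TeamRow> flatten(List<Group> groups) {
        List<TeamRow> rows = new ArrayList<>();
        for (Group g : groups) {
            if (g.teams() == null || g.teams().isEmpty()) {
                // Assumption: a group without teams still yields one row, with a null name.
                rows.add(new TeamRow(g.id(), null));
            } else {
                for (Team t : g.teams()) {
                    rows.add(new TeamRow(g.id(), t.name()));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Group> groups = List.of(
            new Group(1, List.of(new Team("barca"), new Team("real"))),
            new Group(2, null),
            new Group(3, List.of(new Team("atletico"), new Team("cz"))));
        flatten(groups).forEach(System.out::println);
    }
}
```

With the sample from the question this produces five rows: two for id 1, one for id 2 and two for id 3.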

Related

How to sort JsonNode array List

I am trying to sort the array node below within my JSON payload. It should be sorted by the name element.
Before:
"empData": [
{
"name": "Jhon",
"age": 33
},
{
"name": "Sam",
"age": 24
},
{
"name": "Mike",
"age": 65
},
{
"name": "Jenny",
"age": 33
}
]
Expected:
"empData": [
{
"name": "Jenny",
"age": 33
},
{
"name": "Jhon",
"age": 33
},
{
"name": "Mike",
"age": 65
},
{
"name": "Sam",
"age": 24
}
]
I tried the option below:
private static final ObjectMapper SORTED_MAPPER = new ObjectMapper();
static {
    SORTED_MAPPER.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);
}
public static JsonNode sortJsonArrayList(final JsonNode node) throws IOException {
    final Object obj = SORTED_MAPPER.treeToValue(node, Object.class);
    final String json = SORTED_MAPPER.writeValueAsString(obj);
    return SORTED_MAPPER.readTree(json);
}
But I am not sure how to select the name key for sorting.
Try something like this:
public JsonNode some(JsonNode node) {
    // find all objects that contain a "name" field
    List<JsonNode> dataNodes = node.findParents("name");
    // sort them by name
    List<JsonNode> sortedDataNodes = dataNodes
        .stream()
        .sorted(Comparator.comparing(o -> o.get("name").asText()))
        .collect(Collectors.toList());
    // return the same JSON structure as in the method parameter
    ArrayNode arrayNode = objectMapper.createObjectNode().arrayNode().addAll(sortedDataNodes);
    return objectMapper.createObjectNode().set("empData", arrayNode);
}
You can debug it step by step and see how it works.
ORDER_MAP_ENTRIES_BY_KEYS is the feature that enables sorting of Map entries by their keys. Because your result type is not a Map, this sorting is not applied. To sort the array, you can create a custom deserialiser or sort the deserialised array in place. This answer describes how this can be achieved.
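Sorting the deserialised array in place can be sketched with the JDK alone. This assumes the "empData" elements have already been bound to a POJO; the `Employee` record here is a hypothetical example of such a type, not something from the question.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortByName {
    // Hypothetical POJO mirroring one element of "empData".
    record Employee(String name, int age) {}

    // Sorts the given (mutable) list in place by the name field.
    static void sortByName(List<Employee> employees) {
        employees.sort(Comparator.comparing(Employee::name));
    }

    public static void main(String[] args) {
        List<Employee> empData = new ArrayList<>(List.of(
            new Employee("Jhon", 33), new Employee("Sam", 24),
            new Employee("Mike", 65), new Employee("Jenny", 33)));
        sortByName(empData);
        empData.forEach(System.out::println);
    }
}
```

The comparison is the natural String ordering, which is case-sensitive; String.CASE_INSENSITIVE_ORDER can be passed as a second argument to Comparator.comparing if that matters for your data.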

Group JSON based on ID [closed]

I have this JSON:
{
"content": [
{
"idnumber": "666",
"name": "mark",
"type": "band",
"tools": [
{
"idtools": "5657",
"blabla": null,
"blabla": false,
}
]
},
{
"idnumber": "666",
"name": "mark",
"type": "band",
"tools": [
{
"idtools": "5658",
"blabla": null,
"blabla": false
}
]
}
]
}
Inside the content array I have two JSON objects. Because they have the same idnumber, I want to merge them so that my JSON becomes this:
{
"content": [
{
"idnumber": "666",
"name": "mark",
"type": "band",
"tools": [
{
"idtools": "5657",
"blabla": null,
"blabla": false,
},
{
"idtools": "5658",
"blabla": null,
"blabla": false
}
]
}
]
}
How can I do that using distinct or filter?
I tried to use distinct and map, but I still get an error.
Assuming the following objects that match your JSON structure (for sake of brevity, I use Lombok):
@Data
@AllArgsConstructor
@NoArgsConstructor
class Content {
    int idNumber;
    String name;
    String type;
    List<Tool> tools;
}
@Data
@AllArgsConstructor
@NoArgsConstructor
class Tool {
    int idTools;
    String blabla;
}
You can use the Stream API: group the contents by id with groupingBy, then reduce each group into a single Content.
List<Content> mergedContents = contents.stream()
    .collect(Collectors.groupingBy(Content::getIdNumber))
    .values()
    .stream()
    .reduce(
        new ArrayList<>(),                               // mutable List
        (left, right) -> {
            Content content = right.get(0);              // they are same (by id)
            List<Tool> tools = right.stream()            // from each new list
                .flatMap(c -> c.getTools().stream())     // .. flatmap the tools
                .collect(Collectors.toList());           // .. and extract to a list
            content.setTools(tools);                     // set the List<Tool>
            left.add(content);                           // add the Content
            return left;                                 // return the left list
        },
        (left, right) -> left);                          // only for parallel Stream
The resulting structure coming from Collectors.groupingBy(Content::getIdNumber) is Map<Integer, List<Content>>. The subsequent mutable reduction on the map values (Collection<List<Content>>) merges each List<Content> with an identical idNumber into a single Content with a flatmapped List<Tool>. The List of these modified Content objects is returned as the result of the reduction.
Sample data
List<Content> contents = new ArrayList<>();
contents.add(new Content(666, "Mark", "Band",
Collections.singletonList(new Tool(5657, null))));
contents.add(new Content(666, "Mark", "Band",
Collections.singletonList(new Tool(5658, null))));
List<Content> mergedContents = /* solution */
mergedContents.forEach(System.out::println);
Main.Content(idNumber=666, name=Mark, type=Band, tools=[Main.Tool(idTools=5657, blabla=null), Main.Tool(idTools=5658, blabla=null)])
This matches the structure of your expected JSON sample.
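The same merge can also be written without mutating the input objects or reducing into a shared list. The following is only a sketch under assumptions: it uses plain records instead of the Lombok classes, `Tool` is cut down to its id, and a LinkedHashMap is used so that ids keep their first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MergeById {
    // Simplified, hypothetical versions of the answer's classes.
    record Tool(int idTools) {}
    record Content(int idNumber, String name, String type, List<Tool> tools) {}

    static List<Content> merge(List<Content> contents) {
        // Group by id; LinkedHashMap keeps the order in which ids first appear.
        Map<Integer, List<Content>> byId = contents.stream()
            .collect(Collectors.groupingBy(
                Content::idNumber, LinkedHashMap::new, Collectors.toList()));

        List<Content> merged = new ArrayList<>();
        for (List<Content> group : byId.values()) {
            Content first = group.get(0); // name and type are taken from the first entry
            List<Tool> tools = group.stream()
                .flatMap(c -> c.tools().stream())
                .collect(Collectors.toList());
            // Build a new Content instead of modifying an existing one.
            merged.add(new Content(first.idNumber(), first.name(), first.type(), tools));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Content> contents = List.of(
            new Content(666, "Mark", "Band", List.of(new Tool(5657))),
            new Content(666, "Mark", "Band", List.of(new Tool(5658))));
        merge(contents).forEach(System.out::println);
    }
}
```

Building a fresh Content per group leaves the original list untouched, which can be easier to reason about when the input is reused elsewhere.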

Load mathematical formulae dynamically and evaluating values using Script Engine

I need to implement a feature for a program that calculates metrics based on predefined measures with the requirement that the addition of a new metric (and/or measure) should require minimal code level changes. This is how I have done it:
class Measure { // class to hold the measures
    Integer id;
    String name;
    Double value; // Double rather than Long, because values such as 9.81 are fractional
    // necessary getters and setters
}
These measures are retrieved from a MySQL DB.
// the measures stored in an ArrayList of objects of the above Measure class
[
{
"id": 1,
"name": "Displacement",
"value": 200
},
{
"id": 2,
"name":"Time",
"value": 120
},
{
"id":3,
"name":"Mass",
"value": 233
},
{
"id":4,
"name": "Acceleration",
"value": 9.81
},
{
"id": 5,
"name":"Speed of Light",
"value": 300000000
}
]
I need to get metrics such as the following:
Velocity (Displacement/Time), Force (Mass * Acceleration) and Energy (Mass * Speed of Light^2)
I implemented the following JSON in which I have defined the above formulae:
[
{
"title": "Velocity",
"unit": "m/s",
"formula": "( displacement / time )"
},
{
"title": "Force",
"unit": "N",
"formula": "( mass * acceleration )"
},
{
"title": "Energy",
"unit": "J",
"formula": "( mass * speed_of_light * speed_of_light )"
}
]
I then calculate metrics in the following way:
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
class Evaluator {
    ScriptEngine engine = new ScriptEngineManager().getEngineByExtension("js");
    public Evaluator(List<Measure> measures) {
        measures.forEach(measure -> {
            // convert "Speed of Light" -> "speed_of_light" etc.
            String fieldName = measure.getName().replace(" ", "_").toLowerCase();
            // the value is numeric already, so no string parsing is needed
            engine.put(fieldName, measure.getValue().doubleValue());
        });
    }
    public void formula(String formula) throws Exception {
        engine.eval("function calculateMetric() { return " + formula + " }");
    }
    public Object evaluate() throws ScriptException {
        return engine.eval("calculateMetric()");
    }
}
The above class is used to load each formula into the Script Engine and then calculate the metric value based on the formula provided.
// load JSON file into JsonArray
JsonArray formulae = parseIntoJsonArray(FileUtils.getContent("formulae.json"));
Evaluator evaluator = new Evaluator(measures);
for (Object formula : formulae) {
JsonObject jsonObj = (JsonObject) formula;
evaluator.formula(jsonObj.get("formula").getAsString());
Double metricVal = Double.parseDouble(evaluator.evaluate().toString());
// do stuff with value
}
The code works as expected. I want to know whether I am doing anything wrong that could affect the program in the long run, whether anything here is not best practice, and whether there is a better or easier way to accomplish the same thing.
This question is closely related to a question I posted yesterday. Sort of a second part.

JSON to Java: How to model lists of objects into generic model

I'm making a spreadsheet using SpreadJS, and I should be able to add, delete and change the value of a key nested inside many objects. Here is how my JSON is formatted:
{
"version": "10.0.0",
"sheets": {
"Sheet1": {
"name": "Sheet1",
"data": {
"dataTable": {
"0": {
"0": {
"value": 129
}
}
}
},
"selections": {
"0": {
"row": 0,
"rowCount": 1,
"col": 0,
"colCount": 1
},
"length": 1
},
"theme": "Office",
"index": 0
}
}
}
The data represents, say, the value of each cell in the spreadsheet: [0,0], [0,1], [1,1] and so on. I want to parse this data into a generic model. For the field dataTable I would like a representation like Map<Integer, Map<Integer, ValueObj>>, for example <0, <0, 129>> in this case, but I couldn't find out how to do that or what my model should look like.
I am new to JSON; any help is appreciated! Thanks
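Then, to handle the data, you can have a generic class whose field name matches the "value" key in the JSON:
class CellData<T> {
    public T value; // public (or exposed through a getter/setter) so that Jackson can bind it
}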
Then read it as below:
String jsonInput = "{ \"0\": { \"0\": { \"value\": 129 } } }";
ObjectMapper mapper = new ObjectMapper();
// the TypeReference uses the same Map types as the variable, so the assignment compiles
TypeReference<Map<Integer, Map<Integer, CellData<Integer>>>> typeRef =
    new TypeReference<Map<Integer, Map<Integer, CellData<Integer>>>>() {};
Map<Integer, Map<Integer, CellData<Integer>>> map = mapper.readValue(jsonInput, typeRef);
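Once the dataTable is in a nested map, the add, change and delete operations the question mentions are ordinary Map operations. The sketch below is JDK-only and rests on assumptions: the helper names `put`, `get` and `remove` are made up for illustration, and the holder class simply carries a `value` field like the "value" key in the JSON.

```java
import java.util.HashMap;
import java.util.Map;

public class DataTableOps {
    // Minimal holder for one cell; "value" mirrors the JSON key.
    static class CellData<T> {
        T value;
        CellData(T value) { this.value = value; }
    }

    // Adds or overwrites the cell at (row, col), creating the row map on demand.
    static <T> void put(Map<Integer, Map<Integer, CellData<T>>> table, int row, int col, T value) {
        table.computeIfAbsent(row, r -> new HashMap<>()).put(col, new CellData<>(value));
    }

    // Returns the cell value, or null when the cell does not exist.
    static <T> T get(Map<Integer, Map<Integer, CellData<T>>> table, int row, int col) {
        Map<Integer, CellData<T>> cells = table.get(row);
        CellData<T> cell = (cells == null) ? null : cells.get(col);
        return (cell == null) ? null : cell.value;
    }

    // Removes the cell at (row, col) and drops the row once it is empty.
    static <T> void remove(Map<Integer, Map<Integer, CellData<T>>> table, int row, int col) {
        Map<Integer, CellData<T>> cells = table.get(row);
        if (cells != null) {
            cells.remove(col);
            if (cells.isEmpty()) {
                table.remove(row);
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, CellData<Integer>>> table = new HashMap<>();
        put(table, 0, 0, 129);   // add cell [0,0]
        put(table, 0, 0, 130);   // change its value
        System.out.println(get(table, 0, 0));
        remove(table, 0, 0);     // delete it again
        System.out.println(table.isEmpty());
    }
}
```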

How to deserialize JSON with dynamic numeric field names, using GSON?

I have a JSON file in this format:
title 1
{
"0" : 2,
"1" : 5,
"2" : 8
}
title 2
{
"1" : 44,
"2" : 15,
"3" : 73,
"4" : 41
}
As you can see the indexes are dynamic - in title 1 they were: "0","1","2" and for title 2 they are: "1","2","3","4"
I don't know how to read this using GSON.
I need to somehow convert it into a java object so I can go on and process the data.
Any help is most welcome.
First of all, the JSON shown in the question is not valid JSON, so my recommendation is based on the following JSON. Just giving you full disclosure.
{
"title 1":{
"0":2,
"1":5,
"2":8
},
"title 2":{
"1":44,
"2":15,
"3":73,
"4":41
}
}
OPTION 1 (not how I would solve this issue)
This would deserialize it into a general object that you could loop through to process.
new Gson().fromJson(yourJsonString,Object.class);
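Looping through such an untyped result could look roughly like the sketch below. It is JDK-only: the nested Map built in main is a hand-made stand-in for the parsed object (to my knowledge Gson returns nested Maps with numbers as Double when asked for Object.class, but treat that as an assumption), and `toTyped` is a hypothetical helper name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class DynamicKeys {
    // Converts an untyped {title -> {index -> number}} structure into typed, sorted maps.
    static Map<String, TreeMap<Integer, Integer>> toTyped(Object parsed) {
        Map<String, TreeMap<Integer, Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<?, ?> title : ((Map<?, ?>) parsed).entrySet()) {
            TreeMap<Integer, Integer> values = new TreeMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) title.getValue()).entrySet()) {
                // The dynamic field names ("0", "1", ...) become integer keys.
                int index = Integer.parseInt(e.getKey().toString());
                values.put(index, ((Number) e.getValue()).intValue());
            }
            result.put(title.getKey().toString(), values);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the parsed JSON; values are doubles on purpose.
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("title 1", Map.of("0", 2.0, "1", 5.0, "2", 8.0));
        parsed.put("title 2", Map.of("1", 44.0, "2", 15.0, "3", 73.0, "4", 41.0));
        toTyped(parsed).forEach((title, values) -> System.out.println(title + " -> " + values));
    }
}
```

The casts are unchecked by design, so input with a different shape will fail with a ClassCastException or NumberFormatException rather than be silently accepted.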
OPTION 2 (best bet in my opinion)
If you had control over how the object came in, I would do something like this:
{
"listOfTitles":[
{
"title":[
{
"key":"0",
"value":1234
},
{
"key":"1",
"value":12341234
},
{
"key":"2",
"value":123412341234
}
],
"titleName":"title 1"
},
{
"title":[
{
"key":"0",
"value":12341
},
{
"key":"1",
"value":123412
},
{
"key":"2",
"value":12
},
{
"key":"3",
"value":12341
}
],
"titleName":"title 2"
}
]
}
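This would allow you to build objects like:
public class YouObjectName {
    private ArrayList<TitleGroup> listOfTitles;
    // constructor
    // getters and setters
}
// TitleGroup is an illustrative name for one element of "listOfTitles"
public class TitleGroup {
    private ArrayList<Title> title;
    private String titleName;
    // constructor
    // getters and setters
}
public class Title {
    private String key;
    private Long value; // Long, because sample values such as 123412341234 exceed the int range
    // constructor
    // getters and setters
}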
I would consume this with GSON like this:
new Gson().fromJson(jsonString,YouObjectName.class);
Hope that helps a little.
