Java Stream to aggregate list of JSONs into groups - java

I have a list of objects in Java, like this:
[
{
"applicationNumber": "100400",
"users": "A",
"category": "student"
},
{
"applicationNumber": "100400",
"users":"B",
"category": "student"
},
{
"applicationNumber": "100400",
"users":"C",
"category": "neighbour"
},
{
"applicationNumber": "100400",
"users": "D",
"category": "neighbour"
},
{
"applicationNumber": "200543",
"users": "C",
"category": "student"
},
{
"applicationNumber": "200543",
"users": "A",
"category": "student"
},
{
"applicationNumber": "200543",
"users":"D",
"category": "friend"
}
]
I want to group the users into a list (order does not matter) for each category within every applicationNumber. The JSON below illustrates the desired result.
[
{
"applicationNumber": "100400",
"users": [
"A",
"B"
],
"category": "student"
},
{
"applicationNumber": "100400",
"users": [
"C",
"D"
],
"category": "neighbour"
},
{
"applicationNumber": "200543",
"users": [
"C",
"A"
],
"category": "student"
},
{
"applicationNumber": "200543",
"users": [
"D"
],
"category": "friend"
}
]
I am able to do this using a for loop, a HashMap and if-else conditions. I want to use Java 8 streams to achieve the same result. Can anyone help me? I am new to Java.
PS: Thank you in advance.

I think using streams here is a bit of overengineering, but you can do it in two steps. First, use Collectors.groupingBy() to group your POJOs into a map of lists. Next, reduce each list to a single value using stream().reduce().
ObjectMapper mapper = new ObjectMapper()
        .enable(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY);
List<Application> applications = Arrays.asList(mapper.readValue(json, Application[].class));
List<Application> groupedApplications = applications.stream()
        .collect(Collectors.groupingBy(ApplicationKey::of, Collectors.toList()))
        .values().stream()
        .map(apps -> apps.stream().reduce(Application::merge))
        .filter(Optional::isPresent)
        .map(Optional::get)
        .collect(Collectors.toList());
Application.java:
public class Application {
    private String applicationNumber;
    private String category;
    private List<String> users = new ArrayList<>();

    public static Application merge(Application first, Application second) {
        assert ApplicationKey.of(first).equals(ApplicationKey.of(second));
        // copy the user list so merging never mutates the first application's state
        Application merged = new Application(first.applicationNumber, first.category,
                new ArrayList<>(first.getUsers()));
        merged.users.addAll(second.getUsers());
        return merged;
    }

    //constructor, getters, setters
}
ApplicationKey.java
public class ApplicationKey {
    private String applicationNumber;
    private String category;

    public static ApplicationKey of(Application application) {
        return new ApplicationKey(application.getApplicationNumber(), application.getCategory());
    }

    public ApplicationKey(String applicationNumber, String category) {
        this.applicationNumber = applicationNumber;
        this.category = category;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ApplicationKey that = (ApplicationKey) o;
        return Objects.equals(applicationNumber, that.applicationNumber) &&
                Objects.equals(category, that.category);
    }

    @Override
    public int hashCode() {
        return Objects.hash(applicationNumber, category);
    }

    //getters, setters
}
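As an aside, the reduce step can be skipped entirely with a grouping collector. This is only a sketch, assuming the same Application and ApplicationKey classes above (including the three-argument Application constructor) and that each parsed Application carries exactly one user:
// Sketch: group users per (applicationNumber, category) key in a single pass.
// Assumes each input Application holds exactly one user after parsing.
List<Application> grouped = applications.stream()
        .collect(Collectors.groupingBy(
                ApplicationKey::of,
                Collectors.mapping(a -> a.getUsers().get(0), Collectors.toList())))
        .entrySet().stream()
        .map(e -> new Application(e.getKey().getApplicationNumber(),
                e.getKey().getCategory(), e.getValue()))
        .collect(Collectors.toList());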

Related

How to Convert Tree Structure List to Json

I have a query result like this:
DEPT_NAME   | DEPT_R_IDX | DEPT_IDX | DEPT_LEVEL
------------|------------|----------|-----------
root        | 0          | 1        | 0
dept_0      | 1          | 2        | 1
dept_1      | 1          | 3        | 1
dept_1_0    | 3          | 4        | 2
dept_1_1    | 3          | 5        | 2
dept_2      | 1          | 6        | 1
dept_2_0    | 6          | 7        | 2
dept_2_0_1  | 7          | 8        | 3
DEPT_IDX is the primary key, DEPT_LEVEL is the tree depth, and DEPT_R_IDX is the parent's DEPT_IDX.
I store this data in a Java List:
List<HashMap<String, String>> DEPT_LIST;
I want to convert this List to JSON like this:
[
{
"title": "root",
"key": "1",
"expanded": true,
"folder": true,
"children": [
{
"key": "2",
"title": "dept_0",
"expanded": true,
"folder": true
},
{
"key": "3",
"title": "dept_1",
"expanded": true,
"folder": true,
"children": [
{
"key": "4",
"title": "dept_1_0",
"expanded": true,
"folder": true
},
{
"key": "5",
"title": "dept_1_1",
"expanded": true,
"folder": true
}
]
},
{
"key": "6",
"title": "dept_2",
"expanded": true,
"folder": true,
"children": [
{
"key": "7",
"title": "dept_2_0",
"expanded": true,
"folder": true,
"children": [
{
"key": "8",
"title": "dept_2_1",
"expanded": true,
"folder": true
}
]
}
]
}
]
}
]
(Result tree rendered from this JSON data, using fancytree.)
I tried building the tree on the browser side, but the performance of constructing the JSON structure there is too poor.
Instead of storing your data in a List<HashMap<String, String>>, store it in a List<Node>, where the Node class is something like this:
public class Node {
    private String title;
    private String key;
    private boolean expanded;
    private boolean folder;
    private List<Node> children;

    public String getTitle() {
        return title;
    }
    public void setTitle(String title) {
        this.title = title;
    }
    public String getKey() {
        return key;
    }
    public void setKey(String key) {
        this.key = key;
    }
    public boolean isExpanded() {
        return expanded;
    }
    public void setExpanded(boolean expanded) {
        this.expanded = expanded;
    }
    public boolean isFolder() {
        return folder;
    }
    public void setFolder(boolean folder) {
        this.folder = folder;
    }
    public List<Node> getChildren() {
        return children;
    }
    public void setChildren(List<Node> children) {
        this.children = children;
    }
}
Then use Jackson or Gson to generate the corresponding JSON.
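For completeness, here is a minimal sketch of building the Node tree from the flat rows and serializing it with Jackson. The method name and row keys follow the question; treat this as an illustration rather than a drop-in implementation:
// Sketch: build the Node tree from the flat DEPT_LIST rows, then serialize with Jackson.
// Assumes each row holds DEPT_NAME, DEPT_IDX and DEPT_R_IDX as shown in the question.
public String toTreeJson(List<HashMap<String, String>> deptList) throws JsonProcessingException {
    Map<String, Node> byKey = new LinkedHashMap<>();
    List<Node> roots = new ArrayList<>();
    for (HashMap<String, String> row : deptList) {
        Node node = new Node();
        node.setTitle(row.get("DEPT_NAME"));
        node.setKey(row.get("DEPT_IDX"));
        node.setExpanded(true);
        node.setFolder(true);
        byKey.put(node.getKey(), node);
    }
    for (HashMap<String, String> row : deptList) {
        Node node = byKey.get(row.get("DEPT_IDX"));
        Node parent = byKey.get(row.get("DEPT_R_IDX"));
        if (parent == null) {
            roots.add(node); // no parent with that DEPT_IDX: top-level node
        } else {
            if (parent.getChildren() == null) {
                parent.setChildren(new ArrayList<>());
            }
            parent.getChildren().add(node);
        }
    }
    // NON_NULL keeps leaf nodes from getting a "children": null entry
    ObjectMapper mapper = new ObjectMapper()
            .setSerializationInclusion(JsonInclude.Include.NON_NULL);
    return mapper.writeValueAsString(roots);
}
This relies on DEPT_IDX being unique and on the root's DEPT_R_IDX ("0") not matching any DEPT_IDX, which holds for the sample data.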

Group two object attributes in a list

I would like to transform duplicate objects with different values into a list in Java, as follows:
[
{
"code": "code",
"active": true,
"car": "Sedan"
},
{
"code": "code",
"active": true,
"car": "R4"
},
{
"code": "code2",
"active": false,
"car": "Sedan"
},
{
"code": "code2",
"active": false,
"car": "R4"
}
]
ClassOne
public class Car {
    private String code;
    private boolean active;
    private String car;
}
if "code" and "active" are the same, I would like to group them in a single object
[
{
"code": "code",
"active": true,
"name": {
"cars": [
{
"brand": "Sedan"
},
{
"brand": "R4"
}
]
}
},
{
"code": "code2",
"active": false,
"name": {
"cars": [
{
"brand": "Sedan"
},
{
"brand": "R4"
}
]
}
}
]
Class parse
public class CarParse {
    private String code;
    private String active;
    private Name name;
}
public class Name {
    private List<Brand> cars;
}
public class Brand {
    private String brand;
}
The goal is to go from ClassOne to the parse classes, transforming the objects grouped by "code" and "active".
I find it easier to write the non-stream version first.
For example, grouping the input objects on a code-active key gives a unique key you can loop over; the final list is then built from the map values:
public List<CarParse> getParsed(List<Car> cars) {
    Map<String, CarParse> codeMap = new HashMap<>();
    for (Car c : cars) {
        CarParse cp;
        List<Brand> brands;
        String code = c.getCode();
        boolean active = c.isActive();
        String group = String.format("%s-%s", code, active);
        Brand b = new Brand(c.getCar());
        if (codeMap.containsKey(group)) {
            cp = codeMap.get(group);
            brands = cp.getName().getCars();
            brands.add(b);
        } else {
            brands = new ArrayList<>();
            brands.add(b);
            Name n = new Name(brands);
            cp = new CarParse(code, active, n);
            codeMap.put(group, cp);
        }
    }
    return new ArrayList<>(codeMap.values());
}
Stream version
return new ArrayList<>(cars.stream().collect(Collectors.toMap(
        c -> String.format("%s-%s", c.getCode(), c.isActive()),
        c -> {
            ArrayList<Brand> brands = new ArrayList<>();
            brands.add(new Brand(c.getCar()));
            return new CarParse(c.getCode(), c.isActive(), new Name(brands));
        },
        (v1, v2) -> {
            if (v1.isActive() == v2.isActive() && (v1.getCode().equals(v2.getCode()))) {
                for (Brand b : v2.getName().getCars()) {
                    v1.getName().getCars().add(b);
                }
                return v1;
            }
            return v1;
        })).values());
Output
[ {
"code" : "code",
"active" : true,
"name" : {
"cars" : [ {
"brand" : "Sedan"
}, {
"brand" : "R4"
} ]
}
}, {
"code" : "code2",
"active" : false,
"name" : {
"cars" : [ {
"brand" : "Sedan"
}, {
"brand" : "R4"
} ]
}
} ]
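Note that Collectors.toMap only invokes the merge function for values that mapped to the same key, so the code/active comparison inside it is redundant; assuming the same classes as above, the merge lambda can be shortened to:
// merge function without the redundant key check: toMap only merges equal keys
(v1, v2) -> {
    v1.getName().getCars().addAll(v2.getName().getCars());
    return v1;
}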

Spark SQL nested arrays and beans support

Each hour I get some value updates as a new DataFrame. I have to reduce the DataFrames in order to deduplicate entities and to track the history of value updates. Because the reduce logic is too complex, I convert the DataFrames to JavaRDDs, reduce them, and then convert the JavaRDDs back to DataFrames.
The issue is that I have to use nested data structures after the reduce.
Question
I've read Inferring the Schema Using Reflection, but it is still not clear to me:
Does Spark SQL support only nested arrays of primitives, or nested arrays of beans too?
Why does the Case 1 code not work while Case 2 does?
Case 1
From the following code I got:
scala.MatchError: History(timestamp=1970-01-01 00:00:00.0,
value=10.0) (of class com.somepackage.History)
So I can conclude that Spark does not support nested arrays of beans. But see Case 2.
@Data
@NoArgsConstructor
@AllArgsConstructor
public static class Entity implements Serializable {
    private Integer id;
    private History[] history;
}

@Data
@NoArgsConstructor
@AllArgsConstructor
public static class History implements Serializable {
    private Timestamp timestamp;
    private Double value;
}

JavaRDD<Entity> rdd = JavaSparkContext
        .fromSparkContext(spark().sparkContext())
        .parallelize(asList(
                new Entity(1, new History[] {
                        new History(new Timestamp(0L), 10.0)
                })
        ));

spark()
        //EXCEPTION HERE!
        .createDataFrame(rdd, Entity.class)
        .show();
Case 2
On the other hand, the following code works correctly with nested arrays of beans:
Dataset<Entity> dataSet = spark()
        .read()
        .option("multiLine", true).option("mode", "PERMISSIVE")
        .schema(fromJson("/data/testSchema.json"))
        .json(getAbsoluteFilePath("data/testData.json"))
        .as(Encoders.bean(Entity.class));

// saveAsTextFile returns void, so it is split off from the assignment
JavaRDD<Entity> rdd = dataSet
        .toJavaRDD()
        .mapToPair(o -> tuple(RowFactory.create(o.getId()), o))
        .reduceByKey((o1, o2) -> o2)
        .values();
rdd.saveAsTextFile("output.json");
-------
private String getAbsoluteFilePath(String relativePath) {
    return this
            .getClass()
            .getClassLoader()
            .getResource("")
            .getPath() + relativePath;
}

private StructType fromJson(String pathToSchema) {
    return (StructType) StructType.fromJson(
            new BufferedReader(
                    new InputStreamReader(
                            Resources.class.getResourceAsStream(pathToSchema)
                    )
            )
            .lines()
            .collect(Collectors.joining(System.lineSeparator()))
    );
}
testData.json
[
{
"id": 1,
"history": [
{
"timestamp": "2018-10-29 23:11:44.000",
"value": 12.5
}
]
},
{
"id": 1,
"history": [
{
"timestamp": "2018-10-30 14:43:05.000",
"value": 13.2
}
]
}
]
testSchema.json
{
"type": "struct",
"fields": [
{
"name": "id",
"type": "integer",
"nullable": true,
"metadata": {}
},
{
"name": "history",
"type": {
"type": "array",
"elementType": {
"type": "struct",
"fields": [
{
"name": "timestamp",
"type": "timestamp",
"nullable": true,
"metadata": {}
},
{
"name": "value",
"type": "double",
"nullable": true,
"metadata": {}
}
]
},
"containsNull": true
},
"nullable": true,
"metadata": {}
}
]
}
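For reference, a common way to build a Dataset of beans straight from a JavaRDD, avoiding the bean-reflection path in createDataFrame that fails in Case 1, is to use a bean encoder. This is only a sketch and has not been verified against the Spark version used here:
// Sketch: build the Dataset from the JavaRDD with a bean encoder
// instead of createDataFrame(rdd, Entity.class).
Dataset<Entity> entities = spark().createDataset(rdd.rdd(), Encoders.bean(Entity.class));
entities.show();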

JSON is duplicated when writing from object to file

I am using Jackson to convert JSON into an object and vice versa. However, when writing the object back as JSON, it is duplicated, like so:
{
"Users": [
{
"name": "Steve",
"buckets": [
{
"bucketName": "stevesbucket",
"permissions": [
"CREATE",
"READ",
"UPDATE",
"DELETE"
],
"owner": "Steve"
},
{
"bucketName": "NEW BUCKET 2",
"permissions": [
"CREATE",
"READ",
"UPDATE",
"DELETE"
],
"owner": "Steve"
}
]
},
{
"name": "Jeff",
"buckets": [
{
"bucketName": "jeffsbucket",
"permissions": [
"CREATE",
"READ",
"UPDATE",
"DELETE"
],
"owner": "Jeff"
},
{
"bucketName": "stevesbucket",
"permissions": [
"READ"
],
"owner": "Steve"
}
]
}
],
"users": [
{
"name": "Steve",
"buckets": [
{
"bucketName": "stevesbucket",
"permissions": [
"CREATE",
"READ",
"UPDATE",
"DELETE"
],
"owner": "Steve"
},
{
"bucketName": "NEW BUCKET 2",
"permissions": [
"CREATE",
"READ",
"UPDATE",
"DELETE"
],
"owner": "Steve"
}
]
},
{
"name": "Jeff",
"buckets": [
{
"bucketName": "jeffsbucket",
"permissions": [
"CREATE",
"READ",
"UPDATE",
"DELETE"
],
"owner": "Jeff"
},
{
"bucketName": "stevesbucket",
"permissions": [
"READ"
],
"owner": "Steve"
}
]
}
]
}
There should only be one "Users" field. I have tried playing with the visibility settings of my ObjectMapper like this:
ObjectMapper mapper = new ObjectMapper();
mapper.getSerializationConfig().getDefaultVisibilityChecker()
        .withFieldVisibility(JsonAutoDetect.Visibility.ANY)
        .withGetterVisibility(JsonAutoDetect.Visibility.NONE)
        .withSetterVisibility(JsonAutoDetect.Visibility.NONE)
        .withCreatorVisibility(JsonAutoDetect.Visibility.NONE);
However, this hasn't made a difference. I think something in my users.java file may be causing the issue, as I have extra methods such as addBucket:
public static class Bucket
{
    private String bucketName;
    private String[] permissions;
    private String owner;

    public void setBucket(String bucketName, String[] permissions, String owner)
    {
        this.bucketName = bucketName;
        this.permissions = permissions;
        this.owner = owner;
    }
    public String getBucketName()
    {
        return bucketName;
    }
    public String[] getPermissions()
    {
        return permissions;
    }
    public String getOwner()
    {
        return owner;
    }
}

public static class User
{
    private String name;
    private Bucket[] buckets;

    public String getName()
    {
        return name;
    }
    public Bucket[] getBuckets()
    {
        return buckets;
    }

    @JsonIgnore
    public void addBucket(String bucketName, String[] permissions, String owner)
    {
        Bucket[] temp = new Bucket[buckets.length + 1];
        for (int i = 0; i < buckets.length; ++i)
        {
            temp[i] = buckets[i];
        }
        temp[temp.length - 1] = new Bucket();
        temp[temp.length - 1].setBucket(bucketName, permissions, owner);
        buckets = temp;
    }
}

public User[] Users;
public User[] getUsers()
{
    return Users;
}
public void setUsers(User[] newUsers)
{
    Users = newUsers;
}
Are there annotations I need to add to parts of users.java, or are there other visibility settings I should be using with my ObjectMapper?
public User[] Users;
public User[] getUsers()
{
    return Users;
}
Jackson is serializing the public field Users as "Users" and the getUsers() getter as "users", so the same data ends up in the output twice. Making Users private should fix this, and it is good practice anyway.
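A minimal sketch of the corrected declaration (the @JsonProperty annotation is optional and only needed if the JSON key should stay capitalized as "Users"):
// field made private and renamed; Jackson now sees a single "Users" property
@JsonProperty("Users")
private User[] users;

public User[] getUsers()
{
    return users;
}
public void setUsers(User[] newUsers)
{
    users = newUsers;
}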

Mapping JSONArray in RestTemplate Spring

I am trying to map this JSONArray using Spring RestTemplate:
[{
"Command": "/usr/sbin/sshd -D",
"Created": 1454501297,
"Id": "e00ca61f134090da461a3f39d47fc0cbeda77fbbc0610439d3c16a932686b612",
"Image": "ubuntu:latest",
"Labels": {
},
"Names": [
"/nova-c1896fbd-1309-4da2-8d77-b4fe4c02fa8e"
],
"Ports": [
],
"Status": "Up 2 hours"
}, {
"Command": "/usr/sbin/sshd -D",
"Created": 1450106126,
"Id": "7ffc9dbdd200e2c23adec442abd656ed57306955332697cb7da979f36ebf3b22",
"Image": "ubuntu:latest",
"Labels": {
},
"Names": [
"/nova-93b9ae40-8135-48b7-ac17-12094603b28c"
],
"Ports": [
],
"Status": "Up 2 hours"
}]
Here is the ContainersInfo class:
@JsonIgnoreProperties(ignoreUnknown = true)
public class ContainersInfo {
    private String Id;
    private List<String> Names;

    public String getId() {
        return Id;
    }
    public void setId(String id) {
        Id = id;
    }
    public List<String> getNames() {
        return Names;
    }
    public void setNames(List<String> names) {
        Names = names;
    }
}
However, I get null when I try to read the data:
ContainersInfo[] containers = syncRestTemplate.getForObject("http://192.168.1.2:4243/containers/json?all=1", ContainersInfo[].class);
for (int i = 0; i < containers.length; i++)
    System.out.println("id:" + containers[i].getId());
The resulting output is as follows:
id:null
id:null
Any idea what I should do?
Your JSON field names are in Pascal case as opposed to camel case (which Jackson expects by default). Set the Jackson naming strategy to PascalCaseStrategy, i.e. by adding the @JsonNaming(PascalCaseStrategy.class) annotation to the ContainersInfo class.
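A minimal sketch of the annotated class (depending on the Jackson version, the strategy class is PropertyNamingStrategy.PascalCaseStrategy or its newer name, UpperCamelCaseStrategy):
import java.util.List;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.PropertyNamingStrategy;
import com.fasterxml.jackson.databind.annotation.JsonNaming;

// Pascal-case naming maps JSON "Id"/"Names" onto the conventional camel-case properties
@JsonNaming(PropertyNamingStrategy.PascalCaseStrategy.class)
@JsonIgnoreProperties(ignoreUnknown = true)
public class ContainersInfo {
    private String id;
    private List<String> names;
    // getters and setters as before
}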
