MongoDB query and select inner object - java

I want to query for an inner object and select only the filtered inner objects from a MongoDB document.
Consider the MongoDB document below.
{
"schools": [
{
"name": "ABC",
"students": [
{
"name": "ABC 1",
"class": 1
},
{
"name": "ABC 2",
"class": 2
},
{
"name": "ABC 3",
"class": 1
}
]
},
{
"name": "XYZ",
"students": [
{
"name": "XYZ 1",
"class": 1
},
{
"name": "XYZ 2",
"class": 2
}
]
}
]
}
I want to select only the students in class 1.
The expected result JSON is below.
{
"school": {
"name": "ABC",
"students": [
{
"name": "ABC 1",
"class": 1
},
{
"name": "ABC 3",
"class": 1
}
]
},
"school": {
"name": "XYZ",
"students": [
{
"name": "XYZ 1",
"class": 1
}
]
}
}
Even the result below is fine with me.
{
"students": [
{
"name": "ABC 1",
"class": 1
},
{
"name": "ABC 3",
"class": 1
},
{
"name": "XYZ 1",
"class": 1
}
]
}
Please help me get this done.
It would be really helpful if you could provide the MongoDB query.
I am using MongoDB with Spring Data in my application.

You can query a field inside a nested array with dot notation. Note that the path has to use the field names from your document (schools, students):
db.your_collection_name.find({'schools.students.class': 1})
find returns the whole matching document, not only the matching students. If you only want the students, flatMap over the results.

Finally I was able to find the query.
First I have to unwind and then apply the matching criteria. This worked for me.
db.{mycollection}.aggregate(
[
// schools is itself an array, so it is unwound before its students
{ $unwind: '$schools' },
{ $unwind: '$schools.students' },
{ $match : { "schools.students.class" : 1 } },
{ $project : { "schools.name" : 1, 'schools.students' : 1 } }
]
);
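To make the unwind-then-match idea concrete, here is a minimal plain-Java sketch that does the same thing on in-memory data. It does not talk to MongoDB; the class StudentFilter and its helper methods are hypothetical names made up for this illustration, and the document is modeled as nested maps and lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, only for illustration.
class StudentFilter {

    // Builds one student entry like {"name": ..., "class": ...}.
    static Map<String, Object> student(String name, int classNo) {
        Map<String, Object> s = new LinkedHashMap<>();
        s.put("name", name);
        s.put("class", classNo);
        return s;
    }

    // Builds one school entry like {"name": ..., "students": [...]}.
    static Map<String, Object> school(String name, List<Map<String, Object>> students) {
        Map<String, Object> s = new LinkedHashMap<>();
        s.put("name", name);
        s.put("students", students);
        return s;
    }

    // The sample document from the question.
    static Map<String, Object> sampleDocument() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("schools", Arrays.asList(
            school("ABC", Arrays.asList(student("ABC 1", 1), student("ABC 2", 2), student("ABC 3", 1))),
            school("XYZ", Arrays.asList(student("XYZ 1", 1), student("XYZ 2", 2)))));
        return doc;
    }

    // Flattens schools -> students and keeps only the wanted class.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> studentsInClass(Map<String, Object> doc, int wantedClass) {
        List<Map<String, Object>> result = new ArrayList<>();
        List<Map<String, Object>> schools = (List<Map<String, Object>>) doc.get("schools");
        for (Map<String, Object> school : schools) {            // like unwinding schools
            List<Map<String, Object>> students =
                (List<Map<String, Object>>) school.get("students");
            for (Map<String, Object> student : students) {      // like unwinding schools.students
                if (Integer.valueOf(wantedClass).equals(student.get("class"))) { // like $match
                    result.add(student);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(studentsInClass(sampleDocument(), 1));
    }
}
```

This produces the flat "students" list from the question's second acceptable result. Doing the filtering in the database is usually preferable when the documents are large, since only the matching students are sent back.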

Related

How to get json key values by another key value

I have a JSON output like this:
{
"items": [
{
"id": "1",
"name": "Anna",
"values": [
{
"code": "Latin",
"grade": 1
},
{
"code": "Maths",
"grade": 5
}
]
},
{
"id": "2",
"name": "Mark",
"values": [
{
"code": "Latin",
"grade": 5
},
{
"code": "Maths",
"grade": 5
}
]
}
]
}
I need to get the field values for "name": "Anna". I am getting a RestAssured Response and would like to use my beans to do that. I could also use jsonPath() or jsonObject(), but I don't know how. I searched many topics but did not find anything.

How to split JSON into Dataset rows?

I have the following JSON input data:
{
"lib": [
{
"id": "a1",
"type": "push",
"icons": [
{
"iId": "111"
}
],
"id": "a2",
"type": "pull",
"icons": [
{
"iId": "111"
},
{
"iId": "222"
}
]
}
]
I want to get the following Dataset:
id type iId
a1 push 111
a2 pull 111
a2 pull 222
How can I do it?
This is my current code. I use Spark 2.3 and Java 1.8:
ds = spark
.read()
.option("multiLine", true).option("mode", "PERMISSIVE")
.json(jsonFilePath);
ds = ds
.select(org.apache.spark.sql.functions.explode(ds.col("lib.icons")).as("icons"));
However the result is wrong:
+---------------+
| icons|
+---------------+
| [[111]]|
|[[111], [222...|
+---------------+
How can I get the correct Dataset?
UPDATE:
I tried this code, but it generates some extra combinations of id, type and iId that do not exist in the input file.
ds = ds
.withColumn("icons", org.apache.spark.sql.functions.explode(ds.col("lib.icons")))
.withColumn("id", org.apache.spark.sql.functions.explode(ds.col("lib.id")))
.withColumn("type", org.apache.spark.sql.functions.explode(ds.col("lib.type")));
ds = ds.withColumn("its", org.apache.spark.sql.functions.explode(ds.col("icons")));
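As already pointed out, the JSON string seems to be malformed. With the corrected JSON, you can use the following (Scala) to get the result you wanted:
import org.apache.spark.sql.functions._
import spark.implicits._ // needed for the $"column" syntax
spark.read
  .option("multiLine", true) // the input file spans several lines, as in the question
  .format("json")
  .load("in/test.json")
  // one row per element of the lib array
  .select(explode($"lib").alias("result"))
  // one row per icon, keeping id and type of the enclosing element
  .select($"result.id", $"result.type", explode($"result.icons").alias("iId"))
  .select($"id", $"type", $"iId.iId")
  .show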
Your JSON appears to be malformed. Fixing the indenting makes this slightly more apparent:
{
"lib": [
{
"id": "a1",
"type": "push",
"icons": [
{
"iId": "111"
}
],
"id": "a2",
"type": "pull",
"icons": [
{
"iId": "111"
},
{
"iId": "222"
}
]
}
]
Does your code work correctly if you feed it this JSON instead?
{
"lib": [
{
"id": "a1",
"type": "push",
"icons": [
{
"iId": "111"
}
]
},
{
"id": "a2",
"type": "pull",
"icons": [
{
"iId": "111"
},
{
"iId": "222"
}
]
}
]
}
Note the inserted }, { just before "id": "a2" to break the object with duplicate keys into two, and the closing } at the very end which had previously been omitted.

MongoDB Java Driver aggregation with regex filter

I am using MongoDB Java Driver 3.6.3.
I want to create a regex query with a group-by aggregation to retrieve distinct values.
Let's say I have this JSON:
[{
"name": "John Snow",
"category": 1
},
{
"name": "Jason Statham",
"category": 2
},
{
"name": "John Lennon",
"category": 2
},
{
"name": "John Snow",
"category": 3
}]
I want to create a query where the regex is like "John.*" and group the result by name, so there would be only one "John Snow".
The expected result is:
[{
"name": "John Snow",
"category": 1
},
{
"name": "John Lennon",
"category": 2
}]
The answer provided by felix is correct, in terms of Mongo Shell commands. The equivalent expression of that command using the MongoDB Java driver is:
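import com.mongodb.MongoClient;
import com.mongodb.client.AggregateIterable;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Accumulators;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import java.util.Arrays;

MongoClient mongoClient = ...;
MongoCollection<Document> collection = mongoClient.getDatabase("...").getCollection("...");
AggregateIterable<Document> documents = collection.aggregate(Arrays.asList(
    // Java equivalent of the $match stage
    Aggregates.match(Filters.regex("name", "John")),
    // Java equivalent of the $group stage
    Aggregates.group("$name", Accumulators.first("category", "$category"))
));
for (Document document : documents) {
    System.out.println(document.toJson());
}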
The above code will print out:
{ "_id" : "John Lennon", "category" : 2 }
{ "_id" : "John Snow", "category" : 1 }
You can achieve this with a $regex in $match stage, followed by a $group stage:
db.collection.aggregate([{
"$match": {
"name": {
"$regex": "john",
"$options": "i"
}
}
}, {
"$group": {
"_id": "$name",
"category": {
"$first": "$category"
}
}
}])
output:
[
{
"_id": "John Lennon",
"category": 2
},
{
"_id": "John Snow",
"category": 1
}
]
You can try it here: mongoplayground.net/p/evw6DP_574r
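To see what the $match with $regex followed by $group with $first computes, here is a small plain-Java sketch of the same logic on an in-memory list. It does not use the MongoDB driver; the class FirstPerName, the Person holder and the method names are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper class, only for illustration.
class FirstPerName {

    // Minimal holder for one row of the sample data.
    static final class Person {
        final String name;
        final int category;

        Person(String name, int category) {
            this.name = name;
            this.category = category;
        }
    }

    // The sample rows from the question, in their original order.
    static List<Person> sample() {
        return Arrays.asList(
            new Person("John Snow", 1),
            new Person("Jason Statham", 2),
            new Person("John Lennon", 2),
            new Person("John Snow", 3));
    }

    // Keeps names matching the regex (case-insensitive, unanchored) and
    // remembers the first category seen for each name.
    static Map<String, Integer> firstCategoryByName(List<Person> people, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Map<String, Integer> firstSeen = new LinkedHashMap<>();
        for (Person p : people) {
            if (pattern.matcher(p.name).find()) {           // like $match with $regex and option "i"
                firstSeen.putIfAbsent(p.name, p.category);  // like $group with $first
            }
        }
        return firstSeen;
    }

    public static void main(String[] args) {
        System.out.println(firstCategoryByName(sample(), "john"));
    }
}
```

The LinkedHashMap keeps names in the order they were first encountered. A $group stage makes no such promise about output order, so add a $sort stage if the order matters.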
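You can use Spring Data MongoDB like this:
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.query.Criteria;

Aggregation agg = Aggregation.newAggregation(
    // case-insensitive regex match on name
    Aggregation.match(Criteria.where("name").regex("john", "i")),
    // group by name only and keep the first category, so each name appears once
    Aggregation.group("name").first("category").as("category")
);
// mongoTemp is the MongoTemplate; the output class must match the generic type
AggregationResults<CatalogNoArray> aggResults = mongoTemp.aggregate(agg, "demo", CatalogNoArray.class);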

How to index a Json object with object and its reference in elasticsearch?

I have been working with Elasticsearch recently, and I ran into a problem that I don't know how to solve.
I have JSON like this:
{
"objects": [
"object1": {
"id" : "12345",
"name":"abc"
},
"12345"
]
}
Object2 is a reference to object1. When I try to save (index) it into Elasticsearch, it says:
"org.elasticsearch.index.mapper.MapperParsingException: failed to parse"
After googling, I found that this happens because object1 is an object, but object2 is treated as a string.
We cannot change the JSON in our project, so how can I save it in Elasticsearch in this case?
Thanks for any help and suggestions.
How do you do that?
I run this command and it works.
PUT test/t1/1
{
"objects": {
"object1": {
"id" : "12345",
"name":"abc"
},
"object2": "12345"
}
}
and the result is:
{
"_index": "test",
"_type": "t1",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"created": true
}
UPDATE 1
Depending on your requirements one of these may solve your problem:
PUT test/t1/2
{
"objects": [
{
"object1": {
"id": "12345",
"name": "abc"
}
},
{
"object2": "12345"
}
]
}
PUT test/t1/2
{
"objects": [
{
"object1": {
"id": "12345",
"name": "abc"
},
"object2": "12345"
},
{
...
}
]
}

How to count by attribute in JSON?

I have the following JSON:
{
"items": [
{
"id": "1",
"name": "John",
"location": {
"town": {
"id": "10"
},
"address": "600 Fake Street"
},
"creation_date": "2010-01-19",
"last_modified_date": "2017-05-18"
},
{
"id": "2",
"name": "Sarah",
"location": {
"town": {
"id": "10"
},
"address": "76 Evergreen Street"
},
"creation_date": "2010-01-19",
"last_modified_date": "2017-05-18"
},
{
"id": "3",
"name": "Hamed",
"location": {
"town": {
"id": "20"
},
"address": "50 East A Street"
},
"creation_date": "2010-01-19",
"last_modified_date": "2017-05-18"
}
]
}
And I need to get something like this, count how many times each townId appears:
[ { "10": 2 }, {"20": 1 }]
I'm trying to find the most efficient way to do this. Any ideas?
An efficient way is to load the string into a StringBuilder and remove all line breaks and whitespace. Then search for the index of the string "town":{"id":" (the start of a town id) and then for the end index (the string "}). Using the two indexes you can extract the town ids and count them.
There is no need to deserialize the JSON into POJOs and then extract the values from them :)
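The index-based scan described above could be sketched as follows. This is only a rough illustration: the class TownCounter and its methods are hypothetical names, and the approach assumes the town id is always written as a string directly inside "town", exactly as in the sample JSON. A JSON parser is more robust whenever the layout may vary.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, only for illustration.
class TownCounter {

    // A shortened version of the sample JSON, keeping only the relevant fields.
    static String sample() {
        return "{ \"items\": ["
            + "{ \"id\": \"1\", \"location\": { \"town\": { \"id\": \"10\" }, \"address\": \"600 Fake Street\" } },"
            + "{ \"id\": \"2\", \"location\": { \"town\": { \"id\": \"10\" }, \"address\": \"76 Evergreen Street\" } },"
            + "{ \"id\": \"3\", \"location\": { \"town\": { \"id\": \"20\" }, \"address\": \"50 East A Street\" } }"
            + "] }";
    }

    // Counts how often each town id occurs, using plain string searches.
    static Map<String, Integer> countTownIds(String json) {
        // Strip all whitespace so the marker matches regardless of formatting.
        // Addresses lose their spaces too, but they are never read here.
        String compact = json.replaceAll("\\s+", "");
        String startMarker = "\"town\":{\"id\":\"";
        String endMarker = "\"}";
        Map<String, Integer> counts = new TreeMap<>();
        int from = 0;
        while (true) {
            int start = compact.indexOf(startMarker, from);
            if (start < 0) {
                break; // no further town entries
            }
            start += startMarker.length();
            int end = compact.indexOf(endMarker, start);
            if (end < 0) {
                break; // malformed tail, stop scanning
            }
            counts.merge(compact.substring(start, end), 1, Integer::sum);
            from = end;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countTownIds(sample()));
    }
}
```

The result is a single map from town id to count rather than the list of one-entry objects shown in the question; converting between the two shapes is straightforward.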
