Two Aggregate Totals in One Group - java

I wrote a query in MongoDB as follows:
db.getCollection('student').aggregate(
[
{
$match: { "student_age" : { "$ne" : 15 } }
},
{
$group:
{
_id: "$student_name",
count: {$sum: 1},
sum1: {$sum: "$student_age"}
}
}
])
In other words, I want to fetch the count of students who aren't 15 years old and the sum of their ages. The query works fine and returns both values.
In my application, I want to do the query by Spring Data.
I wrote the following code:
Criteria where = Criteria.where("AGE").ne(15);
Aggregation aggregation = Aggregation.newAggregation(
Aggregation.match(where),
Aggregation.group().sum("student_age").as("totalAge"),
count().as("countOfStudentNot15YearsOld"));
When this code is run, the output query will be:
"aggregate" : "MyDocument", "pipeline" :
[ { "$match" : { "AGE" : { "$ne" : 15 } } },
{ "$group" : { "_id" : null, "totalAge" : { "$sum" : "$student_age" } } },
{ "$count" : "countOfStudentNot15YearsOld" }],
"cursor" : { "batchSize" : 2147483647 }
Unfortunately, the result contains only the countOfStudentNot15YearsOld item.
I want to get the same result as my native query.

If you're asking to return the grouping for both "15" and "not 15" as a result, then you're looking for the $cond operator, which allows "branching" based on conditional evaluation.
In the "shell" context you would use it like this:
db.getCollection('student').aggregate([
{ "$group": {
"_id": null,
"countFifteen": {
"$sum": {
"$cond": [{ "$eq": [ "$student_age", 15 ] }, 1, 0 ]
}
},
"countNotFifteen": {
"$sum": {
"$cond": [{ "$ne": [ "$student_age", 15 ] }, 1, 0 ]
}
},
"sumNotFifteen": {
"$sum": {
"$cond": [{ "$ne": [ "$student_age", 15 ] }, "$student_age", 0 ]
}
}
}}
])
So you use $cond to perform a logical test, in this case whether the "student_age" in the current document is 15 or not. You then return a numeric value in response: 1 here for "counting", or the actual field value when that is what you want to send to the accumulator instead. In short, it's a "ternary" operator, or if/then/else condition (which can also be written in the more expressive form with named keys), that you can use to test a condition and decide what to return.
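To make the $cond-inside-$sum idea concrete, here is a minimal plain-Java sketch with no MongoDB involved. The class name, the helper names and the sample ages are made up for illustration; it only mirrors what the three accumulators compute.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of {"$sum": {"$cond": [test, thenValue, elseValue]}}.
// Hypothetical names and sample data; this is not MongoDB or Spring code.
final class CondSumSketch {

    // Mirrors $cond: yields thenValue when the test passes, elseValue otherwise.
    static int cond(boolean test, int thenValue, int elseValue) {
        return test ? thenValue : elseValue;
    }

    // "countFifteen": add 1 for every age equal to 15.
    static int countFifteen(List<Integer> ages) {
        return ages.stream().mapToInt(age -> cond(age == 15, 1, 0)).sum();
    }

    // "countNotFifteen": add 1 for every age other than 15.
    static int countNotFifteen(List<Integer> ages) {
        return ages.stream().mapToInt(age -> cond(age != 15, 1, 0)).sum();
    }

    // "sumNotFifteen": add the age itself when it is not 15, otherwise add 0.
    static int sumNotFifteen(List<Integer> ages) {
        return ages.stream().mapToInt(age -> cond(age != 15, age, 0)).sum();
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(15, 16, 14, 15, 17);
        System.out.println(countFifteen(ages));    // 2
        System.out.println(countNotFifteen(ages)); // 3
        System.out.println(sumNotFifteen(ages));   // 47
    }
}
```

The point of the sketch is that every document contributes to every accumulator; the condition only decides how much it contributes.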
For the spring mongodb implementation you use ConditionalOperators.Cond to construct the same BSON expressions:
import org.springframework.data.mongodb.core.aggregation.*;
import org.springframework.data.mongodb.core.query.Criteria;
ConditionalOperators.Cond isFifteen = ConditionalOperators.when(new Criteria("student_age").is(15))
.then(1).otherwise(0);
ConditionalOperators.Cond notFifteen = ConditionalOperators.when(new Criteria("student_age").ne(15))
.then(1).otherwise(0);
ConditionalOperators.Cond sumNotFifteen = ConditionalOperators.when(new Criteria("student_age").ne(15))
.thenValueOf("student_age").otherwise(0);
GroupOperation groupStage = Aggregation.group()
.sum(isFifteen).as("countFifteen")
.sum(notFifteen).as("countNotFifteen")
.sum(sumNotFifteen).as("sumNotFifteen");
Aggregation aggregation = Aggregation.newAggregation(groupStage);
So basically you just extend that logic, using .then() for a "constant" value such as 1 for the "counts", and .thenValueOf() where you actually need the "value" of a field from the document, which is equivalent to "$student_age" in the common shell notation.
Since ConditionalOperators.Cond implements the AggregationExpression interface, it can be used with the form of .sum() that accepts an AggregationExpression rather than a string. This is an improvement on past releases of spring mongo, which required you to perform a $project stage first so that there were actual document properties for the evaluated expression before performing a $group.
If all you want is to replicate the original query for spring mongodb, then your mistake was using the $count aggregation stage rather than appending to the group():
Criteria where = Criteria.where("AGE").ne(15);
Aggregation aggregation = Aggregation.newAggregation(
Aggregation.match(where),
Aggregation.group()
.sum("student_age").as("totalAge")
.count().as("countOfStudentNot15YearsOld")
);

Related

MongoDB - Update parts of object

I have a collection that stores one document per execution flow.
Every flow includes processes, and each process includes steps.
So I end up with a 'flows' collection that has documents that look like this:
{
"name" : "flow1",
"description" : "flow 1 description",
"processes" : [
{
"processId" : "firstProcessId",
"name" : "firstProcessName",
"startedAt" : null,
"finishedAt" : null,
"status" : "PENDING",
"steps" : [
{
"stepId" : "foo",
"status" : "PENDING",
"startedAt" : null,
"finishedAt" : null
},
{
"stepId" : "bar",
"status" : "PENDING",
"startedAt" : null,
"finishedAt" : null
}
...
]
},
{
"processId" : "secondProcessId",
"name" : "secondProcessName",
"startedAt" : null,
"finishedAt" : null,
"status" : "PENDING",
"steps" : [
{
"stepId" : "foo",
"status" : "PENDING",
"startedAt" : null,
"finishedAt" : null
},
{
"stepId" : "xyz",
"status" : "PENDING",
"startedAt" : null,
"finishedAt" : null
}
...
]
}
]
}
A couple of notes here:
Each flow contains many processes.
Each process contains at least one step. Steps with the same id may appear in different processes (the id is something the programmer specifies).
A step can be something like "bring me something from the DB", so it is a kind of reusable component in my system.
Now, when the application runs, I would like to call DAO methods like
"startProcess" and "startStep".
So I would like to know the correct query for starting a step, given the process id and the step id.
I can successfully update the process status to "RUNNING" given the flow id and the process id:
db.getCollection('flows').updateOne({"name" : "flow1", "processes" : {$elemMatch : {"processId" : "firstProcessId"}}}, {$set: {"processes.$.status" : "RUNNING"}})
However, I don't know how to update the step status given the flow id, process id and step id. It looks like multiple "$" signs are not allowed in the path.
So, this doesn't work:
db.getCollection('flows').updateOne({"name" : "flow1", "processes" : {$elemMatch : {"processId" : "firstProcessId"}}, "processes.steps.stepId" : {$elemMatch : {"stepId" : "foo"}}}, {$set: {"processes.$.steps.$.status" : "RUNNING"}})
What is the best way to implement such an update?
To update a document in a multi-level nested array, you need the filtered positional operator $[<identifier>] together with arrayFilters.
The processes and processes.steps.stepId conditions can be removed from the query filter, since the filtering is performed in arrayFilters.
db.collection.update({
"name": "flow1"
},
{
$set: {
"processes.$[process].steps.$[step].status": "RUNNING"
}
},
{
arrayFilters: [
{
"process.processId": "firstProcessId"
},
{
"step.stepId": "foo"
}
]
})
Sample Mongo Playground
Reference
Update Nested Arrays in Conjunction with $[]
As you mentioned it does not work with multiple arrays, straight from the docs:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
I recommend you use arrayFilters instead. Its behavior is much clearer, especially when working with nested structures:
db.collection.updateMany(
{
"name": "flow1",
"processes.processId": "firstProcessId",
"processes.steps.stepId": "foo"
},
{
$set: {
"processes.$[process].steps.$[step].status": "RUNNING"
}
},
{
arrayFilters: [
{
"process.processId": "firstProcessId"
},
{
"step.stepId": "foo"
}
]
})
Mongo Playground
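To illustrate why two array filters are needed, here is a small plain-Java sketch that mimics what "processes.$[process].steps.$[step].status" does: the outer filter picks the process, the inner filter picks the step. The class, field and method names are hypothetical and there is no driver call in it; it only mirrors the matching logic on an in-memory copy of the document.

```java
import java.util.Arrays;
import java.util.List;

// In-memory imitation of a nested arrayFilters update; hypothetical names throughout.
final class ArrayFiltersSketch {

    static final class Step {
        final String stepId;
        String status = "PENDING";
        Step(String stepId) { this.stepId = stepId; }
    }

    static final class Process {
        final String processId;
        final List<Step> steps;
        Process(String processId, Step... steps) {
            this.processId = processId;
            this.steps = Arrays.asList(steps);
        }
    }

    // Sets the status of every matching step and returns how many steps changed.
    static int setStepStatus(List<Process> processes, String processId,
                             String stepId, String status) {
        int changed = 0;
        for (Process process : processes) {
            // Plays the role of the filter { "process.processId": processId }.
            if (!process.processId.equals(processId)) continue;
            for (Step step : process.steps) {
                // Plays the role of the filter { "step.stepId": stepId }.
                if (!step.stepId.equals(stepId)) continue;
                step.status = status;
                changed++;
            }
        }
        return changed;
    }

    // Same shape as the "flow1" document in the question.
    static List<Process> sampleFlow() {
        return Arrays.asList(
            new Process("firstProcessId", new Step("foo"), new Step("bar")),
            new Process("secondProcessId", new Step("foo"), new Step("xyz")));
    }

    public static void main(String[] args) {
        List<Process> flow = sampleFlow();
        System.out.println(setStepStatus(flow, "firstProcessId", "foo", "RUNNING")); // 1
        System.out.println(flow.get(0).steps.get(0).status); // RUNNING
        System.out.println(flow.get(1).steps.get(0).status); // PENDING
    }
}
```

Note that the step "foo" in the second process stays untouched, which is exactly the case a single positional "$" cannot express.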

How to get the count of element with non-empty-array-field when group in mongodb aggregate using Spring Data Mongo?

I have the following documents in one collection named as mail_test. Some of them have a tags field which is an array:
/* 1 */
{
"_id" : ObjectId("601a7c3a57c6eb4c1efb84ff"),
"email" : "aaaa#bbb.com",
"content" : "11111"
}
/* 2 */
{
"_id" : ObjectId("601a7c5057c6eb4c1efb8590"),
"email" : "aaaa#bbb.com",
"content" : "22222"
}
/* 3 */
{
"_id" : ObjectId("601a7c6d57c6eb4c1efb8675"),
"email" : "aaaa#bbb.com",
"content" : "33333",
"tags" : [
"x"
]
}
/* 4 */
{
"_id" : ObjectId("601a7c8157c6eb4c1efb86f4"),
"email" : "aaaa#bbb.com",
"content" : "4444",
"tags" : [
"yyy",
"zzz"
]
}
There are two documents with non-empty-tags, so I want the result to be 2.
I use the following statement to aggregate and get the correct tag_count:
db.getCollection('mail_test').aggregate([{$group:{
"_id":null,
"all_count":{$sum:1},
"tag_count":{"$sum":{$cond: [ { $ne: ["$tags", undefined] }, 1, 0]}}
//if replace `undefined` with `null`, I got the tag_count as 4, that is not what I want
//I also have tried `$exists`, but it cannot be used here.
}}])
and the result is:
{
"_id" : null,
"all_count" : 4.0,
"tag_count" : 2.0
}
and I use spring data mongo in java to do this:
private void test(){
Aggregation agg = Aggregation.newAggregation(
Aggregation.match(new Criteria()),//some condition here
Aggregation.group(Fields.fields()).sum(ConditionalOperators.when(Criteria.where("tags").ne(null)).then(1).otherwise(0)).as("tag_count")
//I need an `undefined` instead of `null`,or is there are any other solution?
);
AggregationResults<MailTestGroupResult> results = mongoTemplate.aggregate(agg, MailTest.class, MailTestGroupResult.class);
List<MailTestGroupResult> mappedResults = results.getMappedResults();
int tag_count = mappedResults.get(0).getTag_count();
System.out.println(tag_count);//get 4,wrong
}
I need an undefined instead of null, but I don't know how to do this. Or is there any other solution?
You can use Aggregation operators to check if the field tags exists or not with one of the following constructs in the $group stage of your query (to calculate the tag_count value):
"tag_count":{ "$sum": { $cond: [ { $gt: [ { $size: { $ifNull: ["$tags", [] ] }}, 0 ] }, 1, 0] }}
// - OR -
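"tag_count":{ "$sum": { $cond: [ { $eq: [ { $type: "$tags" }, "array" ] }, 1, 0] }}
Both return the same result for the documents you posted. Note that the second construct also counts documents whose tags field is an empty array.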
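As a rough illustration of how the two checks differ, here is a plain-Java sketch that applies both rules to in-memory data. A null entry stands for a document without a tags field. The first four entries mirror the documents in the question; the fifth, with an empty array, is a hypothetical addition to show where the two rules diverge. All names are made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java imitation of the two tag_count conditions; hypothetical names.
final class TagCountSketch {

    // Like $gt: [ { $size: { $ifNull: ["$tags", []] } }, 0 ]:
    // a missing field is treated as an empty array, then only non-empty arrays count.
    static long countNonEmpty(List<List<String>> tagsPerDocument) {
        return tagsPerDocument.stream()
            .map(tags -> tags == null ? Collections.<String>emptyList() : tags)
            .filter(tags -> !tags.isEmpty())
            .count();
    }

    // Like $eq: [ { $type: "$tags" }, "array" ]:
    // any array counts, including an empty one.
    static long countArrays(List<List<String>> tagsPerDocument) {
        return tagsPerDocument.stream()
            .filter(tags -> tags != null)
            .count();
    }

    // Four documents from the question plus one hypothetical document with "tags": [].
    static List<List<String>> sampleTags() {
        return Arrays.<List<String>>asList(
            null,
            null,
            Arrays.asList("x"),
            Arrays.asList("yyy", "zzz"),
            Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        List<List<String>> docs = sampleTags();
        System.out.println(countNonEmpty(docs)); // 2
        System.out.println(countArrays(docs));   // 3
    }
}
```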

How to write mongo cli query in mongo-template for $in aggregation

This is what my data looks like:
{
"_id" : "2011250546437843117",
"name" : "Book",
"textbook" : [
"Maths",
"Science"
],
"language" : [
"English"
],
"isRead" : true,
"isAvailable" : true
}
I have to filter documents based on textbook, and the isRead field should be true or false depending on that.
My mongo query is:
db.user.aggregate([
{
$match: {
"isAvailable": true
}
},
{
$project: {
"textbook": 1,
"name": 1,
"isread": {
$in: [
"Maths",
"$textbook"
]
}
}
}
]);
I have tried to write this using mongo-template
Aggregation aggregation = newAggregation(match(Criteria.where("isAvailable").is(true)),
project("textbook","name"));
I don't understand how to write the $in operator in the project stage.
Thank you in advance.

Mongo: how to count several arrays in one aggregation via java's mongoTemplate

I have the following DB data:
{user : Tom, CORRECT: {q1, q3}, WRONG : {q2, q4} },
{user : jim, CORRECT: {q1}, WRONG : {q2, q3, q4} },
{user : Tom, CORRECT: {q6}, WRONG : {7} },
I'd like to use aggregation to get a count of each CORRECT\WRONG per user, i.e.
{user : Tom, correctCount : 3, wrongCount : 3},
{user : jim, correctCount : 1, wrongCount : 3},
What I've tried is this:
Aggregation agg = newAggregation(
group("name").
addToSet(correct).as(correct).
addToSet(wrong).as(wrong).
addToSet(partial).as(partial)
);
But for each user I get the full list of data (i.e. q1, q2, q3...). I can always call size() on that list, but that's inefficient. How can I get the count value instead?
Thanks
One way you can go about this is to create an extra field that has the size of those arrays using the $size operator in the $project pipeline step and then group the documents in the $group pipeline to get the accumulated sum as the counts on the new size field:
Mongo shell:
db.collection.aggregate([
{
"$project": {
"user": 1,
"correctSize": { "$size": "$CORRECT" },
"wrongSize": { "$size": "$WRONG" }
}
},
{
"$group": {
"_id": "$user",
"correctCount": { "$sum": "$correctSize" },
"wrongCount": { "$sum": "$wrongSize" }
}
}
])
Java: use SpEL andExpression in the project step to use the $size expression
import static org.springframework.data.mongodb.core.aggregation.Expressions.*; //new
...
Aggregation agg = newAggregation(
project("user")
.andExpression(expression("size", field("CORRECT"))).as("correctSize")
.andExpression(expression("size", field("WRONG"))).as("wrongSize"),
group("user")
.sum("correctSize").as("correctCount")
.sum("wrongSize").as("wrongCount")
);
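For a quick sanity check of what the $project + $group pair computes, here is a plain-Java sketch on the sample data from the question. The class and method names are hypothetical and nothing here touches MongoDB; it just takes the size of each array per document and sums the sizes per user.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java imitation of "$size per document, then $sum per user"; hypothetical names.
final class AnswerCountSketch {

    static final class Attempt {
        final String user;
        final List<String> correct;
        final List<String> wrong;
        Attempt(String user, List<String> correct, List<String> wrong) {
            this.user = user;
            this.correct = correct;
            this.wrong = wrong;
        }
    }

    // Returns user -> {correctCount, wrongCount}, keeping first-seen user order.
    static Map<String, int[]> countPerUser(List<Attempt> attempts) {
        Map<String, int[]> totals = new LinkedHashMap<>();
        for (Attempt attempt : attempts) {
            int[] counts = totals.computeIfAbsent(attempt.user, user -> new int[2]);
            counts[0] += attempt.correct.size(); // correctSize summed into correctCount
            counts[1] += attempt.wrong.size();   // wrongSize summed into wrongCount
        }
        return totals;
    }

    // The three documents from the question.
    static List<Attempt> sampleAttempts() {
        return Arrays.asList(
            new Attempt("Tom", Arrays.asList("q1", "q3"), Arrays.asList("q2", "q4")),
            new Attempt("jim", Arrays.asList("q1"), Arrays.asList("q2", "q3", "q4")),
            new Attempt("Tom", Arrays.asList("q6"), Arrays.asList("7")));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, int[]> entry : countPerUser(sampleAttempts()).entrySet()) {
            System.out.println(entry.getKey()
                + " correctCount=" + entry.getValue()[0]
                + " wrongCount=" + entry.getValue()[1]);
        }
        // Tom correctCount=3 wrongCount=3
        // jim correctCount=1 wrongCount=3
    }
}
```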

mongo + spring data + aggregate sum

I am looking for a solution without spring data. My project requirement is to do without spring data.
I can calculate the sum with the aggregate function in the mongo shell and get the expected output. But when I do the same using spring data, I get an exception.
Sample mongo query :
db.getCollection('events_collection').aggregate(
{ "$match" : { "store_no" : 3201 , "event_id" : 882800} },
{ "$group" : { "_id" : "$load_dt", "event_id": { "$first" : "$event_id" }, "start_dt" : { "$first" : "$start_dt" }, "count" : { "$sum" : 1 } } },
{ "$sort" : { "_id" : 1 } },
{ "$project" : { "load_dt" : "$_id", "ksn_cnt" : "$count", "event_id" : 1, "start_dt" : 1, "_id" : 0 } }
)
Same thing done in java as,
String json = "[ { \"$match\": { \"store_no\": 3201, \"event_id\": 882800 } }, { \"$group\": { \"_id\": \"$load_dt\", \"event_id\": { \"$first\": \"$event_id\" }, \"start_dt\": { \"$first\": \"$start_dt\" }, \"count\": { \"$sum\": 1 } } }, { \"$sort\": { \"_id\": 1 } }, { \"$project\": { \"load_dt\": \"$_id\", \"ksn_cnt\": \"$count\", \"event_id\": 1, \"start_dt\": 1, \"_id\": 0 } } ]";
BasicDBList pipeline = (BasicDBList) JSON.parse(json);
System.out.println(pipeline);
AggregationOutput output = col.aggregate(pipeline);
The exception is:
com.mongodb.CommandFailureException: { "serverUsed" : "somrandomserver/10.10.10.10:27001" , "errmsg" : "exception: pipeline element 0 is not an object" , "code" : 15942 , "ok" : 0.0}
Could someone please suggest how to use aggregate function with spring?
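Try the following (untested) Spring Data MongoDB aggregation equivalent:
import static org.springframework.data.domain.Sort.Direction.ASC;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.query.Criteria;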
MongoTemplate mongoTemplate = repository.getMongoTemplate();
Aggregation agg = newAggregation(
match(Criteria.where("store_no").is(3201).and("event_id").is(882800)),
group("load_dt")
.first("event_id").as("event_id")
.first("start_dt").as("start_dt")
.count().as("ksn_cnt"),
sort(ASC, previousOperation()),
project("ksn_cnt", "event_id", "start_dt")
.and("load_dt").previousOperation()
.and(previousOperation()).exclude()
);
AggregationResults<OutputType> result = mongoTemplate.aggregate(agg,
"events_collection", OutputType.class);
List<OutputType> mappedResult = result.getMappedResults();
As a first step, filter the input collection with a match operation, which accepts a Criteria query as an argument.
In the second step, group the filtered documents by the "load_dt" field, calculate the document count, and store the result in the new field "ksn_cnt".
In the third step, sort the intermediate result by the id reference of the previous group operation, as given by the previousOperation() method.
Finally, in the fourth step, select the "ksn_cnt", "event_id", and "start_dt" fields from the previous group operation. Note that "load_dt" again implicitly references the group id field. Since you do not want the implicitly generated id to appear, exclude the id from the previous operation via and(previousOperation()).exclude().
Note that if you provide an input class as the first parameter to the newAggregation method, MongoTemplate derives the name of the input collection from this class. Otherwise, if you do not specify an input class, you must provide the name of the input collection explicitly. If both an input class and an input collection are provided, the latter takes precedence.
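If it helps to see the four stages without any framework, here is a plain-Java sketch of the same match, group, sort and project sequence on in-memory data. The class names, the sample events and the assumption that load_dt is an ISO-style date string (so that string order equals date order) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the match/group/sort/project pipeline; hypothetical names and data.
final class PipelineSketch {

    static final class Event {
        final int storeNo;
        final int eventId;
        final String loadDt;  // assumed ISO-style date string
        final String startDt;
        Event(int storeNo, int eventId, String loadDt, String startDt) {
            this.storeNo = storeNo;
            this.eventId = eventId;
            this.loadDt = loadDt;
            this.startDt = startDt;
        }
    }

    // One output row per group: load_dt, first event_id, first start_dt, ksn_cnt.
    static final class Row {
        final String loadDt;
        final int eventId;
        final String startDt;
        int ksnCnt;
        Row(String loadDt, int eventId, String startDt) {
            this.loadDt = loadDt;
            this.eventId = eventId;
            this.startDt = startDt;
        }
    }

    static List<Row> run(List<Event> events, int storeNo, int eventId) {
        // A TreeMap keeps groups ordered by key, standing in for the ascending sort on the group id.
        Map<String, Row> groups = new TreeMap<>();
        for (Event event : events) {
            // Stage 1 (match): keep only the requested store and event.
            if (event.storeNo != storeNo || event.eventId != eventId) continue;
            // Stage 2 (group): the first document seen supplies event_id and start_dt.
            Row row = groups.get(event.loadDt);
            if (row == null) {
                row = new Row(event.loadDt, event.eventId, event.startDt);
                groups.put(event.loadDt, row);
            }
            row.ksnCnt++;
        }
        // Stage 4 (project): each Row already exposes load_dt instead of a separate id.
        return new ArrayList<>(groups.values());
    }

    static List<Event> sampleEvents() {
        return Arrays.asList(
            new Event(3201, 882800, "2015-01-02", "2015-01-01"),
            new Event(3201, 882800, "2015-01-01", "2015-01-01"),
            new Event(3201, 882800, "2015-01-02", "2015-01-01"),
            new Event(9999, 882800, "2015-01-01", "2015-01-01"));
    }

    public static void main(String[] args) {
        for (Row row : run(sampleEvents(), 3201, 882800)) {
            System.out.println(row.loadDt + " ksn_cnt=" + row.ksnCnt);
        }
        // 2015-01-01 ksn_cnt=1
        // 2015-01-02 ksn_cnt=2
    }
}
```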
