I'm new to MongoDB and still have a lot to learn.
I need to write an aggregation query.
Here is the JSON document structure:
{
"_id" : ObjectId("5a72f7a75ef7d430e8c462d2"),
"crawler_id" : ObjectId("5a71cbb746e0fb0007adc6c2"),
"skill" : "stack",
"created_date" : ISODate("2018-02-01T13:19:03.522+0000"),
"modified_date" : ISODate("2018-02-01T13:22:23.078+0000"),
"connects" : [
{
"subskill" : "we’re",
"weight" : NumberInt(1),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec11")
]
},
{
"subskill" : "b1",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec11"),
ObjectId("5a71d88d5ef7d41964fbec1b")
]
},
{
"subskill" : "making",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec1b"),
ObjectId("5a71d88d5ef7d41964fbec1c")
]
},
{
"subskill" : "delivery",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec1c"),
ObjectId("5a71d88d5ef7d41964fbec1e")
]
}
]
}
I need the result to contain the skill name and the number of unique parser_id values.
In this case, the result should be:
[
{
"skill": "stack",
"quantity": 4
}
]
where "skill" is the skill name
and "quantity" is the count of unique parser_id values:
ObjectId("5a71d88d5ef7d41964fbec11")
ObjectId("5a71d88d5ef7d41964fbec1b")
ObjectId("5a71d88d5ef7d41964fbec1c")
ObjectId("5a71d88d5ef7d41964fbec1e")
Can someone help me with this query?
Given the document supplied in your question, this command ...
db.collection.aggregate([
// one document per element of connects
{ $unwind: "$connects" },
// one document per individual parser_id, so that IDs (not whole arrays) are compared
{ $unwind: "$connects.parser_id" },
// count all occurrences of each (skill, parser_id) pair
{ $group: { _id: { skill: "$skill", parser_id: "$connects.parser_id" }, count: { $sum: 1 } } },
// count the distinct (skill, parser_id) groups per skill
{ $group: { _id: "$_id.skill", quantity: { $sum: 1 } } },
// (optional) rename the '_id' attribute to 'skill'
{ $project: { skill: "$_id", quantity: 1, _id: 0 } }
])
... will return:
{
"quantity" : 4,
"skill" : "stack"
}
The command above groups by skill and connects.parser_id, and then counts the distinct groups for each skill.
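To make the counting logic concrete, here is a minimal in-memory sketch in plain Java (no MongoDB driver involved). The class and method names are hypothetical, and the IDs are shortened to the last two hex digits of the ObjectIds in the sample document. It flattens the nested parser_id arrays and counts the distinct values, which is the result the question asks for.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: it only mimics the counting logic on in-memory lists.
final class DistinctParserIdSketch {

    // Each inner list stands for the parser_id array of one "connects" entry.
    static long countDistinct(List<List<String>> parserIdArrays) {
        return parserIdArrays.stream()
                .flatMap(List::stream) // one item per individual parser_id
                .distinct()            // keep each parser_id once
                .count();              // number of unique parser_id values
    }

    public static void main(String[] args) {
        List<List<String>> connects = Arrays.asList(
                Arrays.asList("11"),
                Arrays.asList("11", "1b"),
                Arrays.asList("1b", "1c"),
                Arrays.asList("1c", "1e"));
        System.out.println(countDistinct(connects)); // prints 4
    }
}
```

The sketch works on a single document; the pipeline does the same per skill by including skill in the group key.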
Your question includes the java tag, so I suspect you are looking to execute the same command using the MongoDB Java driver. The code below (using MongoDB Java driver v3.x) will return the same result:
MongoClient mongoClient = ...;
MongoCollection<Document> collection = mongoClient.getDatabase("...").getCollection("...");
List<Document> documents = collection.aggregate(Arrays.asList(
Aggregates.unwind("$connects"),
// unwind the nested array too, so each parser_id is grouped individually
Aggregates.unwind("$connects.parser_id"),
new Document("$group", new Document("_id", new Document("skill", "$skill").append("parser_id", "$connects.parser_id"))
.append("count", new Document("$sum", 1))),
new Document("$group", new Document("_id", "$_id.skill").append("quantity", new Document("$sum", 1))),
new Document("$project", new Document("skill", "$_id").append("quantity", 1).append("_id", 0))
)).into(new ArrayList<>());
for (Document document : documents) {
logger.info("{}", document.toJson());
}
Note: for the $group and $project stages, this code deliberately uses the form new Document(<pipeline stage>, ...) instead of the Aggregates utilities to make it easier to see the translation between the shell command and its Java equivalent.
Try $project with $reduce.
$setUnion keeps only the distinct IDs, and $size then returns the length of the resulting array, which is the distinct count.
db.col.aggregate(
[
{$project : {
_id : 0,
skill : 1,
quantity : {$size :{$reduce : {input : "$connects.parser_id", initialValue : [] , in : {$setUnion : ["$$value", "$$this"]}}}}
}
}
]
).pretty()
Result:
{ "skill" : "stack", "quantity" : 4 }
Related
I have the following MongoDB documents:
{
"_id" : ObjectId("5f283e7d39187d9ab77e7ece"),
"resourceType" : "VM",
"resourceInstanceName" : "virtual_machine_1",
"properties" :
{ "name" : "CentOS-VM", "cpu" : 2, "memory_in_gb" : 2, }
},
{
"_id" : ObjectId("5f28jh58hjf9ab77e7ece"),
"resourceType" : "VM",
"resourceInstanceName" : "virtual_machine_2",
"properties" :
{ "name" : "CentOS-VM", "cpu" : 8, "memory_in_gb" : 8, }
}
I use the following query in the mongo shell, which works fine:
db.collection.aggregate({
$match:
{ "resourceType":"VM" }
}, {
$group: {
_id: '',
instance:
{ $sum: 1 }
,
cpu:
{ $sum: '$properties.cpu' }
,
memory_in_gb:
{ $sum: '$properties.memory_in_gb'}
}
})
and the output was:
{ "_id" : "", "instance" : 2.0, "cpu" : 10, "memory_in_gb" : 10 }
Using Spring Data, I have written the following code to produce the same result, but it ends in an exception:
MatchOperation matchOperation = Aggregation.match(Criteria.where(RESOURCE_TYPE).is("VM"));
UnwindOperation unwindOperation = Aggregation.unwind("properties");
GroupOperation groupOperation = Aggregation.group(ID);
GroupOperation instanceOperation = Aggregation.group().count().as("instance");
GroupOperation cpuOperation = Aggregation.group("properties").sum("properties.cpu").as("cpu");
GroupOperation memoryOperation = Aggregation.group("properties").sum("properties.memory_in_gb").as("memory_in_gb");
Aggregation aggregation = Aggregation.newAggregation(matchOperation,unwindOperation,groupOperation,
instanceOperation, cpuOperation, memoryOperation);
return mongoTemplate.aggregate(aggregation, COLLECTION_NAME, Map.class).getMappedResults();
Here is the stack trace:
[http-nio-9083-exec-2] ERROR c.s.c.c.v.service.ComputeService.getComputeSummary - Error in Cloud Account-summary :java.lang.IllegalArgumentException: Invalid reference 'properties'!
at org.springframework.data.mongodb.core.aggregation.ExposedFieldsAggregationOperationContext.getReference(ExposedFieldsAggregationOperationContext.java:114)
at org.springframework.data.mongodb.core.aggregation.ExposedFieldsAggregationOperationContext.getReference(ExposedFieldsAggregationOperationContext.java:77)
at org.springframework.data.mongodb.core.aggregation.AbstractAggregationExpression.unpack(AbstractAggregationExpression.java:74)
I have the following documents in one collection named as mail_test. Some of them have a tags field which is an array:
/* 1 */
{
"_id" : ObjectId("601a7c3a57c6eb4c1efb84ff"),
"email" : "aaaa#bbb.com",
"content" : "11111"
}
/* 2 */
{
"_id" : ObjectId("601a7c5057c6eb4c1efb8590"),
"email" : "aaaa#bbb.com",
"content" : "22222"
}
/* 3 */
{
"_id" : ObjectId("601a7c6d57c6eb4c1efb8675"),
"email" : "aaaa#bbb.com",
"content" : "33333",
"tags" : [
"x"
]
}
/* 4 */
{
"_id" : ObjectId("601a7c8157c6eb4c1efb86f4"),
"email" : "aaaa#bbb.com",
"content" : "4444",
"tags" : [
"yyy",
"zzz"
]
}
There are two documents with non-empty tags, so I want the result to be 2.
I use the following statement to aggregate and get the correct tag_count:
db.getCollection('mail_test').aggregate([{$group:{
"_id":null,
"all_count":{$sum:1},
"tag_count":{"$sum":{$cond: [ { $ne: ["$tags", undefined] }, 1, 0]}}
//if replace `undefined` with `null`, I got the tag_count as 4, that is not what I want
//I also have tried `$exists`, but it cannot be used here.
}}])
and the result is:
{
"_id" : null,
"all_count" : 4.0,
"tag_count" : 2.0
}
and I use Spring Data MongoDB in Java to do this:
private void test(){
Aggregation agg = Aggregation.newAggregation(
Aggregation.match(new Criteria()),//some condition here
Aggregation.group(Fields.fields()).sum(ConditionalOperators.when(Criteria.where("tags").ne(null)).then(1).otherwise(0)).as("tag_count")
//I need an `undefined` instead of `null`,or is there are any other solution?
);
AggregationResults<MailTestGroupResult> results = mongoTemplate.aggregate(agg, MailTest.class, MailTestGroupResult.class);
List<MailTestGroupResult> mappedResults = results.getMappedResults();
int tag_count = mappedResults.get(0).getTag_count();
System.out.println(tag_count);//get 4,wrong
}
I need an undefined instead of null, but I don't know how to do this. Is there any other solution?
You can use Aggregation operators to check if the field tags exists or not with one of the following constructs in the $group stage of your query (to calculate the tag_count value):
"tag_count":{ "$sum": { $cond: [ { $gt: [ { $size: { $ifNull: ["$tags", [] ] }}, 0 ] }, 1, 0] }}
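// - OR -
"tag_count":{ "$sum": { $cond: [ { $eq: [ { $type: "$tags" }, "array" ] }, 1, 0] }}
Both return the same result (the one you posted) for the sample documents. Note that the second form also counts a document whose tags field is an empty array.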
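As an illustration of what the first construct checks, here is a minimal in-memory sketch in plain Java, without Spring Data or the MongoDB driver. The class and method names are hypothetical. A missing or null tags field is treated as an empty list, as $ifNull does with its [] fallback, and a document is counted only when the list is non-empty.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: mimics the $ifNull / $size / $gt check on in-memory maps.
final class TagCountSketch {

    // True only when "tags" is present and is a non-empty list.
    // (Assumption: a non-list value simply counts as "no tags" here.)
    static boolean hasNonEmptyTags(Map<String, Object> doc) {
        Object tags = doc.get("tags");
        return tags instanceof List && !((List<?>) tags).isEmpty();
    }

    // Equivalent of summing 1 for each document that passes the check.
    static long tagCount(List<Map<String, Object>> docs) {
        return docs.stream().filter(TagCountSketch::hasNonEmptyTags).count();
    }

    private static Map<String, Object> doc(String content, List<String> tags) {
        Map<String, Object> d = new HashMap<>();
        d.put("content", content);
        if (tags != null) {
            d.put("tags", tags); // documents without tags omit the field entirely
        }
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                doc("11111", null),
                doc("22222", null),
                doc("33333", Arrays.asList("x")),
                doc("4444", Arrays.asList("yyy", "zzz")));
        System.out.println(tagCount(docs)); // prints 2
    }
}
```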
I'm trying to fetch all documents in a collection where any field of the document matches any of the listed regular expressions.
Consider the scenario below.
Users can create documents with whatever field names they wish in a collection,
such as
document1 => { "_id":1, "card" : 1234 , "status": 4}
document2 => {"_id": ***, "Housenumber" : 356/78 , "value" : null}
------
documentn =>{ "_id" : ObjectId("4ecd2e33dd68c9021e453d12"), "searchword" : "win" }
------
Field names are not the same for all the documents in a collection.
The regular expressions can be: "/^(^456$|^win$............etc)/"
I tried to get the keys dynamically and run a find query as shown below:
----------
table = db.getCollection(coll);
DBObject dataKeys = table.findOne();
Set<String> keys = dataKeys.keySet();
Iterator<String> iterator = keys.iterator();
while(iterator.hasNext()){
String key = iterator.next();
regexQuery.put(key, new BasicDBObject("$regex", "^((^(([0-9]{4}[-. _]?)$)|"
+ "(^[a-zA-Z0-9._%+-]...........................0-9]$$").append("$options", "i"));
DBCursor cursor = table.find(regexQuery);
while (cursor.hasNext()) {
System.out.println(cursor.next());
I can see that the key value comes through properly, but the query does not fetch the matching documents.
I am new to MongoDB and followed the above approach after googling it.
If you are looking to regex match on the field names (not the values), then use $objectToArray to turn the field names (LHS) into expression-worthy values (RHS):
var r = [
{ _id: 1, name: "buzz", addr: "here"}
,{ _id: 2, searchword: "win", value: 6}
,{ _id: 3, game:0, word: "foo", fruit: "apple", fame: 7}
,{ _id: 4, qval:23}
];
db.foo.insert(r);
var rin = [ /ame/, /^val/ ]; // list of regex
db.foo.aggregate([
{$project: {x: {$objectToArray: "$$CURRENT"}}}
,{$unwind: "$x"}
,{$match: {"x.k": {$in: rin}}}
]);
{ "_id" : 1, "x" : { "k" : "name", "v" : "buzz" } }
{ "_id" : 2, "x" : { "k" : "value", "v" : 6 } }
{ "_id" : 3, "x" : { "k" : "game", "v" : 0 } }
{ "_id" : 3, "x" : { "k" : "fame", "v" : 7 } }
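The same idea, matching a list of regular expressions against field names, can be sketched in plain Java on an in-memory map. This is only an illustration of the matching rule, not driver code; the class and method names are hypothetical, and it assumes an unanchored match (find), which is how the shell regexes above behave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: filters the field names of one document by a list of regexes.
final class FieldNameRegexSketch {

    // Returns the keys matched by at least one pattern, in document order.
    static List<String> matchingKeys(Map<String, Object> doc, List<Pattern> patterns) {
        List<String> hits = new ArrayList<>();
        for (String key : doc.keySet()) {
            for (Pattern p : patterns) {
                if (p.matcher(key).find()) { // unanchored unless the pattern anchors itself
                    hits.add(key);
                    break;                   // report each key once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", 3);
        doc.put("game", 0);
        doc.put("word", "foo");
        doc.put("fruit", "apple");
        doc.put("fame", 7);
        List<Pattern> rin = Arrays.asList(Pattern.compile("ame"), Pattern.compile("^val"));
        System.out.println(matchingKeys(doc, rin)); // prints [game, fame]
    }
}
```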
I'm using mongo-java-driver:3.3.0 and trying to update one value of my sub-document using $inc operator and findOneAndUpdate, but only under certain conditions (id comparison and greaterThan filter).
Following is a snippet to reproduce the problem:
MongoCollection<Document> coll = db.getCollection("update_increase");
Document docBefore = new Document()
.append("subdocs", Arrays.asList(
new Document("id", "AAA").append("count", 10),
new Document("id", "BBB").append("count", 20)
));
coll.insertOne(docBefore);
Document filter = new Document()
.append("subdocs.id", "BBB")
.append("subdocs.count", new Document("$gt", 7));
Document update = new Document()
.append("$inc", new Document("subdocs.$.count", -7));
Document docAfter = coll.findOneAndUpdate(
filter,
update,
new FindOneAndUpdateOptions().returnDocument(ReturnDocument.AFTER));
docBefore:
{ "_id" : { "$oid" : "5819c85977a8cb12f8d706c9" },
"subdocs" : [
{ "id" : "AAA", "count" : 10 },
{ "id" : "BBB", "count" : 20 }
]
}
docAfter:
{ "_id" : { "$oid" : "5819c85977a8cb12f8d706c9" },
"subdocs" : [
{ "id" : "AAA", "count" : 3 },
{ "id" : "BBB", "count" : 20 }
]
}
What I expected is count:13 on the second subdoc (id:"BBB"), but I got an update on the first one (count:3).
This works fine if I remove the greaterThan condition line (.. new Document("$gt", 7) ..):
{ "_id" : { "$oid" : "5819c92577a8cb13404cfc91" },
"subdocs" : [
{ "id" : "AAA", "count" : 10 },
{ "id" : "BBB", "count" : 13 }
]
}
What am I doing wrong?
Thanks!
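Here is the Java equivalent using $elemMatch, which requires a single array element to satisfy both conditions, so the positional $ operator refers to that element: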
Document filter = new Document("subdocs", new Document().append("$elemMatch", new Document().append("id", "BBB").append("count", new Document("$gt", 7))));
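The difference between the two filters can be illustrated with a simplified in-memory model in plain Java. The class and method names are hypothetical, and this is not how the server is implemented; it only shows that "first element with count above 7" and "first element that is BBB and has count above 7" point at different positions for the sample data, which is consistent with the update landing on AAA in the question.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the subdocs array; not driver or server code.
final class ElemMatchSketch {

    static final class Sub {
        final String id;
        final int count;

        Sub(String id, int count) {
            this.id = id;
            this.count = count;
        }
    }

    // $elemMatch-style: index of the first element satisfying BOTH conditions, or -1.
    static int elemMatchIndex(List<Sub> subdocs, String id, int min) {
        for (int i = 0; i < subdocs.size(); i++) {
            Sub s = subdocs.get(i);
            if (s.id.equals(id) && s.count > min) {
                return i;
            }
        }
        return -1;
    }

    // Count condition taken on its own: index of the first element with count > min, or -1.
    static int firstIndexWithCountAbove(List<Sub> subdocs, int min) {
        for (int i = 0; i < subdocs.size(); i++) {
            if (subdocs.get(i).count > min) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Sub> subdocs = Arrays.asList(new Sub("AAA", 10), new Sub("BBB", 20));
        System.out.println(elemMatchIndex(subdocs, "BBB", 7));    // prints 1 (BBB)
        System.out.println(firstIndexWithCountAbove(subdocs, 7)); // prints 0 (AAA)
    }
}
```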
I am not able to remove an object from an array named Matrix based on a Key match.
BasicDBObject where = new BasicDBObject();
where.put("INSTITUTION_ID", instid);
where.put("RuleID", ruleid);
BasicDBObject obj1 = new BasicDBObject();
obj1.put("Matrix.Key",new BasicDBObject("$regex","/"+json.getString("Code")+"$/"));
collection.update(where,new BasicDBObject("$pull", obj1));
The code above does not remove the object from the array. The structure of the array is shown below:
"Matrix" : [
{
"Key" : "6M",
"value" : "Queue"
},
{
"Key" : "6N",
"value" : "Queue"
},
{
"Key" : "6O",
"value" : "Queue"
}]
Command-line client
I suggest that before writing queries in Java notation, you first test them in the mongo console, with the regular JavaScript syntax. The following query works for me.
Data
db.matrix.insert(
{
INSTITUTION_ID: 1,
RuleID: 2,
Matrix: [
{
"Key": "6M",
"value": "Queue"
},
{
"Key": "6N",
"value": "Queue"
},
{
"Key": "6O",
"value": "Queue"
}
]
})
Query
db.matrix.update(
{
INSTITUTION_ID: 1,
RuleID: 2,
},
{
$pull:
{
Matrix:
{
Key:
{
$regex: /M$/
}
}
}
})
Data after the update
{
"INSTITUTION_ID" : 1.0000000000000000,
"RuleID" : 2.0000000000000000,
"Matrix" : [
{
"Key" : "6N",
"value" : "Queue"
},
{
"Key" : "6O",
"value" : "Queue"
}
]
}
Java
I am not sure how this update query should be represented in Java, but try this:
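// append returns the BasicDBObject itself, so the calls can be chained (put does not)
BasicDBObject where =
new BasicDBObject()
.append("INSTITUTION_ID", instid)
.append("RuleID", ruleid);
// pass a compiled Pattern without the JavaScript-style slashes around it
BasicDBObject update =
new BasicDBObject("$pull",
new BasicDBObject("Matrix",
new BasicDBObject("Key",
new BasicDBObject("$regex",
java.util.regex.Pattern.compile(json.getString("Code") + "$")))));
collection.update(where, update);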
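A likely reason the original attempt removed nothing is the pair of slashes placed inside the pattern string: in a string they are ordinary characters, not regex delimiters. The plain-Java sketch below shows the effect with java.util.regex. The class and method names are hypothetical, "M" stands in for json.getString("Code"), and Pattern.quote is an extra precaution (not in the answer above) in case the code contains regex metacharacters. The server uses its own regex engine, so treat this as an illustration of the pattern text only.

```java
import java.util.regex.Pattern;

// Hypothetical helper: compares the slash-wrapped pattern with a plain one.
final class RegexSlashSketch {

    // Pattern for "Key ends with code"; the code is quoted so it is matched literally.
    static Pattern endsWith(String code) {
        return Pattern.compile(Pattern.quote(code) + "$");
    }

    public static void main(String[] args) {
        String code = "M"; // stand-in for json.getString("Code")

        // "/M$/" requires a literal '/' before M and another after the end anchor,
        // so it cannot match "6M".
        Pattern withSlashes = Pattern.compile("/" + code + "$/");
        System.out.println(withSlashes.matcher("6M").find());    // prints false

        System.out.println(endsWith(code).matcher("6M").find()); // prints true
        System.out.println(endsWith(code).matcher("6N").find()); // prints false
    }
}
```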