I want to see how many documents I have in a collection with the same base path. My MongoDB collection is structured like this:
{
{_id:1,
tag1: a,
path: C:\Users\A\Downloads\1\qwerty
},
{
_id: 2,
tag1: b,
path: C:\Users\A\Downloads\2\abcd
},
{
_id: 3,
tag1: alfa,
path: C:\Users\A\Documents\3\fsdf
},
{
_id: 4,
tag1: beta,
path: C:\Users\A\Documents\4\aaa
}
}
I want to search, for example, how many elements there are in C:\Users\A\Downloads and how many elements there are in C:\Users\A\Documents. How can I do it?
Assuming you supply the base path whose documents you want to count, the following regex query counts all the documents whose path field value starts with "C:\Users\A\Downloads\".
db.paths.find( { path: /^C:\\Users\\A\\Downloads\\/ } ).count()
Code Using MongoDB Java Driver:
Pattern p = Pattern.compile("^C:\\\\Users\\\\A\\\\Documents\\\\");
Bson queryFilter = regex("path", p);
long count = collection.countDocuments(queryFilter);
The following is the data I am using; there are 4 documents. When I run the code I get a count of 2 (which is correct and expected as there are two paths which match the pattern "^C:\\Users\\A\\Documents\\").
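The backslash escaping is the easy part to get wrong here, so the following is a minimal in-memory sketch of the same anchored-prefix match. It assumes the four path values from the question rather than a live collection, and the class and method names (BasePathCount, prefixPattern, countUnder) are hypothetical. It uses Pattern.quote so that the base path does not have to be escaped by hand.

```java
import java.util.List;
import java.util.regex.Pattern;

class BasePathCount {
    // The four path values from the question, written as Java string literals.
    static final List<String> SAMPLE_PATHS = List.of(
        "C:\\Users\\A\\Downloads\\1\\qwerty",
        "C:\\Users\\A\\Downloads\\2\\abcd",
        "C:\\Users\\A\\Documents\\3\\fsdf",
        "C:\\Users\\A\\Documents\\4\\aaa");

    // Anchored prefix pattern: the quoted base path followed by one literal backslash.
    static Pattern prefixPattern(String basePath) {
        return Pattern.compile("^" + Pattern.quote(basePath) + "\\\\");
    }

    // Counts the paths that start with basePath plus a path separator.
    static long countUnder(List<String> paths, String basePath) {
        Pattern p = prefixPattern(basePath);
        return paths.stream().filter(path -> p.matcher(path).find()).count();
    }

    public static void main(String[] args) {
        System.out.println(countUnder(SAMPLE_PATHS, "C:\\Users\\A\\Downloads")); // 2
        System.out.println(countUnder(SAMPLE_PATHS, "C:\\Users\\A\\Documents")); // 2
    }
}
```

Requiring the trailing separator means a base path such as C:\Users\A\Down does not accidentally match the Downloads entries.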
Using the same data as shown in the Compass screenshot, the following aggregation
db.paths.aggregate( [
{
$group : {
_id : "Counts",
Documents: {
$sum: {
$cond: [ { $regexMatch: { input: "$path" , regex: /^C:\\Users\\A\\Documents\\/ } }, 1, 0 ]
}
},
Downloads: {
$sum: {
$cond: [ { $regexMatch: { input: "$path" , regex: /^C:\\Users\\A\\Downloads\\/ } }, 1, 0 ]
}
}
}
},
] )
prints:
{ "_id" : "Counts", "Documents" : 2, "Downloads" : 1 }
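To see what the $sum over $cond is doing, here is a rough in-memory sketch of the same idea: one pass over the data, with each label adding 1 when its prefix matches and 0 otherwise. It is only an illustration under stated assumptions: it uses the four paths from the question (not the screenshot data, so both counts come out as 2), it swaps the regex for a plain String.startsWith check, and the names ConditionalSums, countByPrefix and sampleCounts are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ConditionalSums {
    // The four path values from the question.
    static final List<String> SAMPLE_PATHS = List.of(
        "C:\\Users\\A\\Downloads\\1\\qwerty",
        "C:\\Users\\A\\Downloads\\2\\abcd",
        "C:\\Users\\A\\Documents\\3\\fsdf",
        "C:\\Users\\A\\Documents\\4\\aaa");

    // Single pass: every label accumulates 1 or 0 per path, like $sum of $cond.
    static Map<String, Integer> countByPrefix(List<String> paths, Map<String, String> prefixByLabel) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String label : prefixByLabel.keySet()) {
            counts.put(label, 0);
        }
        for (String path : paths) {
            for (Map.Entry<String, String> e : prefixByLabel.entrySet()) {
                int increment = path.startsWith(e.getValue()) ? 1 : 0;
                counts.merge(e.getKey(), increment, Integer::sum);
            }
        }
        return counts;
    }

    // Convenience wrapper that applies the two prefixes used in the aggregation.
    static Map<String, Integer> sampleCounts() {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("Documents", "C:\\Users\\A\\Documents\\");
        prefixes.put("Downloads", "C:\\Users\\A\\Downloads\\");
        return countByPrefix(SAMPLE_PATHS, prefixes);
    }

    public static void main(String[] args) {
        System.out.println(sampleCounts()); // {Documents=2, Downloads=2}
    }
}
```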
The Java code for the above aggregation:
Pattern docPattern = Pattern.compile("^C:\\\\Users\\\\A\\\\Documents\\\\");
Pattern downloadPattern = Pattern.compile("^C:\\\\Users\\\\A\\\\Downloads\\\\");
List<Bson> pipeline =
Arrays.asList(new Document("$group",
new Document("_id", "Counts")
.append("document_counts",
new Document("$sum",
new Document("$cond",
Arrays.asList(
new Document("$regexMatch",
new Document("input", "$path")
.append("regex", docPattern)),
1L, 0L
)
)
)
)
.append("download_counts",
new Document("$sum",
new Document("$cond",
Arrays.asList(
new Document("$regexMatch",
new Document("input", "$path")
.append("regex", downloadPattern)),
1L, 0L
)
)
)
)
),
project(excludeId())
);
List<Document> results = new ArrayList<>();
collection.aggregate(pipeline).into(results);
results.forEach(System.out::println);
The result document:
Document{ { document_counts=2, download_counts=1 } }
I'm new to working with MongoDB and do not know a lot of things.
I need to write an aggregation request.
Here is the JSON document structure.
{
"_id" : ObjectId("5a72f7a75ef7d430e8c462d2"),
"crawler_id" : ObjectId("5a71cbb746e0fb0007adc6c2"),
"skill" : "stack",
"created_date" : ISODate("2018-02-01T13:19:03.522+0000"),
"modified_date" : ISODate("2018-02-01T13:22:23.078+0000"),
"connects" : [
{
"subskill" : "we’re",
"weight" : NumberInt(1),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec11")
]
},
{
"subskill" : "b1",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec11"),
ObjectId("5a71d88d5ef7d41964fbec1b")
]
},
{
"subskill" : "making",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec1b"),
ObjectId("5a71d88d5ef7d41964fbec1c")
]
},
{
"subskill" : "delivery",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec1c"),
ObjectId("5a71d88d5ef7d41964fbec1e")
]
}
]
}
I need the result to return the name of the skill and the number of unique parser_id values.
In this case, the result should be:
[
{
"skill": "stack",
"quantity": 4
}
]
where "stack" - skill name,
and "quantity" - count of unique parser_id.
ObjectId("5a71d88d5ef7d41964fbec11")
ObjectId("5a71d88d5ef7d41964fbec1b")
ObjectId("5a71d88d5ef7d41964fbec1c")
ObjectId("5a71d88d5ef7d41964fbec1e")
Can someone help me with this request?
Given the document supplied in your question, this command ...
db.collection.aggregate([
{ $unwind: "$connects" },
// parser_id is itself an array, so unwind it too in order to group on individual ids
{ $unwind: "$connects.parser_id" },
// count all occurrences
{ "$group": { "_id": {skill: "$skill", parser_id: "$connects.parser_id"}, "count": { "$sum": 1 } }},
// sum all occurrences and count distinct
{ "$group": { "_id": "$_id.skill", "quantity": { "$sum": 1 } }},
// (optional) rename the '_id' attribute to 'skill'
{ $project: { 'skill': '$_id', 'quantity': 1, _id: 0 } }
])
... will return:
{
"quantity" : 4,
"skill" : "stack"
}
The above command groups by skill and connects.parser_id and then gets a distinct count of those groups.
Your question includes the java tag, so I suspect you are looking to execute the same command using the MongoDB Java driver. The code below (using MongoDB Java driver v3.x) will return the same result:
MongoClient mongoClient = ...;
MongoCollection<Document> collection = mongoClient.getDatabase("...").getCollection("...");
List<Document> documents = collection.aggregate(Arrays.asList(
Aggregates.unwind("$connects"),
// parser_id is itself an array, so unwind it too in order to group on individual ids
Aggregates.unwind("$connects.parser_id"),
new Document("$group", new Document("_id", new Document("skill", "$skill").append("parser_id", "$connects.parser_id"))
.append("count", new Document("$sum", 1))),
new Document("$group", new Document("_id", "$_id.skill").append("quantity", new Document("$sum", 1))),
new Document("$project", new Document("skill", "$_id").append("quantity", 1).append("_id", 0))
)).into(new ArrayList<>());
for (Document document : documents) {
logger.info("{}", document.toJson());
}
Note: for the $group and $project stages this code deliberately uses the form new Document(<pipeline operator>, ...) instead of the Aggregates utilities, to make it easier to see the translation between the shell command and its Java equivalent.
Try $project with $reduce.
$setUnion keeps only the distinct ids, and $size then returns the number of elements in the resulting array.
db.col.aggregate(
[
{$project : {
_id : 0,
skill : 1,
quantity : {$size :{$reduce : {input : "$connects.parser_id", initialValue : [] , in : {$setUnion : ["$$value", "$$this"]}}}}
}
}
]
).pretty()
result
{ "skill" : "stack", "quantity" : 4 }
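For anyone who finds the $reduce / $setUnion combination hard to read, the following is a small in-memory sketch of the same fold. It assumes the four parser_id arrays from the question's document, represented as hex strings, and the class and method names (SetUnionReduce, unionAll) are hypothetical. The set starts empty, like initialValue, and each array is unioned into it in turn.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetUnionReduce {
    // The parser_id arrays of the four "connects" entries in the question's document.
    static final List<List<String>> PARSER_ID_ARRAYS = List.of(
        List.of("5a71d88d5ef7d41964fbec11"),
        List.of("5a71d88d5ef7d41964fbec11", "5a71d88d5ef7d41964fbec1b"),
        List.of("5a71d88d5ef7d41964fbec1b", "5a71d88d5ef7d41964fbec1c"),
        List.of("5a71d88d5ef7d41964fbec1c", "5a71d88d5ef7d41964fbec1e"));

    // Mirrors $reduce: "value" plays the role of $$value, "current" the role of $$this.
    static Set<String> unionAll(List<List<String>> arrays) {
        Set<String> value = new LinkedHashSet<>();   // initialValue: []
        for (List<String> current : arrays) {
            value.addAll(current);                   // $setUnion of value and current
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(unionAll(PARSER_ID_ARRAYS).size()); // 4
    }
}
```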
I am trying to fetch all documents in a collection where any field of the document matches any of the listed regular expressions.
Consider the scenario below.
Users can create documents with whatever field names they wish in a collection,
such as
document1 => { "_id":1, "card" : 1234 , "status": 4}
document2 => {"_id": ***, "Housenumber" : 356/78 , "value" : null}
------
documentn =>{ "_id" : ObjectId("4ecd2e33dd68c9021e453d12"), "searchword" : "win" }
------
Field names are not the same for all the documents in a collection.
The regular expressions can be: "/^(^456$|^win$............etc)/"
I tried to get the keys dynamically and run a find query as shown below:
----------
table = db.getCollection(coll);
DBObject dataKeys = table.findOne();
Set<String> keys = dataKeys.keySet();
Iterator<String> iterator = keys.iterator();
while(iterator.hasNext()){
String key = iterator.next();
regexQuery.put(key, new BasicDBObject("$regex", "^((^(([0-9]{4}[-. _]?)$)|"
+ "(^[a-zA-Z0-9._%+-]...........................0-9]$$").append("$options", "i"));
DBCursor cursor = table.find(regexQuery);
while (cursor.hasNext()) {
System.out.println(cursor.next());
I can see that the key value is coming through properly, but the query is not fetching the matching documents.
I am new to MongoDB and followed the above approach after googling it.
If you are looking to regex match on the field names (not the values), then use $objectToArray to turn the field names (LHS) into expression-worthy values (RHS):
var r = [
{ _id: 1, name: "buzz", addr: "here"}
,{ _id: 2, searchword: "win", value: 6}
,{ _id: 3, game:0, word: "foo", fruit: "apple", fame: 7}
,{ _id: 4, qval:23}
];
db.foo.insert(r);
var rin = [ /ame/, /^val/ ]; // list of regex
db.foo.aggregate([
{$project: {x: {$objectToArray: "$$CURRENT"}}}
,{$unwind: "$x"}
,{$match: {"x.k": {$in: rin}}}
]);
{ "_id" : 1, "x" : { "k" : "name", "v" : "buzz" } }
{ "_id" : 2, "x" : { "k" : "value", "v" : 6 } }
{ "_id" : 3, "x" : { "k" : "game", "v" : 0 } }
{ "_id" : 3, "x" : { "k" : "fame", "v" : 7 } }
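The same idea can be sketched in plain Java to show what the pipeline does: turn each document into key/value pairs, then keep the pairs whose key matches any regex in the list. This is only an in-memory illustration using the four sample documents above. The names FieldNameMatch, doc and matchingFields are hypothetical, and documents are modelled as insertion-ordered maps rather than driver objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class FieldNameMatch {
    // Builds an insertion-ordered document from alternating key/value arguments.
    static Map<String, Object> doc(Object... kv) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            d.put((String) kv[i], kv[i + 1]);
        }
        return d;
    }

    // The four sample documents inserted in the shell example.
    static final List<Map<String, Object>> DOCS = List.of(
        doc("_id", 1, "name", "buzz", "addr", "here"),
        doc("_id", 2, "searchword", "win", "value", 6),
        doc("_id", 3, "game", 0, "word", "foo", "fruit", "apple", "fame", 7),
        doc("_id", 4, "qval", 23));

    // Same list of regexes as "rin" in the shell example.
    static final List<Pattern> NAME_PATTERNS =
        List.of(Pattern.compile("ame"), Pattern.compile("^val"));

    // Returns "<_id>:<field>" for every field whose name matches any pattern.
    static List<String> matchingFields(List<Map<String, Object>> docs, List<Pattern> patterns) {
        List<String> hits = new ArrayList<>();
        for (Map<String, Object> d : docs) {
            for (String key : d.keySet()) {   // like $objectToArray followed by $unwind
                if (patterns.stream().anyMatch(p -> p.matcher(key).find())) {   // like $match with $in
                    hits.add(d.get("_id") + ":" + key);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(matchingFields(DOCS, NAME_PATTERNS));
        // [1:name, 2:value, 3:game, 3:fame]
    }
}
```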
I'm using MongoDB java driver 3.2.2 to do some aggregation operations, but I'm not sure if something could be achieved through it.
The original query in MongoDB is:
db.getCollection('report').aggregate({
$group: {
_id: "$company_id",
count: {
$sum: {
$cond: [{
$eq: ["$idcard.status", "normal"]
},0,1]
}
}
}
})
I have no idea of how to put the "$cond" as a parameter of "$sum" operator in Java driver in the code section below:
AggregateIterable<Document> res = col.aggregate(Arrays.asList(
group("$company_id",
sum("count", ...)
)));
I've searched the official documentation about this with no result. Does anyone have experience doing this? Thanks.
For 3.x drivers
Using BsonDocument : Type Safe Version
BsonArray cond = new BsonArray();
BsonArray eq = new BsonArray();
eq.add(new BsonString("$idcard.status"));
eq.add(new BsonString("normal"));
cond.add(new BsonDocument("$eq", eq));
cond.add(new BsonInt64(0));
cond.add(new BsonInt64(1));
AggregateIterable<BsonDocument> aggregate = dbCollection.aggregate(Arrays.asList(
group("$company_id",
sum("count", new BsonDocument("$cond", cond))
)));
Using Document - Less Code but Not Type Safe
List<Object> cond = new ArrayList<>();
cond.add(new Document("$eq", Arrays.asList("$idcard.status", "normal")));
cond.add(0);
cond.add(1);
AggregateIterable<Document> aggregate = dbCollection.aggregate(Arrays.asList(
group("$company_id",
sum("count", new Document("$cond", cond))
)));
To use $cond in Java, build each JSON array as an ArrayList.
{ $cond: [ { $eq: ["$idcard.status", "normal"] }, 0, 1 ] }
// To achieve this - [ "$idcard.status", "normal" ]
ArrayList<Object> eqArrayList = new ArrayList<>();
eqArrayList.add("$idcard.status");
eqArrayList.add("normal");
// To achieve this - [ { $eq: [ "$idcard.status", "normal" ] }, 0, 1 ]
ArrayList<Object> condArray = new ArrayList<>();
condArray.add(new BasicDBObject("$eq", eqArrayList));
condArray.add(0);
condArray.add(1);
// Finally - { $cond: [ { $eq: [ "$idcard.status", "normal" ] }, 0, 1 ] }
BasicDBObject fullCond = new BasicDBObject("$cond", condArray);
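The general rule behind all of these variants is that a JSON object becomes a map and a JSON array becomes a list. The driver's Document class implements Map<String, Object>, so the nesting below should carry over directly. This sketch uses only java.util so that it runs without the driver; the helper names (CondStructure, eq, cond, statusCond) are hypothetical, and the values 0 and 1 are in the order used by the shell query in the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

class CondStructure {
    // { $eq: [ <fieldRef>, <value> ] }
    static Map<String, Object> eq(String fieldRef, Object value) {
        return Collections.singletonMap("$eq", Arrays.asList(fieldRef, value));
    }

    // { $cond: [ <condition>, <thenValue>, <elseValue> ] }
    static Map<String, Object> cond(Object condition, Object thenValue, Object elseValue) {
        return Collections.singletonMap("$cond", Arrays.asList(condition, thenValue, elseValue));
    }

    // The expression from the question: 0 when status is "normal", otherwise 1.
    static Map<String, Object> statusCond() {
        return cond(eq("$idcard.status", "normal"), 0, 1);
    }

    public static void main(String[] args) {
        System.out.println(statusCond());
        // {$cond=[{$eq=[$idcard.status, normal]}, 0, 1]}
    }
}
```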
Also see: MongoDB aggregation condition translated to JAVA driver
I have the following DB data:
{user : Tom, CORRECT: {q1, q3}, WRONG : {q2, q4} },
{user : jim, CORRECT: {q1}, WRONG : {q2, q3, q4} },
{user : Tom, CORRECT: {q6}, WRONG : {7} },
I'd like to use aggregation to get a count of each CORRECT\WRONG per user, i.e.
{user : Tom, correctCount : 3, wrongCount : 3},
{user : jim, correctCount : 1, wrongCount : 3},
What I've tried is this:
Aggregation agg = newAggregation(
group("name").
addToSet(correct).as(correct).
addToSet(wrong).as(wrong).
addToSet(partial).as(partial)
);
But for each user I get the full list of data (i.e. q1, q2, q3...). I can always call size() on that list, but it's inefficient. How can I get the count value instead?
Thanks
One way to go about this is to add fields that hold the sizes of those arrays, using the $size operator in a $project pipeline step, and then group the documents by user in a $group step that sums those size fields to get the counts:
Mongo shell:
db.collection.aggregate([
{
"$project": {
"user": 1,
"correctSize": { "$size": "$CORRECT" },
"wrongSize": { "$size": "$WRONG" }
}
},
{
"$group": {
"_id": "$user",
"correctCount": { "$sum": "$correctSize" },
"wrongCount": { "$sum": "$wrongSize" }
}
}
])
Java: use SpEL andExpression in the project step to use the $size expression
import static org.springframework.data.mongodb.core.aggregation.Expressions.*; //new
...
Aggregation agg = newAggregation(
project("user")
.andExpression(expression("size", field("CORRECT"))).as("correctSize")
.andExpression(expression("size", field("WRONG"))).as("wrongSize"),
group("user")
.sum("correctSize").as("correctCount")
.sum("wrongSize").as("wrongCount")
);
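To make the two steps concrete without any framework, here is a rough in-memory sketch of the same computation: take the size of each array per document, then sum the sizes per user. It assumes the three documents from the question, modelled with a hypothetical Doc class, and it is not driver or Spring code; the names AnswerCounts and countsPerUser are made up for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AnswerCounts {
    // Hypothetical in-memory stand-in for one document of the collection.
    static final class Doc {
        final String user;
        final List<String> correct;
        final List<String> wrong;

        Doc(String user, List<String> correct, List<String> wrong) {
            this.user = user;
            this.correct = correct;
            this.wrong = wrong;
        }
    }

    // The three documents from the question, copied as written there.
    static final List<Doc> DOCS = List.of(
        new Doc("Tom", List.of("q1", "q3"), List.of("q2", "q4")),
        new Doc("jim", List.of("q1"), List.of("q2", "q3", "q4")),
        new Doc("Tom", List.of("q6"), List.of("7")));

    // Returns user -> { correctCount, wrongCount }.
    static Map<String, int[]> countsPerUser(List<Doc> docs) {
        Map<String, int[]> totals = new LinkedHashMap<>();
        for (Doc d : docs) {
            int[] t = totals.computeIfAbsent(d.user, u -> new int[2]);
            t[0] += d.correct.size();   // $size of CORRECT, accumulated like $sum
            t[1] += d.wrong.size();     // $size of WRONG, accumulated like $sum
        }
        return totals;
    }

    public static void main(String[] args) {
        countsPerUser(DOCS).forEach((user, t) ->
            System.out.println(user + " correctCount=" + t[0] + " wrongCount=" + t[1]));
        // Tom correctCount=3 wrongCount=3
        // jim correctCount=1 wrongCount=3
    }
}
```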
I am looking for a solution without spring data. My project requirement is to do without spring data.
I can calculate the sum with the aggregate function from the mongo shell and get the expected output, but doing the same thing using Spring Data throws an exception.
Sample mongo query :
db.getCollection('events_collection').aggregate(
{ "$match" : { "store_no" : 3201 , "event_id" : 882800} },
{ "$group" : { "_id" : "$load_dt", "event_id": { "$first" : "$event_id" }, "start_dt" : { "$first" : "$start_dt" }, "count" : { "$sum" : 1 } } },
{ "$sort" : { "_id" : 1 } },
{ "$project" : { "load_dt" : "$_id", "ksn_cnt" : "$count", "event_id" : 1, "start_dt" : 1, "_id" : 0 } }
)
The same thing done in Java:
String json = "[ { \"$match\": { \"store_no\": 3201, \"event_id\": 882800 } }, { \"$group\": { \"_id\": \"$load_dt\", \"event_id\": { \"$first\": \"$event_id\" }, \"start_dt\": { \"$first\": \"$start_dt\" }, \"count\": { \"$sum\": 1 } } }, { \"$sort\": { \"_id\": 1 } }, { \"$project\": { \"load_dt\": \"$_id\", \"ksn_cnt\": \"$count\", \"event_id\": 1, \"start_dt\": 1, \"_id\": 0 } } ]";
BasicDBList pipeline = (BasicDBList) JSON.parse(json);
System.out.println(pipeline);
AggregationOutput output = col.aggregate(pipeline);
The exception is:
com.mongodb.CommandFailureException: { "serverUsed" : "somrandomserver/10.10.10.10:27001" , "errmsg" : "exception: pipeline element 0 is not an object" , "code" : 15942 , "ok" : 0.0}
Could someone please suggest how to use aggregate function with spring?
Try the following (untested) Spring Data MongoDB aggregation equivalent:
import static org.springframework.data.domain.Sort.Direction.ASC;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
MongoTemplate mongoTemplate = repository.getMongoTemplate();
Aggregation agg = newAggregation(
match(Criteria.where("store_no").is(3201).and("event_id").is(882800)),
group("load_dt")
.first("event_id").as("event_id")
.first("start_dt").as("start_dt")
.count().as("ksn_cnt"),
sort(ASC, previousOperation()),
project("ksn_cnt", "event_id", "start_dt")
.and("load_dt").previousOperation()
.and(previousOperation()).exclude()
);
AggregationResults<OutputType> result = mongoTemplate.aggregate(agg,
"events_collection", OutputType.class);
List<OutputType> mappedResult = result.getMappedResults();
As a first step, filter the input collection by using a match operation which accepts a Criteria query as an argument.
In the second step, group the filtered documents by the "load_dt" field, count the documents in each group, and store the result in the new field "ksn_cnt".
In the third step, sort the intermediate result by the id-reference of the previous group operation, as given by the previousOperation() method.
Finally, in the fourth step, select the "ksn_cnt", "event_id", and "start_dt" fields from the previous group operation. Note that "load_dt" again implicitly references a group-id field. Since you do not want an implicitly generated id to appear, exclude the id from the previous operation via and(previousOperation()).exclude().
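The steps above can be pictured with a small in-memory sketch: filter, group by load_dt with a count, and return the groups sorted by load_dt. It is not Spring Data code. The Event class, the sample rows and the date strings are all hypothetical, made up only to show the shape of the computation, and the first() accumulators are left out for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class EventCounts {
    // Hypothetical stand-in for one document of events_collection.
    static final class Event {
        final int storeNo;
        final int eventId;
        final String loadDt;

        Event(int storeNo, int eventId, String loadDt) {
            this.storeNo = storeNo;
            this.eventId = eventId;
            this.loadDt = loadDt;
        }
    }

    // Hypothetical sample rows; only the first three pass the match step.
    static final List<Event> SAMPLE = List.of(
        new Event(3201, 882800, "2015-01-02"),
        new Event(3201, 882800, "2015-01-01"),
        new Event(3201, 882800, "2015-01-02"),
        new Event(3201, 111111, "2015-01-01"),
        new Event(9999, 882800, "2015-01-01"));

    // match -> group by load_dt with a count -> sort ascending by load_dt.
    static Map<String, Long> countByLoadDate(List<Event> events, int storeNo, int eventId) {
        Map<String, Long> counts = new TreeMap<>();   // TreeMap keeps keys in ascending order
        for (Event e : events) {
            if (e.storeNo == storeNo && e.eventId == eventId) {   // match step
                counts.merge(e.loadDt, 1L, Long::sum);            // group step with count
            }
        }
        return counts;   // key plays the role of load_dt, value the role of ksn_cnt
    }

    public static void main(String[] args) {
        System.out.println(countByLoadDate(SAMPLE, 3201, 882800));
        // {2015-01-01=1, 2015-01-02=2}
    }
}
```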
Note that if you provide an input class as the first parameter to the newAggregation method, the MongoTemplate will derive the name of the input collection from this class. Otherwise, if you don't specify an input class, you must provide the name of the input collection explicitly. If both an input class and an input collection are provided, the latter takes precedence.