Definition
I'm building a search application that uses MongoDB to store the search data. Here is a sample of the "Resource" collection:
{
_id:"5b3b84e02360a26f9a9ae96e",
name:"Advanced Java",
keywords:[
"java", "thread", "state", "public", "void"
]
},
{
_id:"5b3b84e02360a26f9a9ae96f",
name:"Java In Simple",
keywords:[
"java", "runnable", "thread", "sleep", "array"
]
}
Each document holds a book name and that book's most frequent words (the keywords array). I'm using the Spring Framework with MongoTemplate. If I run the code below,
MongoOperations mongoOperations = new MongoTemplate(new MongoClient("127.0.0.1", 27017), "ResourceDB");
Query query = new Query(where("keywords").in("java", "thread", "sleep"));
List<Resource> resources = mongoOperations.find(query, Resource.class);
it returns both "Advanced Java" and "Java In Simple", which is correct.
Problem
But in my case I need the results ordered by relevance. "Java In Simple" matches 3 words and "Advanced Java" matches only 2, so "Java In Simple" is the more relevant book and should come first.
Expected order
Java In Simple
Advanced Java
Is it possible to get the results ordered by the number of matches? Or is there any way to get the number of matches for each item? For example, if I search for ("java", "thread", "sleep"), I expect output like this:
Advanced Java - 2 matches
Java In Simple - 3 matches
Any help appreciated.
$in does not count matches: a document qualifies as soon as one of the listed values matches. You need an aggregation pipeline that computes the intersection of keywords with the array from the query and sorts by the size of the result:
db.collection.aggregate([
{ $addFields: {
matchedTags: { $size: {
$setIntersection: [ "$keywords", [ "java", "thread", "sleep" ] ]
} }
} },
{ $match: { matchedTags: { $gt: 0 } } },
{ $sort: { matchedTags: -1 } }
])
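To make the ranking logic concrete, here is a minimal in-memory sketch in plain Java of what the pipeline computes. It does not talk to MongoDB; the class and method names (KeywordRanker, matchCount, rank) are made up for illustration, and the data is the sample from the question.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any MongoDB API.
class KeywordRanker {

    // Number of distinct search terms found in the keyword list
    // (the in-memory counterpart of $size over $setIntersection).
    static int matchCount(List<String> keywords, List<String> searchTerms) {
        Set<String> common = new HashSet<>(keywords);
        common.retainAll(new HashSet<>(searchTerms));
        return common.size();
    }

    // Drops books with no match ($match) and puts the best match first ($sort).
    static List<String> rank(Map<String, List<String>> books, List<String> searchTerms) {
        List<Map.Entry<String, Integer>> scored = new ArrayList<>();
        for (Map.Entry<String, List<String>> book : books.entrySet()) {
            int count = matchCount(book.getValue(), searchTerms);
            if (count > 0) {
                scored.add(new AbstractMap.SimpleEntry<String, Integer>(book.getKey(), count));
            }
        }
        scored.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : scored) {
            lines.add(entry.getKey() + " - " + entry.getValue() + " matches");
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, List<String>> books = new LinkedHashMap<>();
        books.put("Advanced Java", Arrays.asList("java", "thread", "state", "public", "void"));
        books.put("Java In Simple", Arrays.asList("java", "runnable", "thread", "sleep", "array"));
        // Prints:
        // Java In Simple - 3 matches
        // Advanced Java - 2 matches
        for (String line : rank(books, Arrays.asList("java", "thread", "sleep"))) {
            System.out.println(line);
        }
    }
}
```

Doing this in the database, as the pipeline does, avoids pulling every document to the client; the sketch is only meant to show the arithmetic.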
This is for anyone looking to run @Alex Blex's query in Java. It looks like MongoTemplate does not have an implementation for the intersection, so I did it with the MongoDB Java client.
List<String> keywords = Arrays.asList("java", "thread", "sleep");
BasicDBList intersectionList = new BasicDBList();
intersectionList.add("$keywords");
intersectionList.add(keywords);
AggregateIterable<Document> aggregate = new MongoClient("127.0.0.1", 27017).getDatabase("ResourceDB").getCollection("Resource").aggregate(
Arrays.asList(
new BasicDBObject("$addFields",
new BasicDBObject("matchedTags",
new BasicDBObject("$size",
new BasicDBObject("$setIntersection", intersectionList)))),
new BasicDBObject("$match",
new BasicDBObject("matchedTags",
new BasicDBObject("$gt", 0))),
new BasicDBObject("$sort",
new BasicDBObject("matchedTags", -1))
)
);
MongoCursor<Document> iterator = aggregate.iterator();
while (iterator.hasNext()){
Document document = iterator.next();
System.out.println(document.get("name")+" - "+document.get("matchedTags"));
}
Related
I'm using MongoDB Java driver 3.2.2 to do some aggregation operations, but I'm not sure whether the following can be achieved with it.
The original query in MongoDB is:
db.getCollection('report').aggregate({
$group: {
_id: "$company_id",
count: {
$sum: {
$cond: [{
$eq: ["$idcard.status", "normal"]
},0,1]
}
}
}
})
I have no idea how to pass "$cond" as the argument of the "$sum" operator with the Java driver in the code below:
AggregateIterable<Document> res = col.aggregate(Arrays.asList(
group("$company_id",
sum("count", ...)
)));
I've searched the official documentation without success. Does anyone have experience doing this? Thanks.
For 3.x drivers
Using BsonDocument : Type Safe Version
BsonArray cond = new BsonArray();
BsonArray eq = new BsonArray();
eq.add(new BsonString("$idcard.status"));
eq.add(new BsonString("normal"));
cond.add(new BsonDocument("$eq", eq));
cond.add(new BsonInt64(0));
cond.add(new BsonInt64(1));
AggregateIterable<BsonDocument> aggregate = dbCollection.aggregate(Arrays.asList(
group("$company_id",
sum("count", new BsonDocument("$cond", cond))
)));
Using Document - Less Code but Not Type Safe
List cond = new ArrayList();
cond.add(new Document("$eq", Arrays.asList("$idcard.status", "normal")));
cond.add(0);
cond.add(1);
AggregateIterable<Document> aggregate = dbCollection.aggregate(Arrays.asList(
group("$company_id",
sum("count", new Document("$cond", cond))
)));
To use $cond in Java, build its arguments with ArrayList.
{ $cond: [ { $eq: ["$idcard.status", "normal"] }, 0, 1 ] }
// To achieve this - [ "$idcard.status", "normal" ]
ArrayList<Object> eqArrayList = new ArrayList<>();
eqArrayList.add("$idcard.status");
eqArrayList.add("normal");
// To achieve this - [ { $eq: [ "$idcard.status", "normal" ] }, 0, 1 ]
ArrayList<Object> condArray = new ArrayList<>();
condArray.add(new BasicDBObject("$eq", eqArrayList));
condArray.add(0);
condArray.add(1);
// Finally - { $cond: [ { $eq: [ "$idcard.status", "normal" ] }, 0, 1 ] }
BasicDBObject fullCond = new BasicDBObject("$cond", condArray);
Also see: MongoDB aggregation condition translated to JAVA driver
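For readers unsure what the $sum over $cond actually yields, here is a small plain-Java sketch of the same arithmetic, with no driver involved. The class name ConditionalSum, the method countNonNormal and the sample rows are invented for illustration; each row is assumed to be a { company_id, idcard.status } pair.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the MongoDB driver.
class ConditionalSum {

    // Per company, adds 0 for a report whose status is "normal" and 1 otherwise,
    // mirroring { $sum: { $cond: [ { $eq: [ status, "normal" ] }, 0, 1 ] } }.
    static Map<String, Integer> countNonNormal(List<String[]> reports) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] report : reports) {
            String companyId = report[0];
            String status = report[1];
            int increment = "normal".equals(status) ? 0 : 1;
            counts.merge(companyId, increment, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample rows: { company_id, idcard.status }
        List<String[]> reports = Arrays.asList(
            new String[] {"c1", "normal"},
            new String[] {"c1", "expired"},
            new String[] {"c2", "normal"},
            new String[] {"c1", "lost"});
        System.out.println(countNonNormal(reports)); // prints {c1=2, c2=0}
    }
}
```

So the grouped count is the number of reports per company whose status is anything other than "normal"; swapping the 0 and 1 would count the "normal" ones instead.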
I'm trying to get data from MongoDB without repeated values. I want to filter the following data
{"page":"www.abc.com","impressions":1,"position":144}
{"page":"www.abc.com","impressions":1,"position":8}
{"page":"www.xyz.com","impressions":7,"position":4}
{"page":"www.pqr.com","impressions":1,"position":7}
{"page":"www.abc.com","impressions":1,"position":19}
down to the following, keeping only pages that appear once. Any idea how I should do that?
{"page":"www.xyz.com","impressions":7,"position":4}
{"page":"www.pqr.com","impressions":1,"position":7}
In java for mongodb java driver 3.0+ it could be:
public static void main(String[] args) {
try (MongoClient client = new MongoClient("127.0.0.1")) {
MongoCollection<Document> col = client.getDatabase("test").getCollection("test");
Document groupFields = new Document("_id", "$page");
groupFields.put("count", new Document("$sum", 1));
groupFields.put("impressions", new Document("$first", "$impressions"));
groupFields.put("position", new Document("$first", "$position"));
Document matchFields = new Document("count", 1);
Document projectFields = new Document("_id", 0);
projectFields.put("page", "$_id");
projectFields.put("impressions", 1);
projectFields.put("position", 1);
AggregateIterable<Document> output = col.aggregate(Arrays.asList(
new Document("$group", groupFields),
new Document("$match", matchFields),
new Document("$project", projectFields)
));
for (Document doc : output) {
System.out.println(doc);
}
}
}
Output for your db is:
Document{{impressions=1.0, position=7.0, page=www.pqr.com}}
Document{{impressions=7.0, position=4.0, page=www.xyz.com}}
You should be able to run an aggregation pipeline that groups the documents by the page field using the $group pipeline operator, gets a count of the documents using the $sum operator, and retains the other two fields using the $first (or $last) operator.
The next pipeline stage after the $group should filter the grouped documents on the count field, i.e. filter out the duplicates from the result. Use the $match pipeline operator for this.
A final cosmetic stage is $project, which reshapes each document in the stream: it can include, exclude or rename fields and inject computed fields. Here it drops _id, exposes its value as page again, and keeps impressions and position.
Run this aggregation pipeline to get the desired result:
db.collection.aggregate([
{
"$group": {
"_id": "$page",
"count": { "$sum": 1 },
"impressions": { "$first": "$impressions" },
"position": { "$first": "$position" }
}
},
{ "$match": { "count": 1 } },
{
"$project": {
"_id": 0,
"page": "$_id",
"impressions": 1,
"position": 1
}
}
])
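If it helps to see the group-then-filter idea outside of MongoDB, here is a minimal plain-Java sketch over the sample data. It is only an illustration: the UniquePages class, its Row type and uniqueByPage are invented names, and nothing here uses the driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the MongoDB driver.
class UniquePages {

    // Made-up row type standing in for one document of the sample data.
    static final class Row {
        final String page;
        final int impressions;
        final int position;

        Row(String page, int impressions, int position) {
            this.page = page;
            this.impressions = impressions;
            this.position = position;
        }

        @Override
        public String toString() {
            return page + " impressions=" + impressions + " position=" + position;
        }
    }

    // Groups rows by page ($group), then keeps only groups of size one ($match on count).
    static List<Row> uniqueByPage(List<Row> rows) {
        Map<String, List<Row>> groups = new LinkedHashMap<>();
        for (Row row : rows) {
            groups.computeIfAbsent(row.page, key -> new ArrayList<>()).add(row);
        }
        List<Row> unique = new ArrayList<>();
        for (List<Row> group : groups.values()) {
            if (group.size() == 1) {
                unique.add(group.get(0));
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("www.abc.com", 1, 144),
            new Row("www.abc.com", 1, 8),
            new Row("www.xyz.com", 7, 4),
            new Row("www.pqr.com", 1, 7),
            new Row("www.abc.com", 1, 19));
        // Prints the www.xyz.com row, then the www.pqr.com row.
        for (Row row : uniqueByPage(rows)) {
            System.out.println(row);
        }
    }
}
```

The sketch keeps first-seen order because it uses a LinkedHashMap; the aggregation makes no such promise unless you add a $sort stage.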
I'm having trouble creating an aggregation in Morphia; the documentation is really not clear. This is the original query:
db.collection('events').aggregate([
{
$match: {
"identifier": {
$in: [
userId1, userId2
]
},
$or: [
{
"info.name": "messageType",
"info.value": "Push",
"timestamp": {
$gte: new Date("2015-04-27T19:53:13.912Z"),
$lte: new Date("2015-08-27T19:53:13.912Z")
}
}
]
}
},
{
$unwind: "$info"
},
{
$match: {
$or: [
{
"info.name": "messageType",
"info.value": "Push"
}
]
}
}
]);
The only example in their docs uses out, and there's an example here, but I couldn't make it work.
I didn't even make it past the first match. Here's what I have:
ArrayList<String> ids = new ArrayList<>();
ids.add("199941");
ids.add("199951");
Query<Event> q = ads.getQueryFactory().createQuery(ads);
q.and(q.criteria("identifier").in(ids));
AggregationPipeline pipeline = ads.createAggregation(Event.class).match(q);
Iterator<Event> iterator = pipeline.aggregate(Event.class);
Some help or guidance on how to start with the query, or on how it works, would be great.
You need to create the query for the match() pipeline by breaking your code down into manageable pieces that are easy to follow. So let's start with the query to match the identifier field; you have done great so far. We then need to combine it with the $or part of the query.
Carrying on from where you left off, create the full query as:
Query<Event> q = ads.getQueryFactory().createQuery(ads);
// start and end are the java.util.Date bounds of the timestamp range
Criteria[] arrayA = {
q.criteria("info.name").equal("messageType"),
q.criteria("info.value").equal("Push"),
q.criteria("timestamp").greaterThanOrEq(start),
q.criteria("timestamp").lessThanOrEq(end)
};
Criteria[] arrayB = {
q.criteria("info.name").equal("messageType"),
q.criteria("info.value").equal("Push")
};
q.and(
q.criteria("identifier").in(ids),
q.or(arrayA)
);
Query<Event> query = ads.getQueryFactory().createQuery(ads);
query.or(arrayB);
AggregationPipeline pipeline = ads.createAggregation(Event.class)
.match(q)
.unwind("info")
.match(query);
Iterator<Event> iterator = pipeline.aggregate(Event.class);
The above is untested, but it should get you closer; adjust it where necessary. For reference, the following SO questions may give you some pointers:
Complex AND-OR query in Morphia
Morphia query with or operator
and of course the AggregationTest.java Github page
How can I sort a MongoDB collection by a given field, case-insensitively? By default, I get A-Z before a-z.
Update:
As of now MongoDB has case-insensitive indexes (via collation):
Users.find({})
.collation({locale: "en" })
.sort({name: 1})
.exec()
.then(...)
shell:
db.getCollection('users')
.find({})
.collation({'locale':'en'})
.sort({'firstName':1})
Update: This answer is out of date; 3.4 will have case-insensitive indexes. See the JIRA ticket for more information: https://jira.mongodb.org/browse/SERVER-90
Unfortunately MongoDB does not yet have case insensitive indexes: https://jira.mongodb.org/browse/SERVER-90 and the task has been pushed back.
This means the only way to sort case insensitive currently is to actually create a specific "lower cased" field, copying the value (lower cased of course) of the sort field in question and sorting on that instead.
Sorting does work like that in MongoDB but you can do this on the fly with aggregate:
Take the following data:
{ "field" : "BBB" }
{ "field" : "aaa" }
{ "field" : "AAA" }
So with the following statement:
db.collection.aggregate([
{ "$project": {
"field": 1,
"insensitive": { "$toLower": "$field" }
}},
{ "$sort": { "insensitive": 1 } }
])
Would produce results like:
{
"field" : "aaa",
"insensitive" : "aaa"
},
{
"field" : "AAA",
"insensitive" : "aaa"
},
{
"field" : "BBB",
"insensitive" : "bbb"
}
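In the output above, values that convert to the same key ("aaa" and "AAA") appear in insertion order, but MongoDB does not guarantee the relative order of documents with equal sort keys. Add a tie-breaker field such as _id to the sort if you need a deterministic order.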
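For comparison, the same "sort on a lower-cased copy" idea can be sketched in plain Java on an in-memory list. This is only an illustration of the technique, not driver code; LowerCaseKeySort and sortIgnoringCase are made-up names. Unlike a database sort, Java's List.sort is documented as stable, so equal keys keep their input order here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the MongoDB driver.
class LowerCaseKeySort {

    // Sorts on a lower-cased key while leaving the original values untouched,
    // like projecting an "insensitive" field and sorting on it.
    static List<String> sortIgnoringCase(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        // List.sort is stable: values with the same key keep their input order.
        sorted.sort(Comparator.comparing((String value) -> value.toLowerCase(Locale.ROOT)));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortIgnoringCase(Arrays.asList("BBB", "aaa", "AAA")));
        // prints [aaa, AAA, BBB]
    }
}
```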
This has been an open issue on the MongoDB JIRA for quite a long time, but it is solved now. Take a look at the release notes for detailed documentation. You should use collation.
User.find()
.collation({locale: "en" }) //or whatever collation you want
.sort({name:1})
.exec(function(err, users) {
// use your case insensitive sorted results
});
Adding the code .collation({'locale':'en'}) helped to solve my issue.
As of now (mongodb 4), you can do the following:
mongo shell:
db.getCollection('users')
.find({})
.collation({'locale':'en'})
.sort({'firstName':1});
mongoose:
Users.find({})
.collation({locale: "en" })
.sort({name: 1})
.exec()
.then(...)
Here are the languages and locales supported by MongoDB.
In Mongoose:
Customer.find()
.collation({locale: "en" })
.sort({company: 1})
Here it is in Java. I mixed no-args and first key-val variants of BasicDBObject just for variety
DBCollection coll = db.getCollection("foo");
List<DBObject> pipe = new ArrayList<DBObject>();
DBObject prjflds = new BasicDBObject();
prjflds.put("field", 1);
prjflds.put("insensitive", new BasicDBObject("$toLower", "$field"));
DBObject project = new BasicDBObject();
project.put("$project", prjflds);
pipe.add(project);
DBObject sort = new BasicDBObject();
sort.put("$sort", new BasicDBObject("insensitive", 1));
pipe.add(sort);
AggregationOutput agg = coll.aggregate(pipe);
for (DBObject result : agg.results()) {
System.out.println(result);
}
If you want to sort and return all data in a document, you can add document: "$$ROOT"
db.collection.aggregate([
{
$project: {
field: 1,
insensitive: { $toLower: "$field" },
document: "$$ROOT"
}
},
{ $sort: { insensitive: 1 } }
]).toArray()
I tried all of the answers above.
Consolidating the results:
Answer-1:
db.collection.aggregate([
{ "$project": {
"field": 1,
"insensitive": { "$toLower": "$field" }
}},
{ "$sort": { "insensitive": 1 } } ])
The aggregate query lower-cases the field for every document at query time, so performance is poor on large data sets.
Answer-2:
db.collection.find({}).collation({locale: "en"}).sort({"name":1})
By default MongoDB compares strings with a simple binary comparison (so "Z" sorts before "a"); the collation overrides this with language-specific rules.
It is fast compared to the query above.
See the official documentation to customize the rules:
https://docs.mongodb.com/manual/reference/collation/
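To see why binary order and locale-aware order differ, here is a rough in-memory analogy in plain Java. It uses java.text.Collator from the JDK rather than MongoDB's collation, so treat it as an illustration of the idea only; CollationOrderDemo and its method names are invented.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; java.text.Collator is the JDK's collator, not MongoDB's.
class CollationOrderDemo {

    // Natural String order compares char values, so every uppercase ASCII
    // letter sorts before every lowercase one.
    static List<String> binaryOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.<String>naturalOrder());
        return sorted;
    }

    // Locale-aware order compares letters first and treats case as a minor difference.
    static List<String> englishOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Collator.getInstance(Locale.ENGLISH));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("apple", "Banana", "cherry");
        System.out.println(binaryOrder(names));  // prints [Banana, apple, cherry]
        System.out.println(englishOrder(names)); // prints [apple, Banana, cherry]
    }
}
```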
We solved this problem with the .sort function of JavaScript arrays.
Here is the code:
function foo() {
let results = collections.find({
_id: _id
}, {
fields: {
'username': 1,
}
}).fetch();
results.sort((a, b)=>{
var nameA = a.username.toUpperCase();
var nameB = b.username.toUpperCase();
if (nameA < nameB) {
return -1;
}
if (nameA > nameB) {
return 1;
}
return 0;
});
return results;
}
My MongoDB collection contains the following documents:
{
"_id" : ObjectId("52d43cd29b85346a4aa6fe17"),
"windowsServer" : [
{
"topProcess" : [ ]
}]
},
{
"_id" : ObjectId("52d43cd29b85346a4aa6fe18"),
"windowsServer" : [
{
"topProcess" : [ {pid:1,name:"wininit"}]
}]
}
Now in my Java code I want only the documents whose topProcess contains some data; in the case above, that is only the second document. For this I wrote my Java code as below:
BasicDBObject criteria = new BasicDBObject();
BasicDBObject projections = new BasicDBObject();
criteria.put("windowsServer.topProcess", new BasicDBObject("$ne", "[]"));
projections.put("windowsServer.topProcess",1);
DBCursor cur = coll.find(criteria,projections);
while(cur.hasNext() && !isStopped()) {
String json = cur.next().toString();
}
When I execute the above code and print the JSON string, it contains the topProcess of both documents. Does anyone know how I can get only the second document's topProcess?
Try this one (and translate it to your java driver):
"windowsServer.topProcess": {$not: {$size: 0} }
In your code, the only mistake is in the following line.
criteria.put("windowsServer.topProcess", new BasicDBObject("$ne", "[]"));
You are trying to test for an empty array by passing the brackets as a String. Use BasicDBList() for an empty array instead. Replace the line above with the following and it should work.
criteria.put("windowsServer.topProcess", new BasicDBObject("$ne", new BasicDBList()));