MongoDB find with regex behaves differently from Java - java

Regex find with MongoDB 2.4.6 is not behaving the same way as the Java Pattern class does. Can anyone explain why?
Inserting data in MongoDB:
db.Project.insert({ "_id" : "e0b57d9e-744c-471e-ae95-22a389d2988d", "name" : "Project.20131106101344433" });
Finding all Projects:
db.Project.find()
{
"_id" : "e0b57d9e-744c-471e-ae95-22a389d2988d",
"name" : "Project.20131106101344433"
}
Finding all Projects whose name is "t":
db.Project.find({"name" : /t/})
{
"_id" : "e0b57d9e-744c-471e-ae95-22a389d2988d",
"name" : "Project.20131106101344433"
}
Checking that sole Project name does not match regex "t":
@Test
public void regex() {
assertTrue(!Pattern.matches("t", "Project.20131106101344433"));
}
As you see, the regex db.Project.find returns a Project whose name is not "t", but does contain "t". What am I missing?
Thanks!

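With db.Project.find({"name" : /t/}) you are not looking for a document whose name is t; you are looking for every document whose name contains t. MongoDB regular expressions are PCRE patterns and, unless anchored, match anywhere in the string, whereas Java's Pattern.matches only succeeds when the whole input matches. You can read about PCRE here and test what you are doing here.
To find an exact match, use {"name" : 't'}.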
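The difference can be reproduced in plain Java. The following is a minimal sketch using only java.util.regex; it contrasts Pattern.matches (whole-string match) with Matcher.find (substring search, which is how an unanchored regex such as /t/ behaves). The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class RegexContainsDemo {
    // Whole-string match: the regex must cover the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Substring search: true if the regex matches anywhere in the input,
    // which is how an unanchored regex such as /t/ is evaluated.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String name = "Project.20131106101344433";
        System.out.println(matchesWhole("t", name));    // false
        System.out.println(containsMatch("t", name));   // true
        System.out.println(containsMatch("^t$", name)); // false
    }
}
```

Anchoring the pattern (^t$) makes the substring search behave like the whole-string match, which is why an anchored regex or a plain string equality query returns no document here.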

Related

mongodb java driver pullByFilter

I have document schema such as
{
"_id" : 18,
"name" : "Verdell Sowinski",
"scores" : [
{
"type" : "exam",
"score" : 62.12870233109035
},
{
"type" : "quiz",
"score" : 84.74586220889356
},
{
"type" : "homework",
"score" : 81.58947824932574
},
{
"type" : "homework",
"score" : 69.09840625499065
}
]
}
I have a solution using pull that copes with removing a single element at a time, but I want a general solution that copes with an irregular schema, where the array has between one and many elements, and removes all elements that match a condition.
I'm using mongodb driver 3.2.2 and saw this pullByFilter which sounded good
Creates an update that removes from an array all elements that match the given filter.
I tried this
Bson filter = and(eq("type", "homework"), lt("score", highest));
Bson u = Updates.pullByFilter(filter);
UpdateResult ur = collection.updateOne(studentDoc, u);
Unsurprisingly, this did not have any effect, since I wasn't specifying the array scores.
I get an error
The positional operator did not find the match needed from the query. Unexpanded update: scores.$.type
when I change the filter to be
Bson filter = and(eq("scores.$.type", "homework"), lt("scores.$.score", highest));
Is there a one step solution to this problem?
There seems very little info on this particular method I can find. This question may relate to How to Update Multiple Array Elements in mongodb
After some more "thinking" (and a little trial and error), I found the correct Filters method to wrap my basic filter. I think I was focusing on array operators too much.
I'll not post it here in case of flaming.
Clue: think "matches..." (as in regex pattern matching) when dealing with Filters helper methods ;)

Tell if a BasicMongoDBObject is valid from the .toString()?

I'd like to confirm that a parser I wrote is working correctly. It takes a JavaScript mongodb command that could be run from the terminal and converts it to a Java object for the MongoDB/Java drivers.
Is the following .toString() result valid?
{ "NumShares " : 1 , "attr4 " : 1 , "symbol" : { "$regex" : "ESLR%"}}
This was converted from the following JavaScript
db.STOCK.find({ "symbol": "ESLR%" }, { "NumShares" : 1, "attr4" : 1 })
And of course, the data as it rests in the collections
{ "_id" : { "$oid" : "538c99e41f12e5a479269ed1"} , "symbol" : "ESLR" , "NumShares" : 3471.0}
Thanks for all your help
You've combined the query document and the projection document from that find() call into one document. That's probably not what you want. But those documents are just JSON, so you could use any parser to convert them. There are a few gotchas you'd have to deal with around ObjectIDs, dates, DBRefs, and particularly regular expressions, but those can be managed without too much trouble by escaping/quoting them before parsing.
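One regex gotcha is visible in the output above: % is a SQL LIKE wildcard, not a regex metacharacter, so the regex ESLR% only matches strings that contain the literal text ESLR% and would not match the stored symbol ESLR. If the parser is meant to treat % as "any run of characters", a conversion along these lines could be applied before building the $regex value. This is only a sketch under that assumption; LikeToRegex and its methods are hypothetical names and not part of the MongoDB driver.

```java
import java.util.regex.Pattern;

class LikeToRegex {
    // Converts a LIKE-style pattern, where % means "any run of characters",
    // into an anchored regex. All other characters are quoted as literals.
    static String toRegex(String like) {
        StringBuilder regex = new StringBuilder("^");
        StringBuilder literal = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return regex.append('$').toString();
    }

    // Convenience check: does the value satisfy the LIKE-style pattern?
    static boolean likeMatches(String like, String value) {
        return Pattern.compile(toRegex(like)).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(likeMatches("ESLR%", "ESLR"));                    // true
        System.out.println(Pattern.compile("ESLR%").matcher("ESLR").find()); // false
    }
}
```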

Mongo and Java: Create indexes for aggregation framework

Situation: I have a collection with a huge number of documents produced by map-reduce (aggregation). Documents in the collection look like this:
/* 0 */
{
"_id" : {
"appId" : ObjectId("1"),
"timestamp" : ISODate("2014-04-12T00:00:00.000Z"),
"name" : "GameApp",
"user" : "test@mail.com",
"type" : "game"
},
"value" : {
"count" : 2
}
}
/* 1 */
{
"_id" : {
"appId" : ObjectId("2"),
"timestamp" : ISODate("2014-04-29T00:00:00.000Z"),
"name" : "ScannerApp",
"user" : "newUser@company.com",
"type" : "game"
},
"value" : {
"count" : 5
}
}
...
And I am searching inside this collection with the aggregation framework:
db.myCollection.aggregate([match, project, group, sort, skip, limit]); // the aggregation can return results on a daily or monthly basis depending on the user's search criteria, with pagination etc.
Possible search criteria:
1. {appId, timestamp, name, user, type}
2. {appId, timestamp}
3. {name, user}
I'm getting the correct result, exactly what I need. But from an optimisation point of view I have doubts about indexing.
Questions:
Is it possible to create indexes for such a collection?
How can I create indexes for such objects with a complex _id field?
How can I do the equivalent of db.collection.find().explain() to verify which index is used?
And is it a good idea to index such a collection, or is it just my performance paranoia?
Answer summary:
MongoDB creates an index on the _id field automatically, but that index does not help queries on the individual sub-fields of a complex _id like the one in the example. For an _id such as {name: "", timestamp: ""} you must create an index like *.ensureIndex({"_id.name": 1, "_id.timestamp": 1}); only then can queries on those _id sub-fields use an index.
To see how your indexes are used by the aggregation framework you cannot use db.myCollection.aggregate().explain(); the proper way is:
db.runCommand({
aggregate: "collection_name",
pipeline: [match, proj, group, sort, skip, limit],
explain: true
})
My testing on a local computer shows that such indexing seems to be a good idea, but this requires more testing with big collections.
First, indexes 1 and 3 are probably worth investigating. As for explain, you can pass explain as an option to your pipeline. You can find docs here and an example here.

Ensuring index for nested repeating entities

I need to enforce unique constraint on a nested document, for example:
urlEntities: [
{ "url" : "http://t.co/ujBNNRWb0y" , "display_url" : "bit.ly/11JyiVp" , "expanded_url" :
"http://bit.ly/11JyiVp"} ,
{ "url" : "http://t.co/DeL6RiP8KR" , "display_url" : "ow.ly/i/2HC9x" ,
"expanded_url" : "http://ow.ly/i/2HC9x"}
]
url, display_url, and expanded_url need to be unique. How do I issue the ensureIndex command for this condition in MongoDB?
Also, is it a good design to have nested documents like this, or should I move them to a separate collection and refer to them from inside urlEntities? I'm new to MongoDB, so any best-practice suggestions would be very helpful.
Full Scenario:
Say I have a document like the one below in a db that holds millions of documents:
{ "_id" : { "$oid" : "51f72afa3893686e0c406e19"} , "user" : "test" , "urlEntities" : [ { "url" : "http://t.co/64HBcYmn9g" , "display_url" : "ow.ly/nqlkP" , "expanded_url" : "http://ow.ly/nqlkP"}] , "count" : 0}
When I get another document with a similar urlEntities object, I need to update the user and count fields only. At first I thought of enforcing a unique constraint on the urlEntities fields, handling the exception, and then doing an update; the alternative, checking whether each entry exists before inserting, would have a significant impact on performance. So, how can I enforce uniqueness in urlEntities? I tried
{"urlEntities.display_url":1,"urlEntities.expanded_url":1},{unique:true}
But still I'm able to insert the same document twice without exceptions.
A unique index only enforces uniqueness across documents, not among the array elements within a single document. You cannot prevent the following (simplified from your example):
db.collection.ensureIndex( { 'urlEntities.url' : 1 }, { unique: true } );
db.collection.insert( {
_id: 42,
urlEntities: [
{
"url" : "http://t.co/ujBNNRWb0y"
},
{
"url" : "http://t.co/ujBNNRWb0y"
}
]
});
Similarly, you will have the same problem with a compound unique key for nested documents.
What you can do is the following:
db.collection.insert( {
_id: 43,
title: "This is an example",
} );
db.collection.update(
{ _id: 43 },
{
'$addToSet': {
urlEntities: {
"url" : "http://t.co/ujBNNRWb0y" ,
"display_url" : "bit.ly/11JyiVp" ,
"expanded_url" : "http://bit.ly/11JyiVp"
}
}
}
);
Now you have the document with _id 43 with one urlEntities document. If you run the same update query again, it will not add a new array element, because the full combination of url, display_url and expanded_url already exists.
Also, have a look at the $addToSet update operator's examples: http://docs.mongodb.org/manual/reference/operator/addToSet/
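To make the whole-element comparison concrete, here is a small in-memory Java sketch of the rule described above: an element is appended only when no equal element is already in the array. It does not talk to MongoDB; AddToSetDemo and its methods are made-up names, and note one simplification: Java map equality ignores field order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AddToSetDemo {
    // Builds one urlEntities element from its three fields.
    static Map<String, String> entity(String url, String displayUrl, String expandedUrl) {
        Map<String, String> e = new LinkedHashMap<>();
        e.put("url", url);
        e.put("display_url", displayUrl);
        e.put("expanded_url", expandedUrl);
        return e;
    }

    // Appends the element only if an equal element is not already present.
    // Returns true when the array changed.
    static boolean addToSet(List<Map<String, String>> array, Map<String, String> element) {
        if (array.contains(element)) {
            return false;
        }
        array.add(element);
        return true;
    }

    public static void main(String[] args) {
        List<Map<String, String>> urlEntities = new ArrayList<>();
        String url = "http://t.co/ujBNNRWb0y";
        // First insert changes the array; the identical element is then skipped.
        System.out.println(addToSet(urlEntities, entity(url, "bit.ly/11JyiVp", "http://bit.ly/11JyiVp"))); // true
        System.out.println(addToSet(urlEntities, entity(url, "bit.ly/11JyiVp", "http://bit.ly/11JyiVp"))); // false
        // Same url but a different display_url counts as a different element.
        System.out.println(addToSet(urlEntities, entity(url, "ow.ly/i/2HC9x", "http://bit.ly/11JyiVp")));  // true
        System.out.println(urlEntities.size()); // 2
    }
}
```

The last call shows the limit of this approach: it prevents exact duplicate elements but does not make each individual field unique.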
For indexes on nested documents, read this.
Regarding the second part (nested document best practices): it really depends on your business logic and queries. If those nested documents don't make sense as first-class entities, meaning you won't be searching for them directly but only in the context of their parent document, then having them nested makes sense. Otherwise you should consider extracting them.
I think there isn't an absolute answer to your question. Read the chapter about indexing; it helped me a lot.

Modify nest document's value in MongoDB for Java

A very quick question, how am I going to do this below:
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"title" : "A Blog Post",
"content" : "...",
"author" : {
"name" : "joe",
"email" : "joe@example.com"
}
}
I saw the answer in Javascript is like:
> db.blog.posts.update({"author.name" : "joe"}, {"$set" : {"author.name" : "joe schmoe"}})
But how am I going to do that in Java?
If I have a very deeply nested value that has to be changed, am I supposed to do it this way, e.g. "person.abc.xyz.name.address"?
Using dot notation to access nested documents will work perfectly well in the Java Driver. Take a look at this StackOverflow answer:
MongoDB nested documents searching
For the Java Driver, the basic idea is to replace the Javascript objects with instances of BasicDBObject.
Here's another good reference for updating:
MongoDb's $set equivalent in its java Driver
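The driver call itself needs a running server, so as a driver-free illustration of what a dotted path such as "author.name" addresses, the sketch below walks nested java.util.Map objects and sets the value at the end of the path. It only mimics the addressing rule on plain maps and is not the driver's implementation; DotPathDemo, setPath and getPath are made-up names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DotPathDemo {
    // Sets the value addressed by a dotted path such as "author.name",
    // creating intermediate maps when a level is missing.
    @SuppressWarnings("unchecked")
    static void setPath(Map<String, Object> doc, String path, Object value) {
        String[] keys = path.split("\\.");
        Map<String, Object> current = doc;
        for (int i = 0; i < keys.length - 1; i++) {
            Object next = current.get(keys[i]);
            if (next == null) {
                next = new LinkedHashMap<String, Object>();
                current.put(keys[i], next);
            } else if (!(next instanceof Map)) {
                throw new IllegalArgumentException(keys[i] + " is not an embedded document");
            }
            current = (Map<String, Object>) next;
        }
        current.put(keys[keys.length - 1], value);
    }

    // Reads the value at a dotted path, or returns null if any level is missing.
    static Object getPath(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "joe");
        Map<String, Object> post = new LinkedHashMap<>();
        post.put("title", "A Blog Post");
        post.put("author", author);

        setPath(post, "author.name", "joe schmoe");
        System.out.println(getPath(post, "author.name")); // joe schmoe
    }
}
```

The same path syntax works at any depth, so a key like "person.abc.xyz.name.address" is addressed one level per dot.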
