MongoDB + Morphia - full text search using AND instead of OR - java

I've set up full text search in MongoDB and it's working quite well (Mongo 2.6.5).
However, it does an OR instead of an AND.
1) Is it possible to make the query an AND query, while still getting all the benefits of full text search (stemming etc.)?
2) If so, is it possible to add this option via the Morphia wrapper library?
EDIT
I see that the full text search includes a 'score' for each document returned. Is it possible to return only docs with a certain score or above? Is there some score that would represent a 'fuzzy' AND query, that is, one where usually (but not absolutely always) all tokens are in the document? If so, this would solve the problem as well.
Naturally if possible to do this via Morphia that would be super helpful. But I can use the native java driver as well.
Any pointers in the correct direction, much appreciated.
EDIT
Code looks like this, I'm using Morphia 1.0.1:
Datastore ds = Dao.instance().getDatabase();
Query<Product> q = ds.createQuery(Product.class).search("grey vests");
List<Product> prods = q.asList();
Printing the query gives:
{ "$text" : { "$search" : "grey vests"}}
Note: I am able to take an intersection of multiple result sets to create an AND query. However, this is very slow, since something like "grey" returns a massive result set that takes a long time to feed back.
EDIT
I've tried chaining the search() calls, passing a single 'token' to each call, but I get a runtime error. The code becomes:
q.search("grey").search("vests");
The query I get is (which seems like it's doing the right thing) ...
{ "$and" : [ { "$text" : { "$search" : "grey"}} , { "$text" : { "$search" : "vests"}}]}
The error is:
com.mongodb.MongoQueryException: Query failed with error code 17287 and error message 'Can't canonicalize query: BadValue Too many text expressions' on server ...
at com.mongodb.connection.ProtocolHelper.getQueryFailureException(ProtocolHelper.java:93)
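
One workaround that is commonly suggested for this: a query may contain only one $text expression (hence the "Too many text expressions" error), but inside that single $search string each term can be wrapped in escaped double quotes, which turns it into a phrase. As far as I understand it, a document must then contain every quoted phrase, which gives AND behaviour. The sketch below only builds that search string with the standard library; `TextSearchTerms` and `toAndQuery` are hypothetical names, not part of Morphia or the MongoDB driver. Two assumptions to verify on your server version: that multiple phrases are ANDed, and how much stemming still applies, since phrases are matched more literally than plain terms (so "vests" may no longer match "vest").

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Morphia or the MongoDB Java driver).
final class TextSearchTerms {

    private TextSearchTerms() {}

    // Wraps every whitespace-separated term in double quotes, so that
    // the input  grey vests  becomes  "grey" "vests".
    // Quotes already present in the input are dropped to keep the result well formed.
    static String toAndQuery(String userInput) {
        return Arrays.stream(userInput.trim().split("\\s+"))
                .map(term -> term.replace("\"", ""))
                .filter(term -> !term.isEmpty())
                .map(term -> "\"" + term + "\"")
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(toAndQuery("grey vests"));
    }
}
```

With the code from the question, the result would be passed to the same call as before, for example `ds.createQuery(Product.class).search(TextSearchTerms.toAndQuery("grey vests"))`, which should print as { "$text" : { "$search" : "\"grey\" \"vests\""}}.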

Related

Slow queries for MongoDB regular expression text in Java

When I search text with a regular expression in MongoDB, the query is slow at first, and I would like to know the cause.
The slow query occurs only on the Java application server.
When the same query is run in the MongoDB shell, it is very fast (the index is used).
The query returns five documents.
The collection contains 450,000 documents in total.
The query as issued from each environment is shown below.
=====JAVA Process=====
(Very Slow 5,518ms)
public List<Contents> findContentList(int rowCnt, long rowNo, String searchContent){
Query query = new Query();
query.addCriteria((Criteria.where(DictionaryKey.content).regex("^" + searchContent)));
if (rowNo > 0) query.addCriteria(Criteria.where(DictionaryKey.contentSeq).gt(rowNo));
query.with(new Sort(Sort.Direction.ASC, DictionaryKey.contentSeq));
query.limit(rowCnt);
return this.mongoTemplate.find(query, Contents.class, Constant.CollectionName.Contents);
}
Java monitoring tool output:
Query : Query: { "content" : { "$regex" : "^abcd"}}, Sort: { "contentSeq" : 1}
Collection Name : contents
MongTemplate#find() [5,518ms] -- org.springframework.data.mongodb.core.mongTemplate.find()Ljava/util/List;
=====Mongodb Shell======
Mongodb query (Very Fast, index works well)
db.contents.find({content:{"$regex" : "^abcd"}}).sort({"contentSeq" : 1});
'contents' collection index is content_1_contentSeq_1
Please help me.
I found the reason.
The cause is that some queries were using the wrong index.
The solution was to force the use of the correct index by giving the query a hint.

How to get error message from the scala/java MongoDB api

I'm using Casbah (mongodb scala library). I have an insert that doesn't work.
val builder = MongoDBObject.newBuilder
builder += "_id" -> token.uuid
builder += "email" -> token.email
builder += "creationTime" -> token.creationTime
builder += "expirationTime" -> token.expirationTime
builder += "isSignUp" -> token.isSignUp
val writeResult = mycollection += (builder.result)
If I change this for something simpler (like, a simple {"hello": "world"} document), the insert is done. So I know there's something that doesn't work with this particular insert. However, I find no way to know why. I'd like to get some feedback from Mongo or from Casbah.
However the WriteResult class, which apparently comes directly from the Java MongoDB driver, seems very opaque: http://api.mongodb.com/java/3.0/com/mongodb/WriteResult.html
How can I get some info about why an insert is failing? I'm not asking about this particular insert. Just, in general, how can I get info about the error that caused an insert to fail?
Thanks for your help.
Casbah is a Scala wrapper over the Java MongoDB driver.
mycollection += (builder.result)
is translated into
mycollection.save(builder.result)
If the operation fails, it throws an exception, as described here.
The WriteResult contains information about the write if no error occurred.
I would check:
getN and isUpdateOfExisting values in WriteResult because save is doing either update or insert (read more here).
wasAcknowledged value in WriteResult to make sure you get the exception and you don't have the WriteConcern set to UNACKNOWLEDGED.

How to use WHERE NOT clause in Firebase? [duplicate]

I am using the Firebase database with a JSON structure to manage users' comments.
{
"post-comments" : {
"post-id-1" : {
"comment-id-11" : {
"author" : "user1",
"text" : "Hello world",
"uid" : "user-id-2"
},....
}
I would like to pull all the comments except the current user's.
In SQL this would translate into:
SELECT * FROM post-comments WHERE uid != "user-id-2"
I understand that the Firebase database does not offer a way to exclude nodes based on a value (i.e. user id != ...).
So is there an alternative way to tackle this, either by changing the database structure or by processing the data source once the data are loaded?
For the latter, I am using a FirebaseTableViewDataSource. Is there a way to filter the data after the query?
Thanks a lot
The first solution is to load the comments via .ChildAdded and ignore the ones with the current user_id
let commentsRef = self.myRootRef.childByAppendingPath("comments")
commentsRef.observeEventType(.ChildAdded, withBlock: { snapshot in
let uid = snapshot.value["uid"] as! String
if uid != current_uid {
//do stuff
}
})
You could expand on this and load everything by .Value and iterate over the children in code as well. Which method is better depends on how many nodes you are loading; .ChildAdded will use less memory.

Why are upserts so slow for MongoDB Java API?

Using Mongo Java Driver 2.13 and Mongo 3.0.
I am trying to move from Spring Data save() to MongoDB API's Bulk Writing since I am saving/updating about 100K objects. I am trying to write the Service/Repository layer code where I can pass in a Collection of my specific Objects and be able to either create new records or update existing records, or in other words upsert. When I do an insert the performance is very acceptable.
If I update the code to do upserts the performance is just way too slow. Am I doing something wrong in the following code sample (note it is scaled down to just the necessary logic, i.e. no error handling):
public void save(Collection<MyDomainObject> objects) {
BulkWriteOperation bulkWriter = dbCollection.initializeUnorderedBulkOperation();
for(MyDomainObject mdo : objects) {
DBObject dbObject = convert(mdo);
bulkWriter.find(new BasicDBObject("id",mdo.getId()))
.upsert().updateOne(new BasicDBObject("$set",dbObject));
}
bulkWriter.execute(writeConcern);
}
Note that I also tried replaceOne() instead of updateOne() with the same results.
I also noticed in the Mongo log that "nscannedObjects" keeps increasing while "nMatched", "nModified" and "upsert" are never larger than 1. Does this mean that it is table scanning for each record?
Am I using upsert the correct way? Any other suggestions?
Thanks to ry_donahue I figured out the issue.
It was not using the correct ID field, which is the index. In the conversion of the domain object to a DBObject there ended up being an "id" and an "_id" field.
I also changed updateOne() to replaceOne(). So now the code looks like this:
public void save(Collection<MyDomainObject> objects) {
BulkWriteOperation bulkWriter = dbCollection.initializeUnorderedBulkOperation();
for(MyDomainObject mdo : objects) {
DBObject dbObject = convert(mdo);
bulkWriter.find(new BasicDBObject("_id",new ObjectId(mdo.getId()))).upsert().replaceOne(dbObject);
}
bulkWriter.execute(writeConcern);
}
This now gives very good performance.

Unexpected MongoDB "OR" query behaviour

I'm testing out spring-data and its MongoDB support.
I have a question about the query creation when using or-queries. Consider the following:
Query query = new Query().or(new Query(where("receiverId").is(userId)), new Query(where("requesterId").is(userId)));
query.and(where("status").is(status));
This will result in the following MongoDB query:
{ "$or" : [ { "receiverId" : { "$oid" : "4d78696025d0d46b42d9c579"}} , { "requesterId" : { "$oid" : "4d78696025d0d46b42d9c579"}}] , "status" : "REQUESTED"}
This returns zero results while one is expected. Running this query in the mongo shell results in the following error:
error: { "$err" : "invalid operator: $oid", "code" : 10068 }
Modifying the query and running it in the mongo shell works fine:
{ "$or" : [ { "receiverId" : ObjectId("4d78696025d0d46b42d9c579")} , { "requesterId" : ObjectId("4d78696025d0d46b42d9c579")}] , "status" : "REQUESTED"}
Notice the use of ObjectId("...") instead of $oid.
Am I going about something the wrong way? Maybe setting up the query wrong?
Are you inspecting that query variable at runtime or is that what you are seeing in MongoDB's logs?
In the C# driver, if you inspect the query variable, you see $oid as well, but that is not the actual query that is sent to the server. At some point, the driver changes it to a valid MongoDB query.
If you are running on Linux, you may want to start up mongosniff, which will show you queries, updates and inserts in real time as they happen. If you are on Windows, you should start up mongod.exe with the -vvvv flag, which makes it log every query, update, insert, or command to the log file.
Then you can actually see the exact query that is being submitted.
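
If the goal is just to paste a query printed by the application into the shell, the { "$oid" : "..." } form has to be rewritten to ObjectId("...") first, as the question already found. Here is a minimal stdlib-only sketch of that rewrite. `ShellQueryFormatter` and `toShellSyntax` are hypothetical names, and the helper assumes the printed form looks like the one in the question (a 24-character hex string inside a one-key `$oid` object). It does not touch any other extended JSON types.

```java
import java.util.regex.Pattern;

// Hypothetical debugging helper; not part of Spring Data or any MongoDB driver.
final class ShellQueryFormatter {

    // Matches { "$oid" : "<24 hex chars>" } with flexible whitespace
    // and captures the hex string as group 1.
    private static final Pattern OID = Pattern.compile(
            "\\{\\s*\"\\$oid\"\\s*:\\s*\"([0-9a-fA-F]{24})\"\\s*\\}");

    private ShellQueryFormatter() {}

    // Replaces every $oid object with the shell's ObjectId("...") constructor.
    static String toShellSyntax(String printedQuery) {
        return OID.matcher(printedQuery).replaceAll("ObjectId(\"$1\")");
    }

    public static void main(String[] args) {
        String printed = "{ \"receiverId\" : { \"$oid\" : \"4d78696025d0d46b42d9c579\"}}";
        System.out.println(toShellSyntax(printed));
    }
}
```

This only changes how the query is displayed for manual testing; it says nothing about what the driver actually sends, which is what the logging suggestions above are for.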
