Slow queries for MongoDB regular expression text in Java

When I search by regular expression in MongoDB, the query is slow at first, and I would like to know the cause.
The slow query occurs only when it is issued from the Java application server.
When the same query is run in the MongoDB shell, it is very fast (the index is used).
The query returns five documents.
The collection contains 450,000 documents in total.
The details for each environment are below.
=====JAVA Process=====
(Very Slow 5,518ms)
public List<Contents> findContentList(int rowCnt, long rowNo, String searchContent) {
    Query query = new Query();
    query.addCriteria(Criteria.where(DictionaryKey.content).regex("^" + searchContent));
    if (rowNo > 0) query.addCriteria(Criteria.where(DictionaryKey.contentSeq).gt(rowNo));
    query.with(new Sort(Sort.Direction.ASC, DictionaryKey.contentSeq));
    query.limit(rowCnt);
    return this.mongoTemplate.find(query, Contents.class, Constant.CollectionName.Contents);
}
Java monitoring tool output
Query : Query: { "content" : { "$regex" : "^abcd"}}, Sort: { "contentSeq" : 1}
Collection Name : contents
MongoTemplate#find() [5,518ms] -- org.springframework.data.mongodb.core.MongoTemplate.find()Ljava/util/List;
=====MongoDB Shell=====
MongoDB query (Very Fast, index works well)
db.contents.find({content: {"$regex" : "^abcd"}}).sort({"contentSeq" : 1});
The index on the 'contents' collection is content_1_contentSeq_1.
Please help me.

I found the reason.
The cause was that some queries were using the wrong index.
The solution was to force the use of the correct index by giving a hint.
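A hint can name the index to use, and the index mentioned in the question, content_1_contentSeq_1, follows MongoDB's default naming scheme: each key and its direction joined with underscores. The following is a minimal sketch, using only the JDK, that derives that default name from an ordered key specification. The class IndexNames and the method defaultIndexName are hypothetical helpers for illustration, not part of any MongoDB or Spring API, and the sketch assumes the index was created without a custom name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class IndexNames {

    // Hypothetical helper: builds the default MongoDB index name
    // ("<field>_<direction>_<field>_<direction>...") from an ordered key spec.
    // Assumes the index was not given a custom name when it was created.
    static String defaultIndexName(Map<String, Object> keys) {
        StringJoiner name = new StringJoiner("_");
        for (Map.Entry<String, Object> entry : keys.entrySet()) {
            name.add(entry.getKey()).add(String.valueOf(entry.getValue()));
        }
        return name.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, which matters for compound indexes.
        Map<String, Object> keys = new LinkedHashMap<>();
        keys.put("content", 1);
        keys.put("contentSeq", 1);
        System.out.println(defaultIndexName(keys)); // prints content_1_contentSeq_1
    }
}
```

In the shell, the resulting name can be passed to hint() on the cursor, for example db.contents.find(...).sort(...).hint("content_1_contentSeq_1"). On the Java side, how the hint is attached depends on the driver and Spring Data MongoDB version in use, so check that version's documentation before relying on a particular call.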

Related

MongoDB Query Issues

I have a collection in mongodb which has many documents. Using Studio 3T I can see the document looks like below
{
"DialectType" : "ORACLE",
"DomainName" : "NewDomain"
}
There are many documents of this type, with the same keys but different values. I am using the code below to query them:
Query query = Query.query(Criteria.where("DialectType").is("ORACLE"));
mongoOperations.find(query, DialectTypeCollection.class, "my_collection_name");
The above query does not return records. I am not sure what is the issue. Any help is appreciated.

MongoDB - How to get the count for a find query

I cannot for the life of me find out how to get a count for a find query using the java driver in mongo db. Can someone please put me out of my misery?
I have the following:
MongoCursor<Document> findRes = collection.find().iterator();
But there is no count method that I can find anywhere.

public Long getTotalCount(String collectionName, Document filterDocument) {
MongoCollection collection = database.getCollection(collectionName);
return filterDocument != null ? collection.count(filterDocument) : collection.count();
}
Here filterDocument is an org.bson.Document containing the filter criteria, or null if you want the total count.
You may also use the more powerful Filters class, for example: collection.count(Filters.and(Filters.eq("field", "value"), /* second condition, and so on */));
To accept both Document and Filters as the parameter, change the signature to public Long getTotalCount(String collectionName, Bson filterDocument) {

long rows = db.getCollection(myCollection).count(new Document("_id", 10)) ;
This is in Java; myCollection is the collection name.

MongoDB has a built-in count() method that can be called on a cursor to find the number of documents returned.
I tried the following in the mongo shell and it worked well; it can easily be applied in Java or any other language too:
var findres = db.c.find()
findres.count() gave the output 29353

cursor.count() is what you're looking for I believe. Your find query returns a Cursor so you can just call count() on that.

MongoDB + Morphia - full text search using AND instead of OR

I've set up full text search in MongoDB and it's working quite well (Mongo 2.6.5).
However, it does an OR instead of an AND.
1) Is it possible to make the query an AND query, while still getting all the benefits of full text search (stemming etc.)?
2) If so, is it possible to add this option via the Morphia wrapper library?
EDIT
I see that the full text search includes a 'score' for each document returned. Is it possible to return only docs with a certain score or above? Is there some score that would represent a 'fuzzy' AND query, that is, one where usually all tokens are in the document but not absolutely always? If so, this would solve the problem as well.
Naturally if possible to do this via Morphia that would be super helpful. But I can use the native java driver as well.
Any pointers in the correct direction, much appreciated.
EDIT
Code looks like this, I'm using Morphia 1.0.1:
Datastore ds = Dao.instance().getDatabase();
Query<Product> q = ds.createQuery(Product.class).search("grey vests");
List<Product> prods = q.asList();
Printing the query gives:
{ "$text" : { "$search" : "grey vests"}}
Note: I am able to take the intersection of multiple result sets to create an AND query. However, this is very slow, since something like "grey" returns a massive result set that is slow to feed back.
EDIT
I've tried chaining the search() calls, adding a single token to each call, but I get a runtime error. The code becomes:
q.search("grey").search("vests");
The query I get is (which seems like it's doing the right thing) ...
{ "$and" : [ { "$text" : { "$search" : "grey"}} , { "$text" : { "$search" : "vests"}}]}
The error is:
com.mongodb.MongoQueryException: Query failed with error code 17287 and error message 'Can't canonicalize query: BadValue Too many text expressions' on server ...
at com.mongodb.connection.ProtocolHelper.getQueryFailureException(ProtocolHelper.java:93)

Why are upserts so slow for MongoDB Java API?

Using Mongo Java Driver 2.13 and Mongo 3.0.
I am trying to move from Spring Data save() to MongoDB API's Bulk Writing since I am saving/updating about 100K objects. I am trying to write the Service/Repository layer code where I can pass in a Collection of my specific Objects and be able to either create new records or update existing records, or in other words upsert. When I do an insert the performance is very acceptable.
If I update the code to do upserts the performance is just way too slow. Am I doing something wrong in the following code sample (note it is scaled down to just the necessary logic, i.e. no error handling):
public void save(Collection<MyDomainObject> objects) {
BulkWriteOperation bulkWriter = dbCollection.initializeUnorderedBulkOperation();
for(MyDomainObject mdo : objects) {
DBObject dbObject = convert(mdo);
bulkWriter.find(new BasicDBObject("id",mdo.getId()))
.upsert().updateOne(new BasicDBObject("$set",dbObject));
}
bulkWriter.execute(writeConcern);
}
Note that I also tried replaceOne() instead of updateOne() with the same results.
I also noticed in the Mongo log that "nscannedObjects" keeps increasing while "nMatched", "nModified" and "upsert" are never larger than 1. Does this mean that it is table scanning for each record?
Am I using upsert the correct way? Any other suggestions?

Thanks to ry_donahue I figured out the issue.
It was not using the correct ID field, which is the index. In the conversion of the domain object to a DBObject there ended up being an "id" and an "_id" field.
I also changed updateOne() to replaceOne(). So now the code looks like this:
public void save(Collection<MyDomainObject> objects) {
BulkWriteOperation bulkWriter = dbCollection.initializeUnorderedBulkOperation();
for(MyDomainObject mdo : objects) {
DBObject dbObject = convert(mdo);
bulkWriter.find(new BasicDBObject("_id",new ObjectId(mdo.getId()))).upsert().replaceOne(dbObject);
}
bulkWriter.execute(writeConcern);
}
This now gives very good performance.

Determine which parameter failed in a Lucene BooleanQuery?

I need to determine which part of a Lucene BooleanQuery failed if the entire query returns no results.
I'm using a BooleanQuery made up of 4 NumericRangeQueries and a PhraseQuery. Each is added to the query with Occur.MUST.
If I don't get any results for a query, is there a way to tell which part of the query failed to match anything? Do I need to run queries individually and compare results to get the one that failed?
Edit - Added PhraseQuery code.
if( row.getPropertykey_tx() != null && !row.getPropertykey_tx().trim().isEmpty()){
PhraseQuery pQuery = new PhraseQuery();
String[] words = row.getPropertykey_tx().trim().split(" ");
for( String word : words ){
pQuery.add(new Term(TitleRecordColumns.SA_SITE_ADDR.toString(), word));
}
pQuery.setSlop(2);
topBQuery.add(pQuery, BooleanClause.Occur.MUST);
}

Running individual parts of the query is probably the simplest approach, to my mind.
Another tool available is getting an Explanation. You can call IndexSearcher.explain to get an Explanation of the scoring for the query against a particular document. If you can provide the docid of a document you believe should match the query, you can analyze Explanation.toString (or toHtml, if you prefer) to determine which subqueries are not matching against it.
If you want to automatically keep a record of which clause of a BooleanQuery doesn't produce results, I believe you will need to run each query independently. If you no longer have access to the subqueries used to create it, you can get the clauses from the BooleanQuery instead:
void findTroublesomeQuery(BooleanQuery query) {
    for (BooleanClause clause : query.clauses()) {
        Query subquery = clause.getQuery();
        // searchHoweverYouDo is a placeholder for your own search call.
        TopDocs docs = searchHoweverYouDo(subquery);
        if (docs.totalHits == 0) {
            // If you want to dig down recursively...
            if (subquery instanceof BooleanQuery)
                findTroublesomeQuery((BooleanQuery) subquery);
            else
                log(subquery); // Or do whatever you want to keep track of it.
        }
    }
}
DisjunctionMaxQuery is a commonly used query that wraps multiple subqueries as well, so might be worth considering for this sort of approach.
