I am using Java and MongoDB. I have the following JSON documents:
{ "_id" : { "$oid" : "524a27a318c533dc95edafe1"} , "RoomNumber" : 516 , "RoomType" : "presidential" , "Reserved" : true , "RegularRate" : 400.0 , "Discount" : [ 0.85 , 0.75 , 1.0 , 1.0] , "DiscountedRate" : 0}
{ "_id" : { "$oid" : "524a27a318c533dc95edafe2"} , "RoomNumber" : 602 , "RoomType" : "presidential" , "Reserved" : false , "RegularRate" : 500.0 , "Discount" : [ 1 , 0.75 , 1.0 , 1.0] , "DiscountedRate" : 0}
{ "_id" : { "$oid" : "524a27a318c533dc95edafe3"} , "RoomNumber" : 1315 , "RoomType" : "Single" , "Reserved" : true , "RegularRate" : 100.0 , "Discount" : [ 1 , 1 , 1.0 , 1.0] , "DiscountedRate" : 0}
The documents in the collection have different room numbers. If I know a room number, how would I get the document with that room number and then read all the other values in that document?
For example, if I have 602, I want to be able to get RoomType: presidential, Reserved: false, RegularRate: 500.
Thanks
The shell query for that would be:
db.rooms.findOne({RoomNumber: 602}, {RoomType: 1, Reserved: 1, RegularRate:1, _id: 0});
And using the native mongo driver:
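// needs com.mongodb.BasicDBObject and com.mongodb.DBObject
// match the room by its number
BasicDBObject query = new BasicDBObject("RoomNumber", 602);
// return only the requested fields and suppress _id
BasicDBObject projection = new BasicDBObject("RoomType", 1)
    .append("Reserved", 1)
    .append("RegularRate", 1)
    .append("_id", 0);
DBObject result = collection.findOne(query, projection);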
I have a MongoDB database with the following stats:
{
"ns" : "pourmoi.featurecount",
"count" : 12152142,
"size" : 1361391264,
"avgObjSize" : 112,
"numExtents" : 19,
"storageSize" : 1580150784,
"lastExtentSize" : 415174656,
"paddingFactor" : 1,
"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0.
It remains hard coded to 1.0 for compatibility only.",
"userFlags" : 1,
"capped" : false,
"nindexes" : 2,
"totalIndexSize" : 1165210816,
"indexSizes" : {
"_id_" : 690111632,
"feature_1" : 475099184
},
"ok" : 1
}
and a java program that does a find and returns around 50 results.
This is the query
db.featurecount.find(
{ "$or" : [ { "feature" : "hello"}, { "feature" : "how"},
{ "feature" : "are"} , { "feature" : "you"} ]})
.sort({count: -1}).limit(20)
This query takes at least 30 seconds. Is it possible to make it run faster?
PS: The MongoDB server is on localhost.
I am trying to sort a cursor by two fields, "start" and "end". Both of them have indexes.
This is the code attempting to sort.
DBCursor cursor = store.colConcepts.find(q);
cursor.addOption(Bytes.QUERYOPTION_NOTIMEOUT);
BasicDBObject sortObj = new BasicDBObject( "start", filter.isEventTimeSortDirAscending() ? 1 : -1 ).append( "end", filter.isEventTimeSortDirAscending() ? 1 : -1 );
cursor = cursor.sort( sortObj );
In the code above, the query q is { "tags" : { "$all" : [ "Person"]}}
The following are the indexes on the collection store.colConcepts:
colConcepts.ensureIndex(new BasicDBObject("tags", 1));
colConcepts.ensureIndex(new BasicDBObject("roles.concept",1));
colConcepts.ensureIndex(new BasicDBObject("keys",1));
colConcepts.ensureIndex(new BasicDBObject("start", 1));
colConcepts.ensureIndex(new BasicDBObject("end", 1));
Following is the result of cursor.explain().
{ "cursor" : "BtreeCursor tags_1" , "isMultiKey" : true , "n" : 237267 , "nscannedObjects" : 237267 , "nscanned" : 237267 , "nscannedObjectsAllPlans" : 237267 , "nscannedAllPlans" :
237267 , "scanAndOrder" : false , "indexOnly" : false , "nYields" : 1853 , "nChunkSkips" : 0 , "millis" : 274 , "indexBounds" : { "tags" : [ [ "Person" , "Person"]]} , "allPlans" : [
{ "cursor" : "BtreeCursor tags_1" , "isMultiKey" : true , "n" : 237267 , "nscannedObjects" : 237267 , "nscanned" : 237267 , "scanAndOrder" : false , "indexOnly" : false ,
"nChunkSkips" : 0 , "indexBounds" : { "tags" : [ [ "Person" , "Person"]]}}] , "server" : "xxx:27017" , "filterSet" : false , "stats" : { "type" : "FETCH" , "works" : 237269 ,
"yields" : 1853 , "unyields" : 1853 , "invalidates" : 0 , "advanced" : 237267 , "needTime" : 1 , "needFetch" : 0 , "isEOF" : 1 , "alreadyHasObj" : 0 , "forcedFetches" : 0 ,
"matchTested" : 0 , "children" : [ { "type" : "IXSCAN" , "works" : 237268 , "yields" : 1853 , "unyields" : 1853 , "invalidates" : 0 , "advanced" : 237267 , "needTime" : 1 ,
"needFetch" : 0 , "isEOF" : 1 , "keyPattern" : "{ tags: 1 }" , "isMultiKey" : 1 , "boundsVerbose" : "field #0['tags']: [\"Person\", \"Person\"]" , "yieldMovedCursor" : 0 ,
"dupsTested" : 237267 , "dupsDropped" : 0 , "seenInvalidated" : 0 , "matchTested" : 0 , "keysExamined" : 237267 , "children" : [ ]}]}}
As you can see, tags, start, and end all have indexes.
On execution it throws this exception:
com.mongodb.MongoException: Runner error: Overflow sort stage buffered data usage of 33554442 bytes exceeds internal limit of 33554432 bytes
I did some research on the issue and found that this problem can come up if there is no index on the sort field, or if the field is indexed as sparse, which is not the case here.
I am using MongoDB 2.6.1. I also ran the code against 2.6.4, but that didn't stop mongo from throwing the exception.
Any idea how this can be solved?
You don't have the right index for the query. The query planner selected the index on tags to fulfill the query, but that index doesn't help with the sort. Since you want to select based on tags and then sort on (start, end), try putting an index on { "tags" : 1, "start" : 1, "end" : 1 }. Having indexes on each separately isn't helpful here.
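To make the reasoning concrete, here is a minimal in-memory sketch in plain Java (no driver calls) of why a compound key on (tags, start, end) removes the separate sort step. The class name, the Entry type, the docId field, and the sample values are all made up for illustration; a TreeSet stands in for the B-tree.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// In-memory model of an index on { tags: 1, start: 1, end: 1 }; not driver code.
final class CompoundIndexSketch {

    // One index entry: the three key fields plus a hypothetical document id.
    static final class Entry {
        final String tag;
        final long start;
        final long end;
        final int docId;

        Entry(String tag, long start, long end, int docId) {
            this.tag = tag;
            this.start = start;
            this.end = end;
            this.docId = docId;
        }
    }

    // Entries are kept ordered by tag, then start, then end (docId breaks ties).
    static final Comparator<Entry> KEY_ORDER = Comparator
            .comparing((Entry e) -> e.tag)
            .thenComparingLong(e -> e.start)
            .thenComparingLong(e -> e.end)
            .thenComparingInt(e -> e.docId);

    // Made-up sample data, inserted in no particular order.
    static NavigableSet<Entry> sampleIndex() {
        NavigableSet<Entry> index = new TreeSet<>(KEY_ORDER);
        index.add(new Entry("Person", 20, 30, 1));
        index.add(new Entry("Place", 5, 6, 4));
        index.add(new Entry("Person", 20, 40, 2));
        index.add(new Entry("Person", 10, 50, 3));
        return index;
    }

    // Range scan over one tag: results come back already sorted by (start, end).
    static List<Integer> scan(NavigableSet<Entry> index, String tag) {
        Entry low = new Entry(tag, Long.MIN_VALUE, Long.MIN_VALUE, Integer.MIN_VALUE);
        Entry high = new Entry(tag, Long.MAX_VALUE, Long.MAX_VALUE, Integer.MAX_VALUE);
        List<Integer> docIds = new ArrayList<>();
        for (Entry e : index.subSet(low, true, high, true)) {
            docIds.add(e.docId);
        }
        return docIds;
    }

    public static void main(String[] args) {
        System.out.println(scan(sampleIndex(), "Person")); // prints [3, 1, 2]
    }
}
```

With only a single-field key on tags, the same scan would return the matching entries in no useful order, and they would all have to be buffered and sorted afterwards, which is the step that hits the 32 MB limit in the exception above.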
I have a sharded collection containing flight information. The schema looks something like:
{
"_id" : ObjectId("537ef1bb5516dd401b5b109a"),
"departureAirport" : "HAJ",
"arrivalAirport" : "AYT",
"departureDate" : NumberLong("1412553600000"),
"operatingAirlineCode" : "DE",
"operatingFlightNumber" : "1808",
"flightClass" : "P",
"fareType" : "EX",
"availability" : "*"
}
Here are the statistics of my collection:
{
"sharded" : true,
"systemFlags" : 1,
"userFlags" : 1,
"ns" : "flights.flight",
"count" : 2809822,
"numExtents" : 30,
"size" : 674357280,
"storageSize" : 921788416,
"totalIndexSize" : 287746144,
"indexSizes" : {
"_id_" : 103499984,"departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1" : 184246160
},
"avgObjSize" : 240,
"nindexes" : 2,
"nchunks" : 869,
"shards" : {
"shard0000" : {
"ns" : "flights.flight",
"count" : 1396165,
"size" : 335079600,
"avgObjSize" : 240,
"storageSize" : 460894208,
"numExtents" : 15,
"nindexes" : 2,
"lastExtentSize" : 124993536,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 144633440,
"indexSizes" : {
"_id_" : 53094944,"departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1" : 91538496
},
"ok" : 1
},
"shard0001" : {
"ns" : "flights.flight",
"count" : 1413657,
"size" : 339277680,
"avgObjSize" : 240,
"storageSize" : 460894208,
"numExtents" : 15,
"nindexes" : 2,
"lastExtentSize" : 124993536,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 143112704,
"indexSizes" : {
"_id_" : 50405040,"departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1" : 92707664
},
"ok" : 1
}
},
"ok" : 1
}
I now run queries from Java which look like:
{
"departureAirport" : "BSL",
"arrivalAirport" : "SMF",
"departureDate" : {
"$gte" : 1402617600000,
"$lte" : 1403136000000
},
"flightClass" : "C",
"$or" : [
{ "availability" : { "$gte" : "3"}},
{ "availability" : "*"}
] ,
"fareType" : "OW"
}
The departureDate should fall within a one-week range, and availability should be greater than or equal to the requested number, or be '*'.
My question is: what can I do to increase performance? With 50 connections per host I only get about 1000 ops/s, but I need about 3000 to 5000 ops/s.
The cursor looks okay when I run the query in the shell:
"cursor" : "BtreeCursor departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1"
If I forgot something, please let me know. Thanks in advance.
The fact that a BtreeCursor is used doesn't make the query OK. The output of explain would help to identify the issue.
I guess a key problem is the order of your query params:
// equality, good
"departureAirport" : "BSL",
// equality, good
"arrivalAirport" : "SMF",
// range, bad because index based range queries should be near the end
// of contiguous index-based equality checks
"departureDate" : {
"$gte" : 1402617600000,
"$lte" : 1403136000000
},
// what is this, and how many possible values does it have? Seems to be
// a low selectivity index -> remove from index and move to end
"flightClass" : "C",
// costly $or, one op. is a range query, the other one equality...
// Simply set 'availability' to a magic number instead. That's
// ugly, but optimizations are ugly and it's unlikely we see planes with
// over e.g. 900,000 seats in the next couple of decades...
"$or" : [
{ "availability" : { "$gte" : "3"}},
{ "availability" : "*"}
] ,
// again, looks like low selectivity to me. Since it's already at the end,
// that's ok. I'd try to remove it from the index, however.
"fareType" : "OW"
You might want to change your index to something like
"departureAirport_1_arrivalAirport_1_departureDate_1_availability_1"
and query in that exact same order. Append everything else behind, so scans must be made only on those documents that matched all the other criteria in the index.
I'm assuming that flightClass and fareType have low selectivity. If that is not true, this won't be the best solution.
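The "magic number" idea for availability could look roughly like the sketch below, in plain Java with no driver calls. The class and method names are hypothetical, and 900,000 is just the example value mentioned above. It also assumes availability would be stored as a number rather than a string: strings compare character by character, so if values with more than one digit can occur, "10" would not satisfy { "$gte" : "3" }.

```java
// Sketch: encode "*" (unlimited) as a large sentinel so that the $or
// collapses into a single range condition, availability >= requested.
final class AvailabilityEncoding {

    // Hypothetical sentinel for "*"; any value above the largest real seat count works.
    static final int UNLIMITED = 900_000;

    // Convert the raw string value to the number that would be stored instead.
    static int encode(String raw) {
        return "*".equals(raw) ? UNLIMITED : Integer.parseInt(raw);
    }

    // The whole availability check is now one comparison.
    static boolean hasSeats(int stored, int requested) {
        return stored >= requested;
    }

    public static void main(String[] args) {
        System.out.println(encode("*"));                   // prints 900000
        System.out.println(hasSeats(encode("*"), 3));      // prints true
        System.out.println(hasSeats(encode("10"), 3));     // prints true
        // String comparison gets the same case wrong: "10" sorts before "3".
        System.out.println("10".compareTo("3") >= 0);      // prints false
    }
}
```

The query condition would then become a single { "availability" : { "$gte" : 3 } } with a numeric argument, which can be served as one index range.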
{ "_id" : { "$oid" : "52568424036439f2c5107571"} , "start" : { "x:" : 71 , "y:" : 9} , "end" : { "x:" : 30 , "y:" : 84}}
{ "_id" : { "$oid" : "52568424036439f2c5107572"} , "start" : { "x:" : 28 , "y:" : 59} , "end" : { "x:" : 72 , "y:" : 64}}
{ "_id" : { "$oid" : "52568424036439f2c5107573"} , "start" : { "x:" : 16 , "y:" : 71} , "end" : { "x:" : 18 , "y:" : 79}}
I need all documents where start.x > 40.
Query tried in the MongoDB console:
db.lines.find({"start.x": {$gt:40} })
Java Driver:
DBObject query = QueryBuilder.start("start.x").greaterThan(50).get();
collection.find(query)
In both cases there is no error, but no documents are retrieved.
You named your field "x:", not "x", so this should do the work:
db.lines.find({"start.x:": {$gt:40} })
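The same applies on the Java side: the key passed to QueryBuilder.start would presumably need to be "start.x:" as well. As a rough illustration of why the colon matters, here is a plain-Java model of dotted-path lookup over nested maps. It is not the driver's implementation; the class and method names are made up, and the sample values come from the first document in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of resolving a dotted path against a nested document.
final class DottedPathSketch {

    // Walk the path one key at a time; a key that does not match exactly yields null.
    static Object resolve(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // First sample document from the question; note the keys "x:" and "y:".
    static Map<String, Object> sampleDoc() {
        Map<String, Object> start = new LinkedHashMap<>();
        start.put("x:", 71);
        start.put("y:", 9);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("start", start);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(resolve(sampleDoc(), "start.x"));  // prints null
        System.out.println(resolve(sampleDoc(), "start.x:")); // prints 71
    }
}
```

If the trailing colon in the field names was unintentional, renaming the fields when the documents are written is probably the cleaner long-term fix.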
Haven't used Mongo with Java, but... have you tried using QueryBuilder.get instead of QueryBuilder.start? That's the method used for find operations.
You can also use a more specific one for this case: QueryBuilder.greaterThan
I am new to MongoDB, and I have to create a simple site using JSP/servlets.
I need to create a query that will return a count of how many times each page has been visited.
My DB looks like this:
{ "_id" : { "$oid" : "5117fa92f1d3a4093d0d3902"} , "ip" : "127.0.0.1" , "datum" : { "$date" : "2013-02-10T19:52:50.051Z"} , "odlaznaStr" : "localhost:8080/mongoProjekat/" , "dolaznaStr" : "localhost:8080/mongoProjekat/treca"}
{ "_id" : { "$oid" : "5117fa92f1d3a4093d0d3903"} , "ip" : "127.0.0.1" , "datum" : { "$date" : "2013-02-10T19:52:50.796Z"} , "odlaznaStr" : "localhost:8080/mongoProjekat/treca.jsp" , "dolaznaStr" : "localhost:8080/mongoProjekat/peta"}
{ "_id" : { "$oid" : "5117fa93f1d3a4093d0d3904"} , "ip" : "127.0.0.1" , "datum" : { "$date" : "2013-02-10T19:52:51.141Z"} , "odlaznaStr" : "localhost:8080/mongoProjekat/peta.jsp" , "dolaznaStr" : "localhost:8080/mongoProjekat/treca"}
{ "_id" : { "$oid" : "5117fa93f1d3a4093d0d3905"} , "ip" : "127.0.0.1" , "datum" : { "$date" : "2013-02-10T19:52:51.908Z"} , "odlaznaStr" : "localhost:8080/mongoProjekat/treca.jsp" , "dolaznaStr" : "localhost:8080/mongoProjekat/cetvrta"}
{ "_id" : { "$oid" : "5117fa94f1d3a4093d0d3906"} , "ip" : "127.0.0.1" , "datum" : { "$date" : "2013-02-10T19:52:52.035Z"} , "odlaznaStr" : "localhost:8080/mongoProjekat/treca.jsp" , "dolaznaStr" : "localhost:8080/mongoProjekat/cetvrta"}
{ "_id" : { "$oid" : "5117fa94f1d3a4093d0d3907"} , "ip" : "127.0.0.1" , "datum" : { "$date" : "2013-02-10T19:52:52.197Z"} , "odlaznaStr" : "localhost:8080/mongoProjekat/cetvrta.jsp" , "dolaznaStr" : "localhost:8080/mongoProjekat/treca"}
What I need is a result that will look something like this:
page: localhost:8080/mongoProjekat/treca visited: n(times)
page: localhost:8080/mongoProjekat/druga visited: n(times)
...and so on for every page that has been visited.
I am using Java, by the way.
You may find this SQL to MongoDB chart http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/ helpful.
As for your immediate question:
// getCollection
DBCollection myColl = db.getCollection("toskebre");
// for the $group operator
// note - the collection still has the field name "dolaznaStr",
// but to access "dolaznaStr" in the aggregation command
// we add a $ sign in the BasicDBObject
DBObject groupFields = new BasicDBObject( "_id", "$dolaznaStr");
// we use the $sum operator to increment the "count"
// for each unique dolaznaStr
groupFields.put("count", new BasicDBObject( "$sum", 1));
DBObject group = new BasicDBObject("$group", groupFields );
// You can add a sort to order by count descending
DBObject sortFields = new BasicDBObject("count", -1);
DBObject sort = new BasicDBObject("$sort", sortFields );
AggregationOutput output = myColl.aggregate(group, sort);
System.out.println( output.getCommandResult() );
The println will print:
{ "serverUsed" : "localhost/127.0.0.1:27017" ,
"result" : [ { "_id" : "localhost:8080/mongoProjekat/treca" , "count" : 3} ,
{ "_id" : "localhost:8080/mongoProjekat/cetvrta" , "count" : 2} ,
{ "_id" : "localhost:8080/mongoProjekat/peta" , "count" : 1}] ,
"ok" : 1.0}
Thanks for the answer, Kay. I had already found a solution that is similar to yours.
I have done it this way:
// $project: expose dolaznaStr under the name "stranica"
DBObject projectFields = new BasicDBObject("stranica", "$dolaznaStr");
DBObject project = new BasicDBObject("$project", projectFields);
// $group: one group per page, counting documents into "posete"
DBObject groupFields = new BasicDBObject("_id", new BasicDBObject("stranica", "$stranica"));
groupFields.put("posete", new BasicDBObject("$sum", 1));
DBObject group = new BasicDBObject("$group", groupFields);
// $sort: most visited pages first
DBObject sortFields = new BasicDBObject("posete", -1);
DBObject sort = new BasicDBObject("$sort", sortFields);
AggregationOutput ao = collection.aggregate(project, group, sort);
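Both pipelines group by page and count one per document. As a sanity check of what that computes, the same group-and-count can be reproduced in plain Java over the dolaznaStr values of the six sample documents. This is an in-memory model only, not driver code, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// In-memory equivalent of { $group: { _id: "$dolaznaStr", count: { $sum: 1 } } }
// followed by { $sort: { count: -1 } }.
final class VisitCountSketch {

    // The dolaznaStr values of the six sample documents, in document order.
    static List<String> samplePages() {
        String base = "localhost:8080/mongoProjekat/";
        return Arrays.asList(base + "treca", base + "peta", base + "treca",
                base + "cetvrta", base + "cetvrta", base + "treca");
    }

    // Count occurrences per page, then order by count descending.
    static List<Map.Entry<String, Long>> countVisits(List<String> pages) {
        Map<String, Long> counts = pages.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return sorted;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : countVisits(samplePages())) {
            System.out.println("page: " + e.getKey() + " visited: " + e.getValue());
        }
    }
}
```

For the sample data this yields treca with 3 visits, cetvrta with 2, and peta with 1, which matches the aggregation result printed earlier in the thread.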