MongoDB Java driver - problems with aggregation - java

In Java, with the MongoDB driver, I want pairs of (business_name, count), i.e. the number of reviews per business. Currently I have this aggregation pipeline:
Bson group = group("$business_id", Accumulators.sum("count", 1));
Bson lookupOperation = lookup(
        "business",
        "_id",
        "business_id",
        "business_data"
);
Bson unwind = unwind("$business_data");
Bson project = project(fields(include("business_data.name", "count"), excludeId()));
return db
        .getCollection("tip")
        .aggregate(Arrays.asList(group, lookupOperation, unwind, project));
It works, but returns:
Document{{count=1, business_data=Document{{name=Firestone Complete Auto Care}}}}
1) How can I unwind business_data.name to have {count, name}?
2) How can I make name distinct? I want only one count per name, but printing the results gives many identical copies, e.g.:
Document{{count=3, business_data=Document{{name=Malmaison}}}}
Document{{count=3, business_data=Document{{name=Malmaison}}}}
The result set is quite large, so I return an AggregateIterable, but I want the results sorted by name. Can I do that with the iterable, without loading all the data into an array and sorting the array?
EDIT
Business document example:
{
"_id" : ObjectId("5ddbc3c1a94f7aac8d179b7c"),
"business_id" : "vcNAWiLM4dR7D2nwwJ7nCA",
"full_address" : "4840 E Indian School Rd\nSte 101\nPhoenix, AZ 85018",
"hours" : {
"Tuesday" : {
"close" : "17:00",
"open" : "08:00"
},
"Friday" : {
"close" : "17:00",
"open" : "08:00"
},
"Monday" : {
"close" : "17:00",
"open" : "08:00"
},
"Wednesday" : {
"close" : "17:00",
"open" : "08:00"
},
"Thursday" : {
"close" : "17:00",
"open" : "08:00"
}
},
"open" : true,
"categories" : [
"Doctors",
"Health & Medical"
],
"city" : "Phoenix",
"review_count" : 7,
"name" : "Eric Goldberg, MD",
"neighborhoods" : [],
"longitude" : -111.983758,
"state" : "AZ",
"stars" : 3.5,
"latitude" : 33.499313,
"attributes" : {
"By Appointment Only" : true
},
"type" : "business"
}
Review document example:
{
"user_id": "IORZRljfUkedhh1SGMthTA",
"text": "The desserts are enormous..dear god.",
"business_id": "JwUE5GmEO-sH1FuwJgKBlQ",
"likes": 0,
"date": "2011-09-29",
"type": "tip"
}

Related

Elasticsearch is not working with Alphanumeric

I have alphanumeric codes like AA111, 111AA, AA-111, AAAA, 1111. Below is the Elasticsearch mapping:
"name" : {
"type" : "text",
"analyzer" : "standard",
"fields" : {
"lower_case_sort" : {
"type" : "keyword",
"normalizer" : "lowercase"
}
},
"copy_to" : "default"
}
The mapping for the default field is as follows:
"default" : {
"type" : "text",
"analyzer" : "index_ngram",
"search_analyzer" : "search_ngram"
},
When we search for AAA or AA, it returns results. But when we search for 111, it does not return any results.
Below is the query:
"bool" : {
"filter" : [
{
"match" : {
"default" : {
"query" : "111",
"operator" : "AND",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
},
This is happening because the analyzer on your default field is removing the digits from your text (the simple analyzer is one that does this). Use a tokenizer that keeps them, such as edge-ngram, and search on those tokens, or use the standard analyzer, which also works with numbers.
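To make the point about digits concrete, here is a rough plain-Java sketch of what a letters-only analyzer does to the codes from the question. It is not Elasticsearch code: the class and method names are made up for illustration, and it only assumes the documented behaviour of the simple analyzer (split at every non-letter character, then lowercase). Whether the custom index_ngram analyzer in the question behaves this way is not known from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for a letters-only analyzer; not part of any Elasticsearch API. */
final class LetterOnlyAnalyzerSketch {

    // Split on every run of non-letter characters and lowercase what is left.
    // Digits and hyphens act as separators, so they never become tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Codes taken from the question.
        for (String code : List.of("AA111", "111AA", "AA-111", "AAAA", "1111")) {
            System.out.println(code + " -> " + tokenize(code));
        }
    }
}
```

Under this model a code such as 1111 produces no tokens at all, so a search for 111 has nothing to match, while AA still matches the letter part of AA111.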

Elastic termsQuery not giving expected result

I have an index where each object has a status field that takes one of a few predefined values. I want to fetch every object whose status is INITIATED, UPDATED, or DELETED (any of the three). I built the query below in Java with QueryBuilders and a NativeSearchQuery, executed through ElasticsearchOperations; this is what it prints to the console:
{
"bool" : {
"must" : [
{
"terms" : {
"status" : [
"INITIATED",
"UPDATED",
"DELETED"
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
My index contains documents with status 'INITIATED', but the query returns none of them. How can I fix this query?
If you need anything else, please let me know.
Update: code added
NativeSearchQueryBuilder nativeSearchQueryBuilder=new NativeSearchQueryBuilder();
QueryBuilder singleQb = QueryBuilders.boolQuery().must(QueryBuilders.termsQuery("status", statusList));
Pageable pageable = PageRequest.of(0, 1, Sort.by(Defs.START_TIME).ascending());
FieldSortBuilder sort = SortBuilders.fieldSort(Defs.START_TIME).order(SortOrder.ASC);
nativeSearchQueryBuilder.withQuery(singleQb);
nativeSearchQueryBuilder.withSort(sort);
nativeSearchQueryBuilder.withPageable(pageable);
nativeSearchQueryBuilder.withIndices(Defs.SCHEDULED_MEETING_INDEX);
nativeSearchQueryBuilder.withTypes(Defs.SCHEDULED_MEETING_INDEX);
NativeSearchQuery searchQuery = nativeSearchQueryBuilder.build();
List<ScheduledMeetingEntity> scheduledList=elasticsearchTemplate.queryForList(searchQuery, ScheduledMeetingEntity.class);
Update 2: sample data
I got this from a Kibana query on this index:
"hits" : [
{
"_index" : "index_name",
"_type" : "type_name",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createTime" : "2021-03-03T13:09:59.198",
"createTimeInMs" : 1614755399198,
"createdBy" : "user1#domain.com",
"editTime" : "2021-03-03T13:09:59.198",
"editTimeInMs" : 1614755399198,
"editedBy" : "user1#domain.com",
"versionId" : 1,
"id" : "1",
"meetingId" : "47",
"userId" : "129",
"username" : "user1#domain.com",
"recipient" : [
"user1#domain.com"
],
"subject" : "subject",
"body" : "hi there",
"startTime" : "2021-03-04T07:26:00.000",
"endTime" : "2021-03-04T07:30:00.000",
"meetingName" : "name123",
"meetingPlace" : "placeName",
"description" : "sfsafsdafsdf",
"projectName" : "",
"status" : "INITIATED",
"failTry" : 0
}
}
]
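First, check your mapping:
GET /yourIndexName/_mapping
and confirm how the status field is mapped.
A terms query is not analyzed, so it needs a keyword field to match the exact values. With a mapping like the one below, that is the status.keyword sub-field: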
{
  "status": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  }
}
Elasticsearch can create the mapping for you automatically when you first index a document. However, you get finer control if you define the mapping yourself.
Either way, you need a keyword field defined for status.
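Why the keyword field matters can be shown with a small plain-Java model. This is only a sketch with hypothetical names, not Elasticsearch code: it assumes that an analyzed text field stores lowercased tokens (as the standard analyzer does), that a keyword field stores the value verbatim, and that a terms query looks up its values exactly as given.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical in-memory model of text vs. keyword indexing; not an Elasticsearch API. */
final class TermsLookupSketch {

    // Simplified text field: split on whitespace and lowercase each token.
    static Set<String> indexAsText(String value) {
        Set<String> terms = new HashSet<>();
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Keyword field: the whole value is stored unchanged as a single term.
    static Set<String> indexAsKeyword(String value) {
        return Set.of(value);
    }

    // Terms query: matches if any wanted value equals an indexed term exactly.
    static boolean termsMatch(Set<String> indexedTerms, List<String> wanted) {
        for (String value : wanted) {
            if (indexedTerms.contains(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> wanted = List.of("INITIATED", "UPDATED", "DELETED");
        System.out.println("text field:    " + termsMatch(indexAsText("INITIATED"), wanted));
        System.out.println("keyword field: " + termsMatch(indexAsKeyword("INITIATED"), wanted));
    }
}
```

In this model the text field holds "initiated", so the uppercase lookup misses it, while the keyword field holds "INITIATED" and matches. If the real index uses the mapping shown above, the equivalent change would be to run the terms query against status.keyword instead of status.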
=====================
Alternative solution (case-insensitive):
If you have a field named status, and the values you want to search for are INITIATED, UPDATED, or DELETED,
then you can do it like this:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(createStringSearchQuery());
public QueryBuilder createStringSearchQuery(){
QueryStringQueryBuilder queryBuilder = QueryBuilders.queryStringQuery(" INITIATED OR UPDATED OR DELETED ");
queryBuilder.defaultField("status");
return queryBuilder;
}
Printing the QueryBuilder:
{
"query_string" : {
"query" : "INITIATED OR UPDATED OR DELETED",
"default_field" : "status",
"fields" : [ ],
"type" : "best_fields",
"default_operator" : "or",
"max_determinized_states" : 10000,
"enable_position_increments" : true,
"fuzziness" : "AUTO",
"fuzzy_prefix_length" : 0,
"fuzzy_max_expansions" : 50,
"phrase_slop" : 0,
"escape" : false,
"auto_generate_synonyms_phrase_query" : true,
"fuzzy_transpositions" : true,
"boost" : 1.0
}
}

MongoDB Spring data, max aggregate with complex condition

I am using MongoDB as a document-oriented database, with Spring Data as the ODM. I am having a hard time performing a max aggregation on a complex BSON structure.
I have to find the max date across all documents, but if a document has an embedded document, that embedded document's date has to be considered as well.
Here is an example. Suppose I have a collection named person that contains the following documents:
{
    "_id" : ObjectId("55def1ceb5b5ed74ddf2b5ce"),
    "name" : "abc",
    "birth_date_time" : '15 June 1988',
    "children" : {
        "_id" : ObjectId("55def1ceb2223ed74ddf2b5ce"),
        "name" : "def",
        "birth_date_time" : '10 April 2010'
    }
},
{
    "_id" : ObjectId("55def1ceb5b5ed74dd232323"),
    "name" : "xyz",
    "birth_date_time" : '15 June 1986'
},
{
    "_id" : ObjectId("55def1ceb5b5ed74ddf2b5ce"),
    "name" : "mno",
    "birth_date_time" : '18 March 1982',
    "children" : {
        "_id" : ObjectId("534ef1ceb2223ed74ddf2b5ce"),
        "name" : "pqr",
        "birth_date_time" : '10 April 2009'
    }
}
It should return 10 April 2010, as this is the max birth date for a person in the person collection. I want to know how to achieve this using a Spring Data repository.
Here are the MongoDB aggregations. They should be easy to implement in Spring Data.
db.person.aggregate([
{$group: {
_id: null,
maxDate: {$max : {
$cond: [
{$gt : ["$birth_date_time","$children.birth_date_time"]},
"$birth_date_time",
"$children.birth_date_time"
]}}
}}
])
Or, using a $project stage first:
db.person.aggregate([{
$project: {
mDate: {
$cond: [
{$gt : ["$birth_date_time","$children.birth_date_time"]},
"$birth_date_time",
"$children.birth_date_time"
]
}
}},
{$group: {
_id: null,
maxDate: {$max : "$mDate"}
}},
])
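For readers who want to check what these pipelines compute, the same logic can be written as a plain-Java sketch over in-memory objects. This is not Spring Data or driver code: the Person class and method names are hypothetical, and it assumes the birth dates are real date values (LocalDate here) rather than strings like '15 June 1988', which would otherwise compare lexicographically.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical in-memory model of the $cond + $max pipeline; not Spring Data code. */
final class MaxBirthDateSketch {

    static final class Person {
        final String name;
        final LocalDate birthDate;
        final Person child; // null when there is no embedded "children" document

        Person(String name, LocalDate birthDate, Person child) {
            this.name = name;
            this.birthDate = birthDate;
            this.child = child;
        }
    }

    // Mirrors the $cond: keep the person's own date if it is later than the
    // child's date (or there is no child), otherwise take the child's date.
    static LocalDate laterOfOwnAndChild(Person p) {
        if (p.child == null || p.birthDate.isAfter(p.child.birthDate)) {
            return p.birthDate;
        }
        return p.child.birthDate;
    }

    // Mirrors the $group with _id: null and $max over the per-document value.
    static Optional<LocalDate> maxBirthDate(List<Person> people) {
        return people.stream()
                .map(MaxBirthDateSketch::laterOfOwnAndChild)
                .max(Comparator.naturalOrder());
    }

    // The three documents from the question, with dates converted to LocalDate.
    static List<Person> samplePeople() {
        Person def = new Person("def", LocalDate.of(2010, 4, 10), null);
        Person pqr = new Person("pqr", LocalDate.of(2009, 4, 10), null);
        return List.of(
                new Person("abc", LocalDate.of(1988, 6, 15), def),
                new Person("xyz", LocalDate.of(1986, 6, 15), null),
                new Person("mno", LocalDate.of(1982, 3, 18), pqr));
    }

    public static void main(String[] args) {
        System.out.println(maxBirthDate(samplePeople()));
    }
}
```

For the sample data this yields 10 April 2010, the date the question expects.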

MongoDB - MongoJack find and remove

I am completely new to MongoDB and MongoJack.
I have this collection called pbf
{
"_id" : ObjectId("541ea72044ae1b4043e9ccba"),
"name" : "First civ game",
"type" : "WAW",
"numOfPlayers" : 4,
"active" : true,
"players" : [ ],
"civs" : [
{
"objectType" : "civ",
"name" : "Indians",
"used" : false,
"hidden" : true
},
{
"objectType" : "civ",
"name" : "Arabs",
"used" : false,
"hidden" : true
},
{
"objectType" : "civ",
"name" : "Japanese",
"used" : false,
"hidden" : true
},
{
"objectType" : "civ",
"name" : "Egyptians",
"used" : false,
"hidden" : true
}
]
}
What I want to do is remove and return one civs item by id.
I have tried something like this:
protected static JacksonDBCollection<PBF, String> pbfCollection;
BasicDBObject field = new BasicDBObject();
field.put("civs", 1);
field.put("_id", "541ea72044ae1b4043e9ccba");
PBF pbf = pbfCollection.findAndRemove(field);
But this just throws an exception saying it doesn't find anything.
So basically I want this returned:
{
"objectType" : "civ",
"name" : "Indians",
"used" : false,
"hidden" : true
}
How can I accomplish this?
I solved it in two steps, though I am sure there is a better way of doing it.
//First get, then remove, then update
PBF pbf = pbfCollection.findOneById(pbfId);
Civ civ = pbf.getCivs().remove(0);
pbfCollection.updateById(pbf.getId(), pbf);
This worked, but I think there should be a better way of doing it.
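The snippet above always removes the first element. As a small refinement of the in-memory step only, here is a plain-Java sketch that removes and returns a specific entry. The Civ class and removeByName are hypothetical stand-ins for the poster's own classes, and the lookup uses name because the civs in the sample document carry no id field. The list would still have to be written back with updateById as in the answer, so this remains a two-step, non-atomic approach.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of "remove and return one civ"; not MongoJack code. */
final class RemoveCivSketch {

    static final class Civ {
        final String name;

        Civ(String name) {
            this.name = name;
        }
    }

    // Removes the first civ with the given name from the list and returns it,
    // or returns an empty Optional (leaving the list unchanged) if none matches.
    static Optional<Civ> removeByName(List<Civ> civs, String name) {
        Iterator<Civ> it = civs.iterator();
        while (it.hasNext()) {
            Civ civ = it.next();
            if (civ.name.equals(name)) {
                it.remove();
                return Optional.of(civ);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for pbf.getCivs(); must be a mutable list.
        List<Civ> civs = new ArrayList<>(List.of(
                new Civ("Indians"), new Civ("Arabs"), new Civ("Japanese"), new Civ("Egyptians")));
        Optional<Civ> removed = removeByName(civs, "Indians");
        System.out.println(removed.map(c -> c.name).orElse("not found") + ", remaining: " + civs.size());
    }
}
```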

How to update embedded document in mongo?

I have the following document in MongoDB:
{ "_id" : ObjectId("50656f33a4e82d3f98291eff"),
"description" : "gdfgdfgdfg",
"menus" : [
{
"name" : "gdfgdfgdfg",
"description" : "dfgdgd",
"text" : "dfgdfg",
"key" : "2",
"onSelect" : "yyy",
"_id" : ObjectId("50656f3ca4e82d3f98291f00")
},
{
"name" : "rtytry",
"description" : "gffhf",
"text" : "dfgdfg",
"key" : "2",
"onSelect" : "yyy",
"_id" : ObjectId("50656f3ca4e82d3f98281f00")
}],
"select":"ffdfgd"
}
I want to automatically update this entry of menus:
{
"name" : "gdfgdfgdfg",
"description" : "dfgdgd",
"text" : "dfgdfg",
"key" : "2",
"onSelect" : "yyy",
"_id" : ObjectId("50656f3ca4e82d3f98291f00")
}
I have tried using the following code:
BasicDBObject query = new BasicDBObject("_id", new ObjectId("50656f33a4e82d3f98291eff"));
BasicDBObject update = new BasicDBObject("_id", new ObjectId("50656f3ca4e82d3f98291f00"));
BasicDBObject updateCommand = new BasicDBObject("$set", new BasicDBObject("menus", update));
collection.update(query, updateCommand);
The result I got is:
{ "_id" : ObjectId("50656f33a4e82d3f98291eff"),
"description" : "gdfgdfgdfg",
"menus" :
{
"name" : "gdfgdfgdfg",
"description" : "dfgdgd",
"text" : "dfgdfg",
"key" : "2",
"onSelect" : "yyy",
"_id" : ObjectId("50656f3ca4e82d3f98291f00")
},
"select":"ffdfgd"
}
But I want to update it in place, within the same embedded array.
Can anyone guide me? Thanks in advance.
OK, after reading this a couple of times, I think I understand now.
You want to update the subdocument with the _id 50656f3ca4e82d3f98291f00 with:
{
"name" : "gdfgdfgdfg",
"description" : "dfgdgd",
"text" : "dfgdfg",
"key" : "2",
"onSelect" : "yyy",
"_id" : ObjectId("50656f3ca4e82d3f98291f00")
}
The first problem is that your code merely sets the menus field to this object, replacing the whole array. What you need is the positional operator. So you first need to match the array element with dot notation in the query (warning: my Java is a little rusty):
query.append("menus._id", new ObjectId("50656f3ca4e82d3f98291f00"));
And then use that positional operator to update in position:
// "menus.$" refers to the array element matched by "menus._id" in the query
BasicDBObject update_document = new BasicDBObject("menus.$.name", my_new_subdocument.name);
update_document.append("menus.$.description", my_new_subdocument.description);
// All the other fields
BasicDBObject updateCommand = new BasicDBObject("$set", update_document);
collection.update(query, updateCommand);
Hopefully that should do the trick.
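To illustrate what the positional update is expected to do, as opposed to the original $set on the whole menus field, here is a plain-Java model of the semantics. It is not driver code: the class, the helper methods, and the use of plain strings for the _id values are assumptions made for the sketch. It relies only on the documented behaviour that the positional $ operator targets the first array element matched by the query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of {$set: {"menus.$.field": value}}; not driver code. */
final class PositionalSetSketch {

    // Applies the new field values to the first menu whose _id matches and
    // leaves every other array element untouched. Returns false if no menu matches.
    static boolean setOnFirstMatch(List<Map<String, Object>> menus,
                                   String id,
                                   Map<String, Object> newValues) {
        for (Map<String, Object> menu : menus) {
            if (id.equals(menu.get("_id"))) {
                menu.putAll(newValues);
                return true;
            }
        }
        return false;
    }

    // Small helper to build a menu entry for the example.
    static Map<String, Object> menu(String id, String name) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("_id", id);
        m.put("name", name);
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> menus = new ArrayList<>();
        menus.add(menu("50656f3ca4e82d3f98291f00", "gdfgdfgdfg"));
        menus.add(menu("50656f3ca4e82d3f98281f00", "rtytry"));

        Map<String, Object> changes = new LinkedHashMap<>();
        changes.put("name", "new name");
        setOnFirstMatch(menus, "50656f3ca4e82d3f98291f00", changes);

        // The array still has two entries; only the matched one changed.
        System.out.println(menus);
    }
}
```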
