Find the average of array element in MongoDB using java - java

I am new to MongoDB. I have to find the average of array element in Mongo DB
e.g
{
"_id" : ObjectId("51236fbc3004f02f87c62e8e"),
"query" : "iPad",
"rating" : [
{
"end" : "130",
"inq" : "403",
"executionTime" : "2013-02-19T12:27:40Z"
},
{
"end" : "152",
"inq" : "123",
"executionTime" : "2013-02-19T12:35:28Z"
}
]
}
I want the average of "inq" where query:iPad
Here the output should be:
inq=263
I searched in google and got the aggregate method but not able to convert it in java code.
Thanks in advance
Regards

Let's try to decompose that problem. I would start with:
db.c.aggregate({$match: {query: "iPad"}}, {$unwind:"$rating"}, {$project: {_id:0,
q:"$query",i:"$rating.inq"}})
The projection is not required but makes the rest a little bit more readable:
{
"result" : [
{
"q" : "iPad",
"i" : "403"
},
{
"q" : "iPad",
"i" : "123"
}
],
"ok" : 1
}
So how do I group that? Of course, by "$q":
db.c.aggregate({$match: {query: "iPad"}}, {$unwind:"$rating"}, {$project: {_id:0,
q:"$query",i:"$rating.inq"}}, {$group:{_id: "$q"}}) :
{ "result" : [ { "_id" : "iPad" } ], "ok" : 1 }
Now let's add some aggregation operators:
db.c.aggregate({$match: {query: "iPad"}}, {$unwind:"$rating"}, {$project: {_id:0, q:"$query",i:"$rating.inq"}}, {$group:{_id: "$q", max: {$max: "$i"}, min: {$min: "$i"}}}) :
{
"result" : [
{
"_id" : "iPad",
"max" : "403",
"min" : "123"
}
],
"ok" : 1
}
Now comes the average:
db.c.aggregate({$match: {query: "iPad"}}, {$unwind:"$rating"}, {$project:
{_id:0,q:"$query",i:"$rating.inq"}}, {$group:{_id: "$q", av: {$avg:"$i"}}});

Try to get the java drivers for mongodb. I am able to get this link from mongodb site. Please check this: http://docs.mongodb.org/ecosystem/tutorial/use-aggregation-framework-with-java-driver/#java-driver-and-aggregation-framework

Related

MongoTemplate upsert property snapshot

I'm using mongo 4.2.15
Here is entry:
{
"keys": {
"country": "US",
"channel": "c999"
},
"counters": {
"sale": 0
},
"increments": null
}
I want to be able to initialize counter set as well as increment counters.sale value and save increment result snapshot to increments property. Something like that:
db.getCollection('counterSets').update(
{ "$and" : [
{ "keys.country" : "US"},
{ "keys.channel" : "c999"}
]
},
{ "$inc" :
{ "counters.sale" : 10
},
"$set" :
{ "keys" :
{ "country" : "US", "channel" : "c999"},
"increments":
{ "3000c058-b8a7-4cff-915b-4979ef9a6ed9": {"counters" : "$counters"} }
}
},
{upsert: true})
The result is:
{
"_id" : ObjectId("61965aba1501d6eb40588ba0"),
"keys" : {
"country" : "US",
"channel" : "c999"
},
"counters" : {
"sale" : 10.0
},
"increments" : {
"3000c058-b8a7-4cff-915b-4979ef9a6ed9" : {
"counters" : "$counters"
}
}
}
Does it possible to do such update which is some how copy increment result from root object counters to child increments.3000c058-b8a7-4cff-915b-4979ef9a6ed9.counters with a single upsert. I want to implement safe inrement. Maybe you can suggest some another design?
In order to use expressions, your $set should be part of aggregation pipeline. So your query should look like
NOTE: I've added square brackets to the update
db.getCollection('counterSets').update(
{ "$and" : [
{ "keys.country" : "US"},
{ "keys.channel" : "c999"}
]
},
[ {"$set": {"counters.sale": {"$sum":["$counters.sale", 10]}}}, {"$set": {"increments.x": "$counters"}}],
{upsert: true})
I haven't found any information about the atomicity of aggregation pipelines, so use this carefully.

Elastic termsQuery not giving expected result

I have an index where each of my objects has status field which can have some predefined values. I want to fetch all of them which has statusINITIATED, UPDATED, DELETED, any match with these and hence created this query by java which I got printing on console, using Querybuilder and nativeSearchQuery, executing by ElasticsearchOperations:
{
"bool" : {
"must" : [
{
"terms" : {
"status" : [
"INITIATED",
"UPDATED",
"DELETED"
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
I have data in my index with 'INITIATED' status but not getting anyone with status mentioned in the query. How to fix this query, please?
If you need anything, please let me know.
Update: code added
NativeSearchQueryBuilder nativeSearchQueryBuilder=new NativeSearchQueryBuilder();
QueryBuildersingleQb=QueryBuilders.boolQuery().must(QueryBuilders.termsQuery("status",statusList));
Pageable pageable = PageRequest.of(0, 1, Sort.by(Defs.START_TIME).ascending());
FieldSortBuilder sort = SortBuilders.fieldSort(Defs.START_TIME).order(SortOrder.ASC);
nativeSearchQueryBuilder.withQuery(singleQb);
nativeSearchQueryBuilder.withSort(sort);
nativeSearchQueryBuilder.withPageable(pageable);
nativeSearchQueryBuilder.withIndices(Defs.SCHEDULED_MEETING_INDEX);
nativeSearchQueryBuilder.withTypes(Defs.SCHEDULED_MEETING_INDEX);
NativeSearchQuery searchQuery = nativeSearchQueryBuilder.build();
List<ScheduledMeetingEntity> scheduledList=elasticsearchTemplate.queryForList(searchQuery, ScheduledMeetingEntity.class);
Update 2: sample data:
I got this from kibana query on this index:
"hits" : [
{
"_index" : "index_name",
"_type" : "type_name",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createTime" : "2021-03-03T13:09:59.198",
"createTimeInMs" : 1614755399198,
"createdBy" : "user1#domain.com",
"editTime" : "2021-03-03T13:09:59.198",
"editTimeInMs" : 1614755399198,
"editedBy" : "user1#domain.com",
"versionId" : 1,
"id" : "1",
"meetingId" : "47",
"userId" : "129",
"username" : "user1#domain.com",
"recipient" : [
"user1#domain.com"
],
"subject" : "subject",
"body" : "hi there",
"startTime" : "2021-03-04T07:26:00.000",
"endTime" : "2021-03-04T07:30:00.000",
"meetingName" : "name123",
"meetingPlace" : "placeName",
"description" : "sfsafsdafsdf",
"projectName" : "",
"status" : "INITIATED",
"failTry" : 0
}
}
]
Confirm your mapping:
GET /yourIndexName/_mapping
And see if it is valid
Your mapping needs to have keyword for TermsQuery to work.
{
"status": {
"type" "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
ES can automatically do the mapping for you (without you having to do it yourself) when you first push a document. However you probably have finer control if you do the mapping yourself.
Either way, you need to have keyword defined for your status field.
=====================
Alternative Solution: (Case Insensitive)
If you have a Field named (status), and the values you want to search for are (INITIATED or UPDATED, or DELETED).
Then you can do it like this:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(createStringSearchQuery());
public QueryBuilder createStringSearchQuery(){
QueryStringQueryBuilder queryBuilder = QueryBuilders.queryStringQuery(" INITIATED OR UPDATED OR DELETED ");
queryBuilder.defaultField("status");
return queryBuilder;
}
Printing the QueryBuilder:
{
"query_string" : {
"query" : "INITIATED OR UPDATED OR DELETED",
"default_field" : "status",
"fields" : [ ],
"type" : "best_fields",
"default_operator" : "or",
"max_determinized_states" : 10000,
"enable_position_increments" : true,
"fuzziness" : "AUTO",
"fuzzy_prefix_length" : 0,
"fuzzy_max_expansions" : 50,
"phrase_slop" : 0,
"escape" : false,
"auto_generate_synonyms_phrase_query" : true,
"fuzzy_transpositions" : true,
"boost" : 1.0
}
}

How to get the count of element with non-empty-array-field when group in mongodb aggregate using Spring Data Mongo?

I have the following documents in one collection named as mail_test. Some of them have a tags field which is an array:
/* 1 */
{
"_id" : ObjectId("601a7c3a57c6eb4c1efb84ff"),
"email" : "aaaa#bbb.com",
"content" : "11111"
}
/* 2 */
{
"_id" : ObjectId("601a7c5057c6eb4c1efb8590"),
"email" : "aaaa#bbb.com",
"content" : "22222"
}
/* 3 */
{
"_id" : ObjectId("601a7c6d57c6eb4c1efb8675"),
"email" : "aaaa#bbb.com",
"content" : "33333",
"tags" : [
"x"
]
}
/* 4 */
{
"_id" : ObjectId("601a7c8157c6eb4c1efb86f4"),
"email" : "aaaa#bbb.com",
"content" : "4444",
"tags" : [
"yyy",
"zzz"
]
}
There are two documents with non-empty-tags, so I want the result to be 2.
I use the the following statement to aggregate and get the correct tag_count:
db.getCollection('mail_test').aggregate([{$group:{
"_id":null,
"all_count":{$sum:1},
"tag_count":{"$sum":{$cond: [ { $ne: ["$tags", undefined] }, 1, 0]}}
//if replace `undefined` with `null`, I got the tag_count as 4, that is not what I want
//I also have tried `$exists`, but it cannot be used here.
}}])
and the result is:
{
"_id" : null,
"all_count" : 4.0,
"tag_count" : 2.0
}
and I use spring data mongo in java to do this:
private void test(){
Aggregation agg = Aggregation.newAggregation(
Aggregation.match(new Criteria()),//some condition here
Aggregation.group(Fields.fields()).sum(ConditionalOperators.when(Criteria.where("tags").ne(null)).then(1).otherwise(0)).as("tag_count")
//I need an `undefined` instead of `null`,or is there are any other solution?
);
AggregationResults<MailTestGroupResult> results = mongoTemplate.aggregate(agg, MailTest.class, MailTestGroupResult.class);
List<MailTestGroupResult> mappedResults = results.getMappedResults();
int tag_count = mappedResults.get(0).getTag_count();
System.out.println(tag_count);//get 4,wrong
}
I need an undefined instead of null but I don't know how to do this,or is there are any other solution?
You can use Aggregation operators to check if the field tags exists or not with one of the following constructs in the $group stage of your query (to calculate the tag_count value):
"tag_count":{ "$sum": { $cond: [ { $gt: [ { $size: { $ifNull: ["$tags", [] ] }}, 0 ] }, 1, 0] }}
// - OR -
"tag_count":{ "$sum": { $cond: [ $eq: [ { $type: "$tags" }, "array" ] }, 1, 0] }
Both, return the same result (as you had posted).

Cassandra: Java equivalent method to `sstabledump`?

I'm writing test for a java program which outputs cassandra sstable files (e.g. mc-1-big-Data.db). Now what I have is an correct output file (correct.db). To check if the program is correct or not, I need to dump the two files, and compare each field inside excep "liveness_info" (which means I cannot directly compare mc-1-big-Data.db with correct.db).
I know I can use sstabledump to do this. But this command runs in shell and I may not call it directly in Java since I'm supposed to do a unit test. Therefore, I'd like to have an equivalent method in Java. But after searching for a long time I still didn't find any. Could anyone give some suggestions? Thanks!
[update]
I've tried the method #JimWartnick mentions. When I used SSTableExport.main(...), I got this error java.lang.ExceptionInInitializerError caused by org.apache.cassandra.exceptions.ConfigurationException: Expecting URI in variable: [cassandra.config]. Found[cassandra.yaml]. Please prefix the file with [file:///] for local files and [file://<server>/] for remote files.
It seems this requires me to setup cassandra server, but I suspect for a unit test I cannot do this. Any suggestions? Thanks!
[example dump]
Just in case it's helpful, I attach the example dump below.
[
{
"partition" : {
"key" : [ "1" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 33,
"liveness_info" : { "tstamp" : "2019-08-15T23:52:11.715Z" },
"cells" : [
{ "name" : "name", "value" : "Amy" }
]
}
]
},
{
"partition" : {
"key" : [ "2" ],
"position" : 34
},
"rows" : [
{
"type" : "row",
"position" : 67,
"liveness_info" : { "tstamp" : "2019-08-15T23:52:11.738Z" },
"cells" : [
{ "name" : "name", "value" : "Bob" }
]
}
]
},
{
"partition" : {
"key" : [ "4" ],
"position" : 68
},
"rows" : [
{
"type" : "row",
"position" : 103,
"liveness_info" : { "tstamp" : "2019-08-15T23:52:11.738Z" },
"cells" : [
{ "name" : "name", "value" : "Caleb" }
]
}
]
},
{
"partition" : {
"key" : [ "3" ],
"position" : 104
},
"rows" : [
{
"type" : "row",
"position" : 133,
"liveness_info" : { "tstamp" : "2019-08-15T23:52:11.738Z" },
"cells" : [
{ "name" : "name", "value" : "" }
]
}
]
}

Spring MongoDB Aggregation Group - Can't get the group query right

I have a document in a MongoDB, which looks like follows.
{
"_id" : ObjectId("5ceb812b3ec6d22cb94c82ca"),
"key" : "KEYCODE001",
"values" : [
{
"classId" : "CLASS_01",
"objects" : [
{
"code" : "DD0001"
},
{
"code" : "DD0010"
}
]
},
{
"classId" : "CLASS_02",
"objects" : [
{
"code" : "AD0001"
}
]
}
]
}
I am interested in getting a result like follows.
{
"classId" : "CLASS_01",
"objects" : [
{
"code" : "DD0001"
},
{
"code" : "DD0010"
}
]
}
To get this, I came up with an aggregation pipeline in Robo 3T, which looks like follows. And it's working as expected.
[
{
$match:{
'key':'KEYCODE001'
}
},
{
"$unwind":{
"path": "$values",
"preserveNullAndEmptyArrays": true
}
},
{
"$unwind":{
"path": "$values.objects",
"preserveNullAndEmptyArrays": true
}
},
{
$match:{
'values.classId':'CLASS_01'
}
},
{
$project:{
'object':'$values.objects',
'classId':'$values.classId'
}
},
{
$group:{
'_id':'$classId',
'objects':{
$push:'$object'
}
}
},
{
$project:{
'_id':0,
'classId':'$_id',
'objects':'$$objects'
}
}
]
Now, when I try to do the same in a SpringBoot application, I can't get it running. I ended up having the error java.lang.IllegalArgumentException: Invalid reference '$complication'!. Following is what I have done in Java so far.
final Aggregation aggregation = newAggregation(
match(Criteria.where("key").is("KEYCODE001")),
unwind("$values", true),
unwind("$values.objects", true),
match(Criteria.where("classId").is("CLASS_01")),
project().and("$values.classId").as("classId").and("$values.objects").as("object"),
group("classId", "objects").push("$object").as("objects").first("$classId").as("_id"),
project().and("$_id").as("classId").and("$objects").as("objects")
);
What am I doing wrong? Upon research, I found that multiple fields in group does not work or something like that (please refer to this question). So, is what I am currently doing even possible in Spring Boot?
After hours of debugging + trial and error, found the following solution to be working.
final Aggregation aggregation = newAggregation(
match(Criteria.where("key").is("KEYCODE001")),
unwind("values", true),
unwind("values.objects", true),
match(Criteria.where("values.classId").is("CLASS_01")),
project().and("values.classId").as("classId").and("values.objects").as("object"),
group(Fields.from(Fields.field("_id", "classId"))).push("object").as("objects"),
project().and("_id").as("classId").and("objects").as("objects")
);
It all boils down to group(Fields.from(Fields.field("_id", "classId"))).push("object").as("objects") that which introduces a org.springframework.data.mongodb.core.aggregation.Fields object that wraps a list of org.springframework.data.mongodb.core.aggregation.Field objects. Within Field, the name of the field and the target could be encapsulated. This resulted in the following pipeline which is a match for the expected.
[
{
"$match" :{
"key" : "KEYCODE001"
}
},
{
"$unwind" :{
"path" : "$values", "preserveNullAndEmptyArrays" : true
}
},
{
"$unwind" :{
"path" : "$values.objects", "preserveNullAndEmptyArrays" : true
}
},
{
"$match" :{
"values.classId" : "CLASS_01"
}
},
{
"$project" :{
"classId" : "$values.classId", "object" : "$values.objects"
}
},
{
"$group" :{
"_id" : "$classId",
"objects" :{
"$push" : "$object"
}
}
},
{
"$project" :{
"classId" : "$_id", "objects" : 1
}
}
]
Additionally, figured that there is no need to using $ sign anywhere and everywhere.

Categories