I have a large-ish dataset (100,000+ documents). Each document has a key-value object called projectCategories, and I would like to get a count of each distinct value across the entire collection.
For example, my document looks like this:
{
"_id" : ObjectId("60e5ae42fcc92f14c3a41208"),
"userId" : "xxxx",
"projectCreator" : {
"userId" : "xxx|xxxx"
},
"hashTags" : [
"Spring",
"Java"
],
"projectCategories" : {
"60d76ef0597444095b8ab4b2" : "Backend",
"60d76ef0597444095b8ab232" : "Infrastructure"
},
"createdDate" : ISODate("2021-07-07T13:38:10.655Z"),
"updatedAt" : ISODate("2021-07-08T11:48:36.200Z"),
"_class" : "xxxx.model.project.Project"
}
I would like to get back all the unique values and their counts. Something like this:
Backend : 1002
FrontEnd : 1232
Infrastructure: 902
Is this possible using Java and MongoTemplate?
Thanks for reading
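To make the expected result concrete, here is a minimal in-memory sketch of the counting logic in plain Java. It is not MongoTemplate code: each map stands in for one document's projectCategories object, and the class and method names are made up for illustration. On the database side the same shape is usually reached with an aggregation that turns the object into an array ($objectToArray), unwinds it ($unwind) and groups on the value ($group with $sum), but that pipeline is not shown or tested here.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper: counts category names across many projectCategories maps.
class CategoryCounts {
    // Each map is one document's projectCategories (category id -> category name).
    // Returns category name -> number of occurrences, sorted by name.
    public static Map<String, Long> countValues(List<Map<String, String>> projectCategories) {
        return projectCategories.stream()
                .flatMap(categories -> categories.values().stream())
                .collect(Collectors.groupingBy(name -> name, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
                Map.of("60d76ef0597444095b8ab4b2", "Backend",
                       "60d76ef0597444095b8ab232", "Infrastructure"),
                Map.of("60d76ef0597444095b8ab4b2", "Backend"));
        countValues(docs).forEach((name, count) -> System.out.println(name + " : " + count));
    }
}
```

For 100,000+ documents the grouping is better done by the server than by loading every document into the application, so treat this only as a description of the desired output.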
I am using DIH to import data from a NoSQL data source. The format looks something like this:
{
"id" : "123-145-app",
"name" : "apple",
"type" : "electronic",
"information": {
"category":["tablets","laptops","mobile"],
"stores": [
{
"name": "imagine",
"location" : "DLH"
},
{
"name": "abc",
"location" : "BLR"
}
],
"head_office" : "US"
}
}
When I try to index this using
https://lucene.apache.org/solr/guide/6_6/transforming-and-indexing-custom-json.html
I get the error "Unknown operation for the an atomic update, operation ignored", and the record with that data is skipped.
The documentation does not mention how to configure the split through a configuration file or schema.xml.
Can someone please help me with this?
Given just two dates, I need to retrieve all the documents from my MongoDB collection, with the items of an array filtered by those dates.
Here are two of my documents:
{
"_id" : ObjectId("5f18fa823406b7000132d097"),
"last_date" : "22/07/2020 23:48:32",
"history_dates" : [
"22/07/2020 23:48:32",
"22/07/2020 00:18:53",
"23/07/2020 00:49:12",
"23/07/2020 01:19:30"
],
"hostname" : "MyHostname1",
"ip" : "142.0.111.79",
"component" : "C:\\Windows\\System32\\es-ES\\KernelBase.dll.mui",
"process" : "LogonUI.exe",
"date" : "23/07/2020 10:26:04",
}
{
"_id" : ObjectId("5f18fa823406b7000132d098"),
"last_date" : "22/07/2020 23:48:33",
"history_dates" : [
"22/07/2020 23:48:33",
"23/07/2020 00:18:53",
],
"hostname" : "MyHostName2",
"ip" : "142.0.111.54",
"component" : "C:\\Windows\\System32\\es-ES\\KernelBase.dll.mui",
"process" : "svchost.exe",
"date" : "23/07/2020 10:26:04",
}
I need to run a find against my database (using Spring Data) that returns the same objects, but with the "history_dates" array filtered to the range between the two dates received.
For example, if the two dates received are "23/07/2020" and "24/07/2020", I want MongoDB to return the following objects:
{
"_id" : ObjectId("5f18fa823406b7000132d097"),
"last_date" : "22/07/2020 23:48:32",
"history_dates" : [
"23/07/2020 00:49:12",
"23/07/2020 01:19:30"
],
"hostname" : "MyHostname1",
"ip" : "142.0.111.79",
"component" : "C:\\Windows\\System32\\es-ES\\KernelBase.dll.mui",
"process" : "LogonUI.exe",
"date" : "23/07/2020 10:26:04",
}
{
"_id" : ObjectId("5f18fa823406b7000132d098"),
"last_date" : "22/07/2020 23:48:33",
"history_dates" : [
"23/07/2020 00:18:53"
],
"hostname" : "MyHostName2",
"ip" : "142.0.111.54",
"component" : "C:\\Windows\\System32\\es-ES\\KernelBase.dll.mui",
"process" : "svchost.exe",
"date" : "23/07/2020 10:26:04",
}
I'm really new to MongoDB queries, and I have been trying to do this with Spring Data all week.
UPDATE 1.
Thanks varman. Do you know how I can retrieve only the documents whose filtered arrays are not empty?
So basically you need a filter. MongoTemplate offers a lot of operations for MongoDB; if a method doesn't exist in MongoTemplate, we can fall back to the BSON Document pattern. In that case, try this article: Trick to convert mongo shell query.
What you need is a Mongo query like the following. It uses $addFields, but you could also use $project, $set, etc. Here $addFields overwrites your history_dates field (it can also be used to add new fields to a document).
{
  $addFields: {
    history_dates: {
      $filter: {
        input: "$history_dates",
        cond: {
          $and: [
            { $gt: ["$$this", "23/07/2020"] },
            { $lt: ["$$this", "24/07/2020"] }
          ]
        }
      }
    }
  }
}
Working Mongo playground.
You need to convert this into Spring Data, so @Autowired the MongoTemplate in your class.
@Autowired
MongoTemplate mongoTemplate;
The method is,
public List<Object> filterDates() {
    Aggregation aggregation = Aggregation.newAggregation(
        a -> new Document("$addFields",
            new Document("history_dates",
                new Document("$filter",
                    new Document("input", "$history_dates")
                        .append("cond",
                            new Document("$and",
                                Arrays.asList(
                                    new Document("$gt", Arrays.asList("$$this", "23/07/2020")),
                                    new Document("$lt", Arrays.asList("$$this", "24/07/2020"))
                                )
                            )
                        )
                )
            )
        )
    ).withOptions(AggregationOptions.builder().allowDiskUse(Boolean.TRUE).build());
    return mongoTemplate.aggregate(aggregation, mongoTemplate.getCollectionName(YOUR_CLASS.class), Object.class).getMappedResults();
}
As far as I know, MongoTemplate doesn't provide helper methods for $addFields and $filter, so we just go with the BSON Document pattern. I haven't tested this in Spring.
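One caveat with the query above: history_dates holds strings in dd/MM/yyyy format, so $gt and $lt compare them character by character. That gives the right answer for the sample range, but it can go wrong once a range crosses a month or year boundary (for example "01/08/2020" sorts before "23/07/2020" as a string). The plain-Java sketch below shows the intended range check on parsed dates. It is an in-memory illustration only, the class and method names are hypothetical, and the lower bound is treated as inclusive from the start of the day.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: filters history_dates strings by a date range.
class HistoryDateFilter {
    // Format of the stored entries, e.g. "23/07/2020 00:49:12".
    private static final DateTimeFormatter STORED = DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss");
    // Format of the two dates received, e.g. "23/07/2020".
    private static final DateTimeFormatter BOUND = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    // Keeps the entries with from (start of day) <= timestamp < to (start of day).
    public static List<String> between(List<String> historyDates, String from, String to) {
        LocalDateTime start = LocalDate.parse(from, BOUND).atStartOfDay();
        LocalDateTime end = LocalDate.parse(to, BOUND).atStartOfDay();
        return historyDates.stream()
                .filter(raw -> {
                    LocalDateTime timestamp = LocalDateTime.parse(raw, STORED);
                    return !timestamp.isBefore(start) && timestamp.isBefore(end);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> dates = List.of("22/07/2020 23:48:32", "22/07/2020 00:18:53",
                "23/07/2020 00:49:12", "23/07/2020 01:19:30");
        System.out.println(between(dates, "23/07/2020", "24/07/2020"));
    }
}
```

Regarding UPDATE 1: one option that should work is a $match stage placed after $addFields that keeps only documents where history_dates.0 exists, which drops the documents whose filtered array came back empty. Storing real dates instead of strings would also make the server-side comparison reliable.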
My database looks like
{
"_id" : ObjectId("5a8351093cf24e144d8fef24"),
"__type" : "TrafficIncident:http://schemas.microsoft.com/search/local/ws/rest/v1",
"point" : {
"type" : "Point",
"coordinates" : [
37.410883,
-95.71027
]
},
...
}
{
"_id" : ObjectId("5a8351093cf24e144d8fef25"),
"__type" : "TrafficIncident:http://schemas.microsoft.com/search/local/ws/rest/v2",
"point" : {
"type" : "Point",
"coordinates" : [
40.2346,
-100.826167
]
},
...
}
If I have a coordinate pair as the center location, say [38, -98], and I want to retrieve all records within the coordinate range [38 ± 2, -98 ± 2], how do I write the Java code for the Document filter?
So far I have only managed to retrieve a specific location rather than everything inside a range.
Document query = new Document("point.coordinates", Arrays.asList(40.2346, -100.826167));
javamongo.collection.find(query).limit(javamongo.numLimit).forEach(printBlock);
You'll want to use MongoDB's Geospatial Query system for this.
I'm assuming you're using Mongo's official Java Driver. First you'll want to create a 2dsphere index on the point.coordinates field. You can do this in Java with:
collection.createIndex(Indexes.geo2dsphere("point.coordinates"));
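Then you can query for documents near your reference point. The two distance arguments are Doubles (maximum and minimum distance), and with a 2dsphere index they are measured in meters, not degrees; 2 degrees of latitude is roughly 222 km:
Point refPoint = new Point(new Position(38, -98));
collection.find(Filters.near("point.coordinates", refPoint, 222000.0, 0.0)).forEach(printBlock);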
MongoDB's tutorial on geospatial search with their Java driver is pretty good.
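Note that a near query selects a circle around the reference point, while the question describes a square range of ±2 on each coordinate. The plain-Java sketch below spells out that square check in memory so the two can be compared. The class and method names are hypothetical, and it treats the two stored numbers simply as "first" and "second" component. With the driver, the same condition could probably be written as range filters (Filters.gte and Filters.lte) on point.coordinates.0 and point.coordinates.1, which is not shown or tested here.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: checks whether a coordinate pair lies in a square around a center.
class BoundingBox {
    // True when both components are within +/- delta of the center's components.
    public static boolean contains(double[] center, double delta, double[] coordinates) {
        return Math.abs(coordinates[0] - center[0]) <= delta
                && Math.abs(coordinates[1] - center[1]) <= delta;
    }

    // Keeps only the points that fall inside the square.
    public static List<double[]> filter(List<double[]> points, double[] center, double delta) {
        return points.stream()
                .filter(point -> contains(center, delta, point))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        double[] center = {38, -98};
        List<double[]> points = List.of(
                new double[] {37.410883, -95.71027},
                new double[] {40.2346, -100.826167},
                new double[] {39.5, -97.0}); // made-up point inside the range
        for (double[] point : filter(points, center, 2)) {
            System.out.println(point[0] + ", " + point[1]);
        }
    }
}
```

Also keep in mind that GeoJSON expects [longitude, latitude], whereas the sample documents look like [latitude, longitude]; it is worth checking the order before building a geospatial index on them.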
I have a JSON object as follows:
{ "_id" : ObjectId("508806803bb97dc546e6f307"), "user_name" : "user1", "user_id" : 45645645, "likes" : [ { "event_id" : NumberLong("4578541212") },{ "event_id" : NumberLong("4578541213") } ], "dislikes" : [ ] }
I'm trying to delete a specific event from the likes array via the Java driver.
I tried doing this first in the shell:
> db.users.update( {'likes.event_id' : 4578541212}, { '$unset':{'likes.event_id':1}})
with no luck. How can I do that?
If you want to just remove the event_id field from the array element:
db.users.update( {'likes.event_id' : 4578541212}, {'$unset':{'likes.$.event_id' :1}})
To delete the whole element from the array, use the $pull operator:
db.users.update({'likes.event_id': 4578541212}, {'$pull':{likes: {event_id: 4578541212}}})
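As an in-memory analogy only (plain Java collections, not driver code), the sketch below contrasts the two updates above: $unset with the positional operator leaves the first matching element in the array but without its event_id, while $pull removes every matching element. The class and method names are made up for illustration. With the current Java driver the $pull update would presumably be built with the Updates.pull and Filters.eq helpers from com.mongodb.client.model, which is not shown or tested here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: mimics $unset and $pull on an in-memory likes array.
class LikesUpdate {
    // Builds a mutable copy of the likes array from the sample document.
    public static List<Map<String, Object>> sampleLikes() {
        List<Map<String, Object>> likes = new ArrayList<>();
        for (long id : new long[] {4578541212L, 4578541213L}) {
            Map<String, Object> like = new HashMap<>();
            like.put("event_id", id);
            likes.add(like);
        }
        return likes;
    }

    // Analogue of $unset on 'likes.$.event_id': the first matching element
    // stays in the array but loses its event_id field.
    public static void unsetEventId(List<Map<String, Object>> likes, long eventId) {
        for (Map<String, Object> like : likes) {
            if (Long.valueOf(eventId).equals(like.get("event_id"))) {
                like.remove("event_id");
                return; // the positional operator only touches the first match
            }
        }
    }

    // Analogue of $pull: every element with the given event_id is removed.
    public static void pullEvent(List<Map<String, Object>> likes, long eventId) {
        likes.removeIf(like -> Long.valueOf(eventId).equals(like.get("event_id")));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> afterUnset = sampleLikes();
        unsetEventId(afterUnset, 4578541212L);
        System.out.println("after $unset analogue: " + afterUnset);

        List<Map<String, Object>> afterPull = sampleLikes();
        pullEvent(afterPull, 4578541212L);
        System.out.println("after $pull analogue: " + afterPull);
    }
}
```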