Can somebody guide me on how to write the following Mongo aggregation query in Java? I am trying to sum the distance covered by a vehicle each day. There are some duplicate records (which I cannot eliminate), so I have to use a group-by to filter them out.
db.collection1.aggregate({ $match: { "vehicleId": "ABCDEFGH", $and: [{ "timestamp": { $gt: ISODate("2022-08-24T00:00:00.000+0000") } }, { "timestamp": { $lt: ISODate("2022-08-25T00:00:00.000+0000") } }, { "distanceMiles": { "$gt": 0 } }] } }, { $group: {"_id": {vehicleId: "$vehicleId", "distanceMiles" : "$distanceMiles" } } }, { $group: { _id: null, distance: { $sum: "$_id.distanceMiles" } } })
If possible, can you also suggest some references? I am stuck on the last group stage, the one involving the $_id part.
The Java code I have so far, without the last group stage, is:
Criteria criteria = new Criteria();
criteria.andOperator(Criteria.where("timestamp").gte(start).lte(end),
Criteria.where("vehicleId").in(vehicleIdList));
Aggregation aggregation = Aggregation.newAggregation(Aggregation.match(criteria),
Aggregation.sort(Direction.DESC, "timestamp"),
Aggregation.project("distanceMiles", "vehicleId", "timestamp").and("timestamp")
.dateAsFormattedString("%Y-%m-%d").as("yearMonthDay"),
Aggregation.group("vehicleId", "yearMonthDay").first("vehicleId").as("vehicleId").
first("timestamp").as("lastReported").sum("distanceMiles").as("distanceMiles"));
Note: there is a slight difference in the date parameter between the raw Mongo query and the Java query.
Generally if you are looking for advice on how to directly convert an aggregation pipeline into Java code (not necessarily using the builders), check out this answer.
I'm not really clear on what component you're currently stuck on though. Is it just the direct translation between the aggregation pipeline and the Java code? Is the aggregation pipeline not giving correct results? You haven't mentioned some information such as driver version that would help us advise further if needed.
A few other general things come to mind that might be worth mentioning:
The sample .aggregate() snippet you provided does not have the square brackets ([ and ]) wrapping the pipeline which would be needed in the shell.
When referencing existing field names, you probably need to prefix them with $ in the Java code similar to how you do in the shell.
You should be able to access the values nested inside of the _id field after the first $group stage using dot notation (eg "$_id.distanceMiles") as you are in the sample aggregation.
Depending on which specific driver you are using, documentation such as this may be helpful with respect to working with the builders.
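To make the shape of the two $group stages concrete, here is a minimal in-memory sketch in plain Java. It is not driver or Spring Data code: the DedupDistance class and the Reading record are hypothetical names used only for illustration, and the sketch assumes each document has already been reduced to its vehicleId and distanceMiles.

```java
import java.util.List;

final class DedupDistance {
    // Hypothetical stand-in for a document in collection1 (illustration only).
    record Reading(String vehicleId, double distanceMiles) {}

    // Stage 1 ($group on vehicleId + distanceMiles) collapses duplicates;
    // stage 2 ($group with _id: null) sums the surviving distanceMiles values.
    static double totalDistance(List<Reading> readings) {
        return readings.stream()
                .filter(r -> r.distanceMiles() > 0)    // mirrors distanceMiles: { $gt: 0 }
                .distinct()                             // records compare by (vehicleId, distanceMiles)
                .mapToDouble(Reading::distanceMiles)    // plays the role of "$_id.distanceMiles"
                .sum();
    }
}
```

Note that, like the shell pipeline, this also collapses two genuine readings that happen to have the same distanceMiles for the same vehicle.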
I am using a "wildcard text index" in order to search for a pattern in every field of my class. I am also using a projection in order to remove a certain field:
@Query(value = "{$text: { $search: ?0 }}", fields = "{'notWantedField':0}")
However, I would like to prevent matches on the unwanted field.
In other words, I would like first to project (and remove fields), then search on the remaining fields.
Is there a way to combine projection and search while keeping the wildcard search?
Thanks a lot.
I am using spring-data-mongodb 1.10.8
A possible solution could be an $and operator combined with a $regex.
For example, following the MongoDB documentation (https://docs.mongodb.com/manual/reference/operator/query/text), suppose you create a text index combining subject and author (db.articles.createIndex({"author": "text", "subject": "text"})); you can then exclude the author field with this query:
db.articles.find( {$and: [{ $text: { $search: "coffee" } }, {"author": {'$regex' : '^((?!coffee).)*$', '$options' : 'i'}}]}, {"author": 0})
In your case, since your index is a wildcard index, you must use the regex to exclude every field that is also removed by the projection.
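If it helps to see what that negative-lookahead regex accepts and rejects, here is a small sketch using java.util.regex. The ExcludeTermRegex class and its method names are made up for this example; it only tests the pattern locally and does not talk to MongoDB.

```java
import java.util.regex.Pattern;

final class ExcludeTermRegex {
    // Builds the "value does not contain term" pattern, case-insensitively.
    // Pattern.quote keeps regex metacharacters in the term literal.
    static Pattern notContaining(String term) {
        return Pattern.compile("^((?!" + Pattern.quote(term) + ").)*$",
                Pattern.CASE_INSENSITIVE);
    }

    // True when fieldValue would survive the exclusion, i.e. the term is absent.
    static boolean passes(String fieldValue, String term) {
        return notContaining(term).matcher(fieldValue).matches();
    }
}
```

For example, passes("Mark Twain", "coffee") is true, while passes("Coffee Lover", "coffee") is false, so a document whose author contains the search term is filtered out.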
I am using Elasticsearch for filtering and searching JSON data, and I am a newbie with this technology, so I am a little confused about how to write a LIKE query in Elasticsearch.
select * from table_name where field_name like 'a%'
This is the MySQL query. How do I write it in Elasticsearch? I am using Elasticsearch version 0.90.7.
I would highly suggest updating your Elasticsearch version if possible; there have been significant changes since 0.90.x.
This question is not quite specific enough, as there are many ways Elasticsearch can provide this functionality, and they differ slightly depending on your overall goal. If you want to replicate that SQL query exactly, use the wildcard query or the prefix query.
Using a wildcard query:
Note: Be careful with wildcard searches; they are slow. Avoid using wildcards at the beginning of your strings.
GET /my_index/table_name/_search
{
"query": {
"wildcard": {
"field_name": "a*"
}
}
}
Or using a prefix query:
GET /my_index/table_name/_search
{
"query": {
"prefix": {
"field_name": "a"
}
}
}
Or partial matching:
Note: Do NOT blindly use partial matching. While there are corner cases for its use, correct use of analyzers is almost always better.
Also, match_phrase matches whole analyzed terms, so this exact query behaves like LIKE '%a%' only when a appears as a separate word; again, this is usually better handled with a correct mapping and a normal search query!
GET /my_index/table_name/_search
{
"query": {
"match_phrase": {
"field_name": "a"
}
}
}
If you are reading this because you want search-as-you-type behavior from ES, I would suggest reading up on edge n-grams, which come down to setting up the mapping properly for what you are trying to do =)
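For the wildcard approach above, the translation from a SQL LIKE pattern to the wildcard syntax can be sketched in a few lines of Java. LikeToWildcard is a hypothetical helper, not part of any Elasticsearch API. It assumes the LIKE pattern uses no SQL escape sequences, and the backslash-escaping of literal * and ? is an assumption about the underlying Lucene wildcard syntax that you should check against your version.

```java
final class LikeToWildcard {
    // % -> * (any run of characters), _ -> ? (exactly one character).
    // Literal *, ? and \ in the input are backslash-escaped (assumed escape syntax).
    static String convert(String like) {
        StringBuilder out = new StringBuilder(like.length());
        for (char c : like.toCharArray()) {
            switch (c) {
                case '%': out.append('*'); break;
                case '_': out.append('?'); break;
                case '*':
                case '?':
                case '\\': out.append('\\').append(c); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this, convert("a%") yields a*, which is the value used in the wildcard query shown earlier. Keep in mind that a pattern starting with % turns into a leading wildcard, which is the slow case mentioned in the note.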
GET /indexName/table_name/_search
{
"query": {
"match_phrase": {
"field_name": "your partial text"
}
}
}
You can use "type" : "phrase_prefix" to treat the last term of your search as a prefix.
Java code for the same:
AndFilterBuilder andFilterBuilder = FilterBuilders.andFilter();
andFilterBuilder.add(FilterBuilders.queryFilter(QueryBuilders.matchPhraseQuery("field_name",
"your partial text")));
I used an 'and' filter in the example so that you can append extra filters if you want to.
Check this for more detail:
https://www.elastic.co/guide/en/elasticsearch/guide/current/slop.html
The query I wrote below is roughly equivalent to:
SELECT * FROM TABLE WHERE api='payment' AND api_v='v1' AND status='200' AND response LIKE '%expired%' AND response LIKE '%token%'
Please note table = document here
GET/POST both accepted
GET /transactions-d-2021.06.24/_search
{
"query":{
"bool":{
"must":[
{
"match":{
"api":"payment"
}
},
{
"match":{
"api_v":"v1"
}
},
{
"match":{
"status":"200"
}
},
{
"wildcard":{
"response":"*expired*"
}
},
{
"wildcard":{
"response":"*token*"
}
}
]
}
}
}
Writing a custom bool query worked for me:
@Query("{\"bool\":{\"should\":[{\"query_string\":{\"fields\":[\"field_name\"],\"query\":\"?0*\"}}]}}")
I googled and read the official MongoDB documentation (http://docs.mongodb.org/manual/core/index-intersection/), but didn't find any tutorial or guidance on the syntax for a query that uses index intersection.
Does MongoDB automatically apply index intersection when the query involves 2 fields that each have their own single-field index? I don't think so.
Here is what cursor.explain() shows when I run a query between 2 dates and for a given "name" ("name" is a field; both date and name are indexed):
{
"cursor": "BtreeCursor Name_1",
"isMultiKey": false,
"n": 99330,
"nscannedObjects": 337500,
"nscanned": 337500,
"nscannedObjectsAllPlans": 337601,
"nscannedAllPlans": 337705,
"scanAndOrder": false,
"indexOnly": false,
"nYields": 18451,
"nChunkSkips":
"millis": 15430,
"indexBounds": {
"Name": [
[
"blabla",
"blabla"
]
]
},
"allPlans": [
{
"cursor": "BtreeCursor Name_1",
"isMultiKey": false,
"n": 99330,
"nscannedObjects": 337500,
"nscanned": 337500,
"scanAndOrder": false,
"indexOnly": false,
"nChunkSkips": 0,
"indexBounds": {
"Name": [
[
"blabla",
"blabla"
]
]
}
},
{
"cursor": "BtreeCursor Date_1",
"isMultiKey": false,
"n": 0,
"nscannedObjects": 101,
"nscanned": 102,
"scanAndOrder": false,
"indexOnly": false,
"nChunkSkips": 0,
"indexBounds": {
"Date": [
[
"2014-08-23 10:28:50.221",
"2014-08-23 13:28:50.221"
]
]
}
},
{
"cursor": "Complex Plan",
"n": 0,
"nscannedObjects": 0,
"nscanned": 103,
"nChunkSkips": 0
}
The complex plan shows nothing, and the elapsed time is 16s. If I query only by name, without the date, it takes only 0.9s.
I want to learn how to write a query that uses index intersection with the MongoDB Java driver, something like hint() in the mongo shell. Any example or tutorial link is welcome.
I know how to write basic queries with the MongoDB Java driver, so you can just post the essential code example if it saves you time.
Thanks in advance.
After reading these links: http://docs.mongodb.org/manual/core/query-plans/#index-filters
https://jira.mongodb.org/browse/SERVER-3071
I conclude that, for now, there is no way to force a query to use index intersection.
In fact, when several candidate indexes are possible for a query, MongoDB runs them in parallel and waits for one index to "win the match". The winning index is the one that completes the whole query first or first returns a threshold number of matching results. MongoDB then uses this index for the query.
If your queries vary a lot and you cannot build many compound indexes, you are out of luck: you can only trust MongoDB's test.
Sometimes one index is more selective than another, but that does not mean it returns the result more quickly. In my case, the "name" index is more selective and may fetch fewer documents, but it requires a date comparison to determine whether each fetched document matches the whole query. The "date" index, on the other hand, fetches more documents from disk but only needs a simple equality test on the "name" field to determine whether a document matches. That is possibly why it can win the test.
As for index intersection, it was never used in any of my query tests. I doubt it is useful and hope MongoDB improves its performance in a future version.
If my conclusion is wrong, please point it out. I am still learning MongoDB :)
Your question, "Does mongodb apply automatically index intersection when the query involves 2 fields which are separately indexed by a single index?", has been answered here: MongoDB index intersection
You can't force MongoDB to apply index intersection; rather, you can modify your queries so that the MongoDB query optimizer is able to apply the index intersection strategy to them.
To learn how your query parameters affect index use, see this link, though it is about compound indexes:
http://java.dzone.com/articles/optimizing-mongodb-compound
The Java API also provides two methods for using hint() with the find() operation:
MongoDB Java API
public DBCursor hint(String indexName)
public DBCursor hint(DBObject indexKeys)
Informs the database of indexed fields of the collection in order to
improve performance.
which can be used as below:
DBCursor cursor = collection.find( query ).hint(indexName);
I am trying to query alphanumeric values from the index using a terms query, but it is not returning any results.
Query:
{
"size" : 10000,
"query" : {
"bool" : {
"must" : {
"terms" : {
"caid" : [ "A100945","A100896" ]
}
}
}
},
"fields" : [ "acco", "bOS", "aid", "TTl", "caid" ]
}
I want to get all the entries that have caid A100945 or A100896.
The same query works fine for numeric fields.
I am not planning to use a query string or match query, as I am trying to build a general query builder that can build the query for every request. Hence I am looking to get the entries using the terms query only.
Note: I am using the Java API org.elasticsearch.index.query.QueryBuilders for building the query.
e.g.: QueryBuilders.termsQuery("caid", "A10xxx", "A101xxx")
Please help.
Regards,
Mik
If you have not customized the mappings/analysis for the caid-field, then your values are indexed as e.g. a100945, a100896 (note the lowercasing.)
The terms-query does not do query-time text-analysis, so you'll be searching for A100945 which does not match a100945.
This is quite a common problem, and is explained a bit more in this article on Troubleshooting Elasticsearch searches, for Beginners.
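If you want to stay with the terms query, one way to work around this is to normalize the values yourself before building the query. The sketch below is plain Java with a hypothetical TermNormalizer helper. It assumes the field uses the default analyzer, which lowercases, and that each ID is indexed as a single token; with a custom analyzer the normalization would have to match whatever that analyzer does.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class TermNormalizer {
    // Lowercases each value so it lines up with how the default analyzer
    // indexed it (assumption: single-token, lowercased terms).
    static List<String> lowercaseAll(List<String> terms) {
        return terms.stream()
                .map(t -> t.toLowerCase(Locale.ROOT))   // Locale.ROOT avoids locale-specific surprises
                .collect(Collectors.toList());
    }
}
```

The resulting list (here a100945 and a100896) is what you would then pass as the values of the terms query on caid.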
You are better off using a match query. Match queries are analyzed (the default analyzer is applied to the query text), like:
QueryBuilders.matchQuery("caid", "A10xxx A101xxx");
I have the following document in my collection:
{
"_id":NumberLong(106379),
"_class":"x.y.z.SomeObject",
"name":"Some Name",
"information":{
"hotelId":NumberLong(106379),
"names":[
{
"localeStr":"en_US",
"name":"some Other Name"
}
],
"address":{
"address1":"5405 Google Avenue",
"city":"Mountain View",
"cityIdInCitiesCodes":"123456",
"stateId":"CA",
"countryId":"US",
"zipCode":"12345"
},
"descriptions":[
{
"localeStr":"en_US",
"description": "Some Description"
}
]
},
"providers":[
],
"some other set":{
"a":"bla bla bla",
"b":"bla,bla bla"
},
"another Property":"fdfdfdfdfdf"
}
I need to go through all documents in the collection, and if "providers": [] is empty I need to create a new set based on values from the information section.
I'm far from being a MongoDB expert, so I have a few questions:
Can I do it as atomic operation?
Can I do this using MongoDB console? As far as I understand, I can do it using the $addToSet and $each commands?
If not is there any Java based driver that can provide such functionality?
Can I do it as atomic operation?
Every document will be updated atomically. There is no "atomic" in MongoDB in the RDBMS sense, where either all operations succeed or all fail, but you can prevent other writes from interleaving by using the $isolated operator.
Can I do this using MongoDB console?
Sure you can. To find all documents with an empty providers array, you can issue a command like:
db.zz.find({providers : { $size : 0}})
To update all documents where the array has zero length with a fixed string, you can issue a query such as:
db.zz.update({providers : { $size : 0}}, {$addToSet : {providers : "zz"}}, {multi : true})
If you want to add a portion to your document based on the document's own data, you can use the notorious $where query (do mind the warnings appearing in that link) or, as you mentioned, query for the empty providers array and use cursor.forEach().
If not is there any Java based driver that can provide such functionality?
Sure, there is a Java driver, as there is for every other major programming language. It can do practically everything described here, and basically everything you can do from the shell. I suggest you get started with the Java Language Center.
There are also several frameworks that facilitate working with MongoDB and bridge the object-document world. I will not give a list here as I'm pretty biased, but I'm sure a quick Google search will do.
db.so.find({ providers: { $size: 0} }).forEach(function(doc) {
doc.providers.push( doc.information.hotelId );
db.so.save(doc);
});
This will push the information.hotelId of the corresponding document into an empty providers array. Replace that with whatever field you would rather insert into the providers array.
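Since the question also asked about Java, here is the same per-document logic as a minimal in-memory sketch. It does not use the MongoDB Java driver: documents are modelled as nested maps, and ProviderBackfill is a hypothetical name chosen for this example. With a real driver you would read the matching documents from a cursor and write each one back.

```java
import java.util.List;
import java.util.Map;

final class ProviderBackfill {
    // Adds information.hotelId to every document whose providers array is empty.
    // Returns how many documents were changed.
    @SuppressWarnings("unchecked")
    static int backfill(List<Map<String, Object>> docs) {
        int updated = 0;
        for (Map<String, Object> doc : docs) {
            List<Object> providers = (List<Object>) doc.get("providers");
            Map<String, Object> info = (Map<String, Object>) doc.get("information");
            // Same condition as { providers: { $size: 0 } }: the array exists and is empty.
            if (providers != null && providers.isEmpty() && info != null) {
                providers.add(info.get("hotelId"));
                updated++;
            }
        }
        return updated;
    }
}
```

Documents that have no providers field at all are left alone, which matches how $size: 0 behaves in the shell version.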