Need help understanding mongo query - java

Can anyone explain this MongoDB query to me?
@Query("{'$or':[{owner:?0, companyId:?1, status:?2},{companyId:?1, membersDetails:{'$elemMatch':{memberId:?0, status:?2}}}]}")

This is a Spring Data annotation that will be attached to a method signature something like this:
@Query("{'$or':[{owner:?0, companyId:?1, status:?2},{companyId:?1, membersDetails:{'$elemMatch':{memberId:?0, status:?2}}}]}")
List<MyClass> findByClassThings(String owner, String companyId, String status);
The actual method name and parameter types will be whatever your repository declares.
The query would act on data in storage something like this:
{
"owner": "A",
"companyId": "B",
"status": "C",
"membersDetails": [
{ "memberId": "B", "status": "X" },
{ "memberId": "C", "status": "C" }
]
},
{
"owner": "B",
"companyId": "B",
"status": "C",
"membersDetails": [
{ "memberId": "A", "status": "C" }
]
}
So when that method was called with something like this:
List<MyClass> results = MyClass.findByClassThings("A","B","C");
It would match both of those documents for the following reasons:
The first document matches because its "owner", "companyId" and "status" fields match all of the supplied parameters, as specified in the first element of the $or expression:
{owner:"A", companyId:"B", status:"C"}
The second document matches the second element of the $or condition: "companyId" matches at the top level of the document, and a single array element in "membersDetails" has both a "memberId" equal to the first parameter and a "status" equal to the third:
{companyId:"B", membersDetails:{'$elemMatch':{memberId:"A", status:"C"}}}
In the latter case, $elemMatch requires that both conditions are met by the same array element, not by two different elements.
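To make the matching rules concrete, here is a minimal plain-Java sketch that models the two $or branches over in-memory maps. It is only an illustration of the semantics described above, not how MongoDB or Spring Data evaluate the query; the class and helper names (OrElemMatchSketch, doc, member, matches) are made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrElemMatchSketch {

    // Builds one membersDetails array element.
    static Map<String, String> member(String memberId, String status) {
        Map<String, String> m = new HashMap<>();
        m.put("memberId", memberId);
        m.put("status", status);
        return m;
    }

    // Builds one top-level document.
    static Map<String, Object> doc(String owner, String companyId, String status,
                                   List<Map<String, String>> membersDetails) {
        Map<String, Object> d = new HashMap<>();
        d.put("owner", owner);
        d.put("companyId", companyId);
        d.put("status", status);
        d.put("membersDetails", membersDetails);
        return d;
    }

    // Models the query with p0 = ?0, p1 = ?1, p2 = ?2.
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> d, String p0, String p1, String p2) {
        // First $or branch: all three top-level fields must match.
        boolean first = p0.equals(d.get("owner"))
                && p1.equals(d.get("companyId"))
                && p2.equals(d.get("status"));

        // Second $or branch: companyId matches and ONE array element
        // satisfies both memberId and status (the $elemMatch part).
        List<Map<String, String>> members =
                (List<Map<String, String>>) d.get("membersDetails");
        boolean second = p1.equals(d.get("companyId"))
                && members.stream().anyMatch(m ->
                        p0.equals(m.get("memberId")) && p2.equals(m.get("status")));

        return first || second;
    }

    public static void main(String[] args) {
        Map<String, Object> doc1 = doc("A", "B", "C",
                Arrays.asList(member("B", "X"), member("C", "C")));
        Map<String, Object> doc2 = doc("B", "B", "C",
                Arrays.asList(member("A", "C")));
        System.out.println(matches(doc1, "A", "B", "C")); // matched by the first branch
        System.out.println(matches(doc2, "A", "B", "C")); // matched by the second branch
    }
}
```

A document whose array holds the right "memberId" in one element and the right "status" in another would not match the second branch, which is the point of $elemMatch.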

Try this
db.collection.find({$or: [{'owner': ?0, 'companyId': ?1, status:?2}, {'companyId': ?1, 'membersDetails': {'$elemMatch':{memberId:?0, status:?2}}}]});
Thanks

Related

Return 1st object from the array matching any value from input array using Java 8

I have a list of values 1,2,3,4. It is up to me how I structure these values, be it an Array, an ArrayList, etc.
Then, I have the response coming from a REST call which may or may not contain these values. My objective is to return the 1st object from the response whose field4 contains one of these values. The structure of the response will be like below. In this case, I would like to return the 2nd object from the array, since 4 is the 1st match with the given input.
{
"field1": "",
"field2": "null",
"responseArray": [
{
"field3": "abc",
"field4": "8",
"field5": "def"
},
{
"field3": "abc",
"field4": "4",
"field5": "def"
},
{
"field3": "abc",
"field4": "1",
"field5": "def"
}
]
}
I understand I can use a brute-force method where I traverse each object of the response, match field4 against the given input values and, once a match is found, exit the loop so as to skip the rest. But is there a more effective way to do this, especially with features from Java 8?
If you use a Set for your values, then something like this would work:
(assuming your responseArray is an array of objects that has a getField4 method)
Arrays.stream(responseArray).filter(e -> values.contains(e.getField4())).findFirst();
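A self-contained sketch of the stream approach, for reference. Note that Stream.findFirst() takes no arguments, so the predicate goes into filter() first. The Item class is a hypothetical stand-in for the response objects, and the values are kept as strings because field4 is a string in the sample response.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

class FirstMatchSketch {

    // Hypothetical stand-in for one element of responseArray.
    static class Item {
        final String field3;
        final String field4;
        final String field5;

        Item(String field3, String field4, String field5) {
            this.field3 = field3;
            this.field4 = field4;
            this.field5 = field5;
        }

        String getField4() {
            return field4;
        }
    }

    // Returns the first element whose field4 is one of the wanted values.
    // The stream is lazy, so later elements are not inspected after a hit.
    static Optional<Item> firstMatch(Item[] responseArray, Set<String> values) {
        return Arrays.stream(responseArray)
                .filter(e -> values.contains(e.getField4()))
                .findFirst();
    }

    public static void main(String[] args) {
        Set<String> values = new HashSet<>(Arrays.asList("1", "2", "3", "4"));
        Item[] responseArray = {
            new Item("abc", "8", "def"),
            new Item("abc", "4", "def"),
            new Item("abc", "1", "def")
        };
        firstMatch(responseArray, values)
                .ifPresent(item -> System.out.println(item.getField4())); // prints 4
    }
}
```

The result is an Optional, so the "no match" case has to be handled explicitly by the caller.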

How to sort and filter a field which is in other indices in elasticsearch?

For example, I have an index named 'student':
{
"id": "100",
"name": "Frank"
}
Then there is another index named 'grade':
{
"id": "1",
"score": 95,
"studentId": "100"
}
How can I use one query to get a page of students sorted by score?
Can I use a join query to search these two indices, as in MySQL?
This is what I want to get:
{
"id": "100",
"name": "Frank",
"score": "95"
},
{
...
}
Unfortunately, no. You need to denormalize your data, since joins are not possible in ES. More at elastic.co/guide/en/elasticsearch/guide/current/relations.html
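One way to read this advice is to copy the score into each student document before indexing, so that a single index can be sorted and paged. The sketch below does that merge in plain Java over in-memory maps. It assumes at most one grade per student, and the class and helper names are invented for illustration; the indexing call itself is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DenormalizeSketch {

    static Map<String, Object> student(String id, String name) {
        Map<String, Object> s = new HashMap<>();
        s.put("id", id);
        s.put("name", name);
        return s;
    }

    static Map<String, Object> grade(String id, int score, String studentId) {
        Map<String, Object> g = new HashMap<>();
        g.put("id", id);
        g.put("score", score);
        g.put("studentId", studentId);
        return g;
    }

    // Returns copies of the student documents with a "score" field added.
    // Students without a grade get a null score (assumption: one grade each).
    static List<Map<String, Object>> denormalize(List<Map<String, Object>> students,
                                                 List<Map<String, Object>> grades) {
        Map<Object, Object> scoreByStudentId = new HashMap<>();
        for (Map<String, Object> g : grades) {
            scoreByStudentId.put(g.get("studentId"), g.get("score"));
        }
        List<Map<String, Object>> merged = new ArrayList<>();
        for (Map<String, Object> s : students) {
            Map<String, Object> copy = new HashMap<>(s);
            copy.put("score", scoreByStudentId.get(s.get("id")));
            merged.add(copy);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> students = new ArrayList<>();
        students.add(student("100", "Frank"));
        List<Map<String, Object>> grades = new ArrayList<>();
        grades.add(grade("1", 95, "100"));
        System.out.println(denormalize(students, grades).get(0).get("score")); // prints 95
    }
}
```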

Elasticsearch match complete array of terms

I need to match a complete array of terms with Elasticsearch.
Only documents that have an array with the same elements should be returned.
There should be neither more elements nor a subset of elements in the document's array.
The order of elements does not matter.
Example:
filter:
id: ["a", "b"]
documents:
id: ["a", "b"] -> match
id: ["b", "a"] -> match
id: ["a"] -> no match
id: ["a", "b", "c"] -> no match
Eventually I want to use the Java High Level REST Client to implement the query, though an example in the Elasticsearch DSL will do as well.
I'd like to propose something that will save you from maintaining a long chain of "must" conditions when your requirements change (e.g., imagine you have an array of six items to match). I'm going to rely on a script query, which might look over-engineered, but it is easy to create a search template out of it (https://www.elastic.co/guide/en/elasticsearch/reference/7.5/search-template.html).
{
"query": {
"bool": {
"filter": {
"script": {
"script": {
"source": """
def ids = new ArrayList(doc['id.keyword']);
def param = new ArrayList(params.terms);
def isSameSize = ids.size() == param.size();
def isSameContent = ids.containsAll(param);
return isSameSize && isSameContent
""",
"lang": "painless",
"params": {
"terms": [ "a", "b" ]
}
}
}
}
}
}
}
This way, the only thing that you will need to change is the value of the terms parameter.
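The check inside the script (same size, and the document's values contain all of the supplied terms) can be tried out in plain Java. This is only a sketch of the comparison logic, not of the Elasticsearch client API; the class and method names are made up.

```java
import java.util.Arrays;
import java.util.Collection;

class ExactTermsSketch {

    // Same test as the script: equal size and every term present.
    // Caveat: this assumes "terms" has no duplicates, e.g. terms ["a", "a"]
    // would wrongly be accepted for ids ["a", "b"].
    static boolean sameElements(Collection<String> ids, Collection<String> terms) {
        return ids.size() == terms.size() && ids.containsAll(terms);
    }

    public static void main(String[] args) {
        Collection<String> terms = Arrays.asList("a", "b");
        System.out.println(sameElements(Arrays.asList("a", "b"), terms));      // true
        System.out.println(sameElements(Arrays.asList("b", "a"), terms));      // true
        System.out.println(sameElements(Arrays.asList("a"), terms));           // false
        System.out.println(sameElements(Arrays.asList("a", "b", "c"), terms)); // false
    }
}
```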
While this does not seem to be supported natively you could go ahead and use a script filter to achieve this behavior like so:
GET your_index/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": "doc['tags'].values.length == 2"
}
},
{
"term": {
"tags": {
"value": "a"
}
}
},
{
"term": {
"tags": {
"value": "b"
}
}
}
]
}
}
}
The script filter limits the search result by the array size while the term filters specify the values of that array. Make sure to enable fielddata on the tags field in order to execute scripts on it.

Query a list in mongo

I have a mongo collection 'Student' with below documents
{
"_id" : ObjectId("5ccc2cded71acf061de1c2d8"),
"studentId" : "123",
"name" : "1",
"age" : NumberLong(0),
"section" : "A",
"state" : "State1",
"city" : "City1"
}
I have 100 documents with the above structure. Now I have a list with the structure below:
[{
"studentId": "123",
"state": "state1"
},
{
"studentId": "456",
"state": "state2"
}]
Is there any way in Mongo to get the documents matching this list data in a single DB call? Iterating over the list with criteria such as studentId:123 and state:state1 will work, but how can I get the data for the whole list without iterating in Java?
All you need is a simple find query:
db.collection.find({$or: arr});
where arr is the sample array you showed.
Note that Mongo string matches are case-sensitive, so with the sample array you gave no matches will be found, since "state1" is not equal to "State1".
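As a rough model of what $or over that list does, the plain-Java sketch below treats each list entry as a set of field equalities and accepts a document if any entry matches in full. It also shows the case-sensitivity problem. It does not use the MongoDB driver, and the names are invented for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrListSketch {

    static Map<String, String> entry(String studentId, String state) {
        Map<String, String> m = new HashMap<>();
        m.put("studentId", studentId);
        m.put("state", state);
        return m;
    }

    // A document matches if ANY criteria entry matches on ALL of its fields.
    // String.equals is case-sensitive, like the exact match in the query.
    static boolean matchesAny(Map<String, String> doc, List<Map<String, String>> criteria) {
        return criteria.stream().anyMatch(c ->
                c.entrySet().stream().allMatch(e ->
                        e.getValue().equals(doc.get(e.getKey()))));
    }

    public static void main(String[] args) {
        Map<String, String> stored = entry("123", "State1");
        List<Map<String, String>> asPosted =
                Arrays.asList(entry("123", "state1"), entry("456", "state2"));
        List<Map<String, String>> corrected =
                Arrays.asList(entry("123", "State1"), entry("456", "State2"));
        System.out.println(matchesAny(stored, asPosted));  // false: "state1" != "State1"
        System.out.println(matchesAny(stored, corrected)); // true
    }
}
```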

How do I check if a MongoDB object exists and create/update respectively?

I am developing a wireless network survey tool built with Java (Swing GUI) and a MongoDB data storage solution. I am new to MongoDB and hardly a Java guru so I need some help. I want to find if a network exists in my database and append heard points to the network document. If the network doesn't exist, I would like to create a document for that network and add the heard points. I have been trying to fix this for days but I just can't seem to wrap my head around the solution. Also, it would be nice if the BSSID was the unique id so I don't get any duplicate networks. My ideal data structure would look something like this:
{ 'bssid' : 'ca:fe:de:ad:be:ef',
'channel' : 6,
'heardpoints' : {
'point' : { 'lat' : 36.12345, 'long' : -75.234564 },
'point' : { 'lat' : 36.34567, 'long' : -75.345678 }
}
}
This is what I have tried so far. It seems to add the initial point but it does not add additional points after the first one was made.
BasicDBObject query = new BasicDBObject();
query.put("bssid", pkt[1]);
DBCursor cursor = coll.find(query);
if (!cursor.hasNext()) {
// Document doesnt exist so create one
BasicDBObject document = new BasicDBObject();
document.put("bssid", pkt[1]);
BasicDBObject heardpoints = new BasicDBObject();
BasicDBObject point = new BasicDBObject();
point.put("lat", latitude);
point.put("long", longitude);
heardpoints.put("point", point);
document.put("heardpoints", heardpoints);
coll.insert(document);
} else {
// Document exists so we will update here
DBObject network = cursor.next();
BasicDBObject heardpoints = new BasicDBObject();
BasicDBObject point = new BasicDBObject();
point.put("lat", latitude);
point.put("long", longitude);
heardpoints.put("point", point);
network.put("heardpoints", heardpoints);
coll.save(network);
}
I feel like I am way off the reservation on this one. Any support would help, thanks a lot!
UPDATE
I am using the upsert suggestion but still having some issue. No doubt this will work for me, I am just not doing it correctly. I am still not getting any new points past the first one added.
BasicDBObject query = new BasicDBObject("bssid", pkt[1]);
System.out.println(query);
DBCursor cursor = coll.find(query);
System.out.println(cursor);
try {
DBObject network = cursor.next();
System.out.println(network);
network.put("heardpoints", new BasicDBObject("point",
new BasicDBObject("lat", latitude)
.append("long", longitude)));
coll.update(query, network, true, false);
} catch (NoSuchElementException ex) {
System.err.println("mongo error");
} finally {
cursor.close();
}
You've got two ways to address this really, it just depends on how you actually want to use the data. In either case the first thing to address is your "ideal data structure", and mostly because it is invalid. This is the wrong part:
'heardpoints' : {
'point' : { 'lat' : 36.12345, 'long' : -75.234564 },
'point' : { 'lat' : 36.34567, 'long' : -75.345678 }
}
So this "hash/map" is invalid because you have the same "key" named twice. You cannot do that, and you probably want an "array" instead, as well as something that you have a hope of using GeoSpatial queries on later when you want to:
Array Approach
"heardpoints": [
{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
},
"time": ISODate("2014-11-04T21:09:18.437Z")
},
{
"geometry": {
"type": "Point",
"coordinates": [ -75.345678, 36.34567 ]
},
"time": ISODate("2014-11-04T21:10:28.919Z")
}
]
Note the ordering: longitude ("lon") comes first and latitude ("lat") second, which is how MongoDB and the GeoJSON spec it follows store coordinates.
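Since the question's code works with separate latitude and longitude variables, a small helper that fixes the order in one place can avoid swapped coordinates. This is a plain-Java sketch building a nested map in the GeoJSON shape; the helper name is hypothetical and no driver types are used.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class GeoJsonPointSketch {

    // Takes latitude/longitude in the order the question uses them and
    // emits GeoJSON order: [longitude, latitude].
    static Map<String, Object> point(double latitude, double longitude) {
        Map<String, Object> geometry = new LinkedHashMap<>();
        geometry.put("type", "Point");
        geometry.put("coordinates", Arrays.asList(longitude, latitude));
        return geometry;
    }

    public static void main(String[] args) {
        // prints {type=Point, coordinates=[-75.234564, 36.12345]}
        System.out.println(point(36.12345, -75.234564));
    }
}
```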
Now this is for the form where you are going to keep all of your heard-point data in a "single document" per "bssid" value, with each location kept in an array. Note that this is not really an "upsert" per se, except in the first creation instance. The main intent is to "update" the document with the same "bssid" value. Just in shell form for now, with a Java syntax translation later:
db.collection.update(
{ "bssid": "ca:fe:de:ad:be:ef" },
{
"$setOnInsert": { "channel": 6 },
"$push": {
"heardpoints": {
"$each": [{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
},
"time": ISODate("2014-11-04T21:09:18.437Z")
}],
"$sort": { "time": -1 },
"$slice": 20
}
}
},
{ "upsert": true }
);
Whatever the language and API representation, there are basically two parts to a MongoDB update operation. Essentially this:
[ < Query >, < Update > ]
Depending on the API presentation there are technically "three" parts, where the third is Options. But for the basic consideration of the "upsert" option, it is important to understand how both the Query and Update document portions are handled in an update operation.
The most important thing to know about the Update document is that it has two forms. If you just supply "keys" and "values" in a standard object form then whatever is supplied will "overwrite" any existing content in a matched document. The other form (which will be used in all examples) is to use "update operators" which allow "parts" of the document to be modified or "augmented". That is an important distinction. But on with the examples.
On a blank collection or at least one where the specified "bssid" value does not exist, then a new document would be created containing that "bssid" field value. Additionally there is some other behavior that is going to happen.
There is a special "update operator" in here called $setOnInsert. Just like the conditions specified in the Query portion of the statement, any fields and values mentioned here are only "created" in the document when a "new" document is inserted. So if the document matching the query condition was found then none of the operations here are actually performed to change the found document. This is a good place to set initial values and also limit the write activity on the document to just the fields where it is required.
The second section in the Update document is another "update operator" called $push. As expected from the common term in computing languages, this "adds items" to an "array". So on document creation a new array is made, and otherwise the items are appended to the "existing" array content in the found document.
There are some interesting modifiers here which have their own purpose. $each is a modifier that allows more than one item to be sent to an operator like $push at a time. We are only using it for a single item, but its use is generally required with the other two modifiers we are interested in.
The next is $sort, which is applied to the array elements present in the document in order to "sort" them by the condition. In this case there is a "time" field on the array elements, so the "sort" makes sure that as new elements are added the contents of the array are always ordered so that the "newest" entries are at the front of the array.
The final one is $slice, which complements $sort by essentially specifying a "capped amount" for the array. So just to make sure our documents never get too large, the $slice modifier, which is applied "after" the $sort modifier has done its work, "removes" any entries beyond the specified "maximum" and maintains the array at that length. So quite a useful feature.
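The combined effect of $each, $sort and $slice on the array can be modelled in a few lines of plain Java: append, sort newest first, keep the first N. This is an in-memory sketch of the behaviour described above, not driver code; HeardPoint and pushSortSlice are invented names, and time is simplified to epoch milliseconds.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PushSortSliceSketch {

    // Hypothetical array element: coordinates plus a timestamp in millis.
    static class HeardPoint {
        final double lon;
        final double lat;
        final long time;

        HeardPoint(double lon, double lat, long time) {
            this.lon = lon;
            this.lat = lat;
            this.time = time;
        }
    }

    static List<HeardPoint> pushSortSlice(List<HeardPoint> existing, HeardPoint added, int max) {
        List<HeardPoint> result = new ArrayList<>(existing);
        result.add(added);                                        // $push with $each
        result.sort(Comparator.comparingLong((HeardPoint p) -> p.time)
                .reversed());                                     // $sort: { time: -1 }
        return new ArrayList<>(
                result.subList(0, Math.min(max, result.size()))); // $slice: max
    }

    public static void main(String[] args) {
        List<HeardPoint> points = new ArrayList<>();
        for (long t = 1; t <= 25; t++) {
            points = pushSortSlice(points, new HeardPoint(-75.0, 36.0, t), 20);
        }
        System.out.println(points.size());      // 20
        System.out.println(points.get(0).time); // 25, the newest entry
    }
}
```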
Of course if you did not care about a "time" value then there is another way to handle this so that the "coordinate" data is only kept for "unique" combinations. That way is to use the $addToSet operator to manage array or "set" entries by itself:
db.collection.update(
{ "bssid": "ca:fe:de:ad:be:ef" },
{
"$setOnInsert": { "channel": 6 },
"$addToSet": {
"heardpoints": {
"$each": [{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
}
}]
}
}
},
{ "upsert": true }
);
Now that does not actually need the $each modifier, but it's just left there for a future point. $addToSet essentially looks at the existing array content and compares it to the element you have supplied. Where that data does not exactly match something already present in the array, it is added to the "set". Otherwise, nothing happens since the data is already there.
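The "exact match" rule can be modelled in plain Java with a list and whole-element equality. The sketch below is an in-memory illustration only (the names are made up, no driver is involved). It also shows that any extra field on the element, such as a timestamp, makes it a different element as far as the comparison is concerned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AddToSetSketch {

    static Map<String, Object> point(double lon, double lat) {
        Map<String, Object> p = new HashMap<>();
        p.put("coordinates", Arrays.asList(lon, lat));
        return p;
    }

    // Adds the element only if no existing element is equal in ALL fields.
    // Returns true when the array was modified.
    static boolean addToSet(List<Map<String, Object>> array, Map<String, Object> element) {
        if (array.contains(element)) {
            return false;
        }
        array.add(element);
        return true;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> heardpoints = new ArrayList<>();
        System.out.println(addToSet(heardpoints, point(-75.234564, 36.12345))); // true
        System.out.println(addToSet(heardpoints, point(-75.234564, 36.12345))); // false

        // Same coordinates plus a time field: no longer an exact match.
        Map<String, Object> timed = point(-75.234564, 36.12345);
        timed.put("time", 1415135358437L);
        System.out.println(addToSet(heardpoints, timed));                       // true
    }
}
```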
So if you just want the data collected for specific points where they vary then this is a good approach. But there is a "catch", and a couple actually that are worth mentioning.
Suppose you want to keep only 20 entries as was mentioned before. While $addToSet supports the $each modifier, unfortunately the other modifiers such as $slice are not supported. So you can't "maintain a cap" with a single update statement, and you would in fact have to issue "two" update operations in order to achieve this:
db.collection.update(
{ "bssid": "ca:fe:de:ad:be:ef" },
{
"$setOnInsert": { "channel": 6 },
"$addToSet": {
"heardpoints": {
"$each": [{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
}
}]
}
}
},
{ "upsert": true }
);
db.collection.update(
{ "bssid": "ca:fe:de:ad:be:ef" },
{
"$setOnInsert": { "channel": 6 },
"$push": {
"heardpoints": {
"$each": [],
"$slice": 20
}
}
}
)
But even so we have a new problem here. Aside from now needing "two" operations, keeping this cap has another problem, which is that a "set" is "not ordered" in any way. So you can limit the total number of items in the list with the second update, but there is no way to remove the "oldest" item for example.
In order to do this then you want a "time" field for the "last update", but yes there is a catch again. Once you supply a "time" value then the "distinct data" that makes a "set" is no longer true. An $addToSet operation considers the following to be two "different" entries as all fields and not just the "coordinate" data is considered:
"heardpoints": [
{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
},
"time": ISODate("2014-11-04T21:09:18.437Z")
},
{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
},
"time": ISODate("2014-11-04T21:10:28.919Z")
}
]
Where the intent is to just "update the time" on the existing point at the given coordinates, then you need to take a different approach. But again this is two updates and in reverse, you try to update a document first and then do something else if that does not succeed. Meaning the "upsert" attempt is the second operation:
var result = db.collection.update(
{
"bssid": "ca:fe:de:ad:be:ef",
"heardpoints.geometry.coordinates": [-75.234564, 36.12345 ]
},
{
"$set": {
"heardpoints.$.time": ISODate("2014-11-04T21:10:28.919Z")
}
}
);
// If result did not match and modify anything existing then perform the upsert
if ( result.nMatched == 0 ) {
db.collection.update(
{ "bssid": "ca:fe:de:ad:be:ef" }, // just this key and not the array
{
"$setOnInsert": { "channel": 6 },
"$push": {
"heardpoints": {
"$each": [{
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
},
"time": ISODate("2014-11-04T21:09:18.437Z")
}],
"$sort": { "time": -1 },
"$slice": 20
}
}
},
{ "upsert": true }
);
}
So there are two operations, where the first tries to "update" an existing array entry by querying for that position. That first operation cannot be an upsert, since it would create a new document with the same "bssid" whenever the array entry was not found. In any case an upsert is not allowed with the positional $ operator, which relies on the matched position of the found element so that the element can be altered via the $set operator.
In the Java invocation there is a WriteResult type that is returned which can be used like this:
WriteResult writeResult = collection.update(query1, update1, false, false);
if ( writeResult.getN() == 0 ) {
// Upsert would be tried if the array item was not found
writeResult = collection.update(query2, update2, true, false);
}
If nothing was updated then the serialized content looks like this:
{ "serverUsed" : "192.168.2.3:27017" , "ok" : 1 , "n" : 0 , "updatedExisting" : false}
Which means you basically test the n value to see what happened and decide whether to "update" the array item or "push" a new one, depending on whether the query matched that array item or not.
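The control flow of "try the positional update, then fall back to the upsert when n is 0" can be sketched without a database. The class below is an in-memory stand-in, with a map in place of the collection and invented method names; it only illustrates the decision, not the driver calls.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TwoStepUpdateSketch {

    // bssid -> (coordinates -> time): stand-in for one document per bssid.
    final Map<String, Map<List<Double>, Long>> store = new HashMap<>();

    // Step 1: models the positional update; returns the matched count ("n").
    int updateTime(String bssid, List<Double> coords, long time) {
        Map<List<Double>, Long> points = store.get(bssid);
        if (points == null || !points.containsKey(coords)) {
            return 0;
        }
        points.put(coords, time);
        return 1;
    }

    // Step 2: models the upsert that creates the document and pushes the point.
    void upsertPush(String bssid, List<Double> coords, long time) {
        store.computeIfAbsent(bssid, k -> new LinkedHashMap<>()).put(coords, time);
    }

    // Runs step 1 and only falls back to step 2 when nothing matched.
    String record(String bssid, List<Double> coords, long time) {
        if (updateTime(bssid, coords, time) == 0) {
            upsertPush(bssid, coords, time);
            return "pushed";
        }
        return "updated";
    }

    public static void main(String[] args) {
        TwoStepUpdateSketch sketch = new TwoStepUpdateSketch();
        List<Double> coords = Arrays.asList(-75.234564, 36.12345);
        System.out.println(sketch.record("ca:fe:de:ad:be:ef", coords, 1L)); // pushed
        System.out.println(sketch.record("ca:fe:de:ad:be:ef", coords, 2L)); // updated
    }
}
```

Against a real server the two steps are separate round trips, so another writer can slip in between them.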
Document Approach
The general conclusion from the above is that where you want to keep distinct data for the "coordinates" and just modify a "time" entry then the above process can get messy. The operations are not ideally atomic, and though there can be some tuning, it is probably not well suited to high volume updates.
This is a case, then, where the logical step is to "remove" the array storage and store each distinct "point" in its own document with the related "bssid" field. This reduces the question of whether to "update" or "insert" to a single-operation model. Documents in the collection now look like this:
{
"bssid": "ca:fe:de:ad:be:ef",
"channel": 6,
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
},
"time": ISODate("2014-11-04T21:09:18.437Z")
},
{
"bssid": "ca:fe:de:ad:be:ef",
"channel": 6,
"geometry": {
"type": "Point",
"coordinates": [ -75.345678, 36.34567 ]
},
"time": ISODate("2014-11-04T21:10:28.919Z")
}
Distinct in their own collection and not bound in the same document under an array. There is data duplication but the "update" process is now much simplified:
db.collection.update(
{
"bssid": "ca:fe:de:ad:be:ef",
"geometry": {
"type": "Point",
"coordinates": [-75.234564, 36.12345 ]
}
},
{
"$setOnInsert": { "channel": 6 },
"$set": { "time": ISODate("2014-11-04T21:10:28.919Z") }
},
{ "upsert": true }
)
All that does is match a document based on the supplied "bssid" and "point" values, either "updating" the "time" where it matched or inserting a new document with all values where that "bssid" and "point" data was not found.
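The single-operation behaviour can be modelled in plain Java with a map keyed by the "bssid" and coordinates: the initial value is written only when the key is new (the $setOnInsert part), and the time is written every time (the $set part). This is an in-memory sketch with invented names, not the driver API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DocumentUpsertSketch {

    // Key: [bssid, lon, lat]. Value: the remaining fields of the document.
    final Map<List<Object>, Map<String, Object>> collection = new LinkedHashMap<>();

    void upsert(String bssid, double lon, double lat, int channel, long time) {
        List<Object> key = Arrays.asList(bssid, lon, lat);
        Map<String, Object> doc = collection.computeIfAbsent(key, k -> {
            Map<String, Object> created = new HashMap<>();
            created.put("channel", channel); // only on insert, like $setOnInsert
            return created;
        });
        doc.put("time", time);               // always, like $set
    }

    public static void main(String[] args) {
        DocumentUpsertSketch sketch = new DocumentUpsertSketch();
        sketch.upsert("ca:fe:de:ad:be:ef", -75.234564, 36.12345, 6, 1L);
        sketch.upsert("ca:fe:de:ad:be:ef", -75.234564, 36.12345, 6, 2L); // same point
        sketch.upsert("ca:fe:de:ad:be:ef", -75.345678, 36.34567, 6, 3L); // new point
        System.out.println(sketch.collection.size()); // 2
    }
}
```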
The overall point is that while this started off with simple needs, where it was fine to "embed" the array in the document, more complex needs can make that storage form a pain to maintain. On the other hand, using separate documents in the collection has its benefits, but then you do have to do your own work to "clean up" entries beyond any cap limits you might want. But it is arguable that this may not need to be a "real time" operation.
Different approaches, so work with the one that suits you best. This is just a guide to implement in either way and showing the pitfalls and solutions. What works best for you, only you can tell.
This really is more about the technique than the specific Java coding. That part is not hard, so here is just some of the most difficult structure from above for reference:
DBObject update = new BasicDBObject(
"$setOnInsert", new BasicDBObject(
"channel", 6
)
).append(
"$push", new BasicDBObject(
"heardpoints", new BasicDBObject(
"$each", new DBObject[]{
new BasicDBObject(
"geometry",
new BasicDBObject("type","Point").append(
"coordinates", new double[]{-75.234564, 36.12345}
)
).append(
"time", new DateTime(2014,1,1,0,0,DateTimeZone.UTC).toDate()
)
}
).append(
"$sort", new BasicDBObject(
"time", -1
)
).append("$slice", 20)
)
);
