I'm using Logstash to input data from my database to Elasticsearch.
For a specific SQL query, I have one column that returns its values as a CSV-like string, e.g. "role1;role2;role3".
This column is being indexed as a regular string in Elastic.
The problem:
I need to make an Elastic query on that field based on another list of values.
For example: on the Java side I have a collection with the values "role3", "role4", "role5", and based on that I should get all the records in Elasticsearch that match "role3", "role4" or "role5".
In this specific case, my Elasticsearch data looks like this:
"_source": {
"userName": "user1",
"roles": "role1;role2;role3"
}
"_source": {
"userName": "user2",
"roles": "role7;role8;role9"
}
In this case it should return the record for "user1", as it matches "role3".
Question:
What is the best way to do that?
I can make a query using something like the LIKE operator for all items of my Java list:
// javaList collection has 3 items: "role3", "role4" and "role5"
for (String role : javaList) {
    BoolQueryBuilder query = QueryBuilders.boolQuery();
    query.should(QueryBuilders.wildcardQuery("roles", "*" + role + "*"));
    SearchResponse response = client.prepareSearch(indexName)
            .setTypes(type)
            .setQuery(query)
            .execute()
            .actionGet();
    SearchHits hits = response.getHits();
}
And then iterate over each hit, but this sounds like a very bad approach, because the javaList can have more than 20 items, which would mean 20 queries to Elasticsearch.
I need a way to tell Elasticsearch:
This is my list of roles; query internally and retrieve
only the records that match at least one of those roles.
To do that, I understand I can't keep indexing that data as a plain string, right? Ideally it would be indexed as an array or something like it...
How can I do that in the most performant way?
You definitely should not use a wildcard query in a loop; that solution will eventually show poor performance.
Since roles is a regular text field, Elasticsearch splits the value "role1;role2;role3" into the individual tokens "role1", "role2" and "role3". The same analysis is applied to the search query, so you can use a simple match query with the query string "role3;role4;role5" and get a hit because of the "role3" token match.
You can also index the roles field as an array of strings, and the same match query will still work.
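For the transport client used in the question, a minimal sketch of that single match query could look like the following (it reuses the client, indexName, type and javaList variables from the question; joining the roles with ";" is just one way to build the query string, any delimiter the analyzer strips works the same way):
// One request instead of one per role: Elasticsearch analyzes the query string
// into tokens and matches documents whose "roles" field shares at least one token.
String queryString = String.join(";", javaList); // e.g. "role3;role4;role5"
QueryBuilder query = QueryBuilders.matchQuery("roles", queryString);

SearchResponse response = client.prepareSearch(indexName)
        .setTypes(type)
        .setQuery(query)
        .execute()
        .actionGet();

for (SearchHit hit : response.getHits()) {
    System.out.println(hit.getSourceAsString()); // "user1" is expected to match via "role3"
}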
Related
Below is an example of my "Event" document in MongoDB. I want to be able to query all of the Event documents where the attendees array contains "623d03730e82c57fefa52fb2" (a user ID).
Here is one of my event documents:
{
  "_id": ObjectId("623ce74372a28f08dea6c959"),
  "description": "Fun BBQ to celebrate my 21st!",
  "host": "623d03730e82c57fefa52fb2",
  "invitees": [ ... ],
  "location": "My address...",
  "name": "Fun Birthday BBQ",
  "private": true,
  "date": "03/28/22",
  "end": "11:15 PM",
  "start": "06:35 PM",
  "attendees": ["623d03730e82c57fefa52fb2"]
}
Here is my broken query code:
String id = "623d03730e82c57fefa52fb2";
// I have also tried Document queryFilter = new Document("attendees", id);
Document queryFilter = new Document("attendees", new Document("$in", Arrays.asList(id)));
The above code always returns an empty result. To clarify, I am using Java and MongoDB Realms, but that shouldn't matter.
You don't need $in; a plain equality match is enough.
db.collection.find({
attendees: "623d03730e82c57fefa52fb2"
})
mongoplayground
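If you are querying through the plain MongoDB Java driver rather than the shell, a rough equivalent (assuming a MongoCollection<Document> named collection) would be:
import static com.mongodb.client.model.Filters.eq;

// An equality filter on an array field matches documents where ANY element equals the value
FindIterable<Document> events = collection.find(eq("attendees", "623d03730e82c57fefa52fb2"));
for (Document event : events) {
    System.out.println(event.toJson());
}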
For easier and more efficient queries, it's important that the types of field values remain consistent.
For example, if "_id" is stored as an ObjectId, then query parameters should also be of type ObjectId. Likewise, if they are strings, then consistently use strings.
If different value types are stored for individual field values, successful queries can still be possible, but not as efficiently since the field types must be considered in the queries. For example, if trying to find a doc by a field that may have a string or an ObjectId type, the query must either search for both types, or the query writer must know the type beforehand. It's easier and more efficient to just pick one type for a field and stick to it.
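As a small illustration of that point with the Java driver, reusing the Event document shown above (this is only a sketch):
// _id is stored as an ObjectId, so query it with an ObjectId, not the hex string
Document byId = collection.find(Filters.eq("_id", new ObjectId("623ce74372a28f08dea6c959"))).first();

// host is stored as a plain string, so query it with a string
Document byHost = collection.find(Filters.eq("host", "623d03730e82c57fefa52fb2")).first();

// Mixing the types up, e.g. Filters.eq("_id", "623ce74372a28f08dea6c959"), matches nothing.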
I have a DynamoDB table that holds items with the following structure:
{
  "applicationName": { // PARTITION KEY
    "S": "FOO_APP"
  },
  "requestId": { // SORT KEY
    "S": "zzz/yyy/xxx/58C28B6"
  },
  "creationTime": {
    "N": "1636332219136"
  },
  "requestStatus": {
    "S": "DENIED"
  },
  "resolver": {
    "S": "SOMEONE"
  }
}
In DynamoDB, can I query this table to list all items that match the provided values for applicationName, requestStatus and resolver?
In other words, how can I list all items that match:
applicationName = 'FOO_APP',
requestStatus = 'DENIED', and
resolver = 'SOMEONE'
With this table design, do I need GSIs? Can I do a Query or would it be a Scan?
What is the most cost-effective, efficient way of accomplishing this task?
I'm using Java's DynamoDBMapper.
You can add another attribute that combines the values you're querying for, like this:
GSI1PK: <applicationName>#<requestStatus>#<resolver>
Then you define a Global Secondary Index (GSI1) with the Partition Key as GSI1PK and the sort key like your current sort key requestId.
Whenever you want to find all requests that match these three conditions, you build the composite key value and query the global secondary index:
Query #GSI1
Partition Key = FOO_APP#DENIED#SOMEONE
That will yield all requests that match the combination of criteria. This kind of denormalization is common in NoSQL databases like DynamoDB.
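With DynamoDBMapper, one way to sketch this is shown below; the table name, index name (GSI1) and the gsi1pk attribute are assumptions, and the composite value has to be written on every save:
@DynamoDBTable(tableName = "Requests")            // table name assumed
public class Request {
    private String applicationName;
    private String requestId;
    private String gsi1pk;                        // e.g. "FOO_APP#DENIED#SOMEONE"

    @DynamoDBHashKey(attributeName = "applicationName")
    public String getApplicationName() { return applicationName; }
    public void setApplicationName(String v) { applicationName = v; }

    @DynamoDBRangeKey(attributeName = "requestId")
    @DynamoDBIndexRangeKey(globalSecondaryIndexName = "GSI1")
    public String getRequestId() { return requestId; }
    public void setRequestId(String v) { requestId = v; }

    @DynamoDBIndexHashKey(globalSecondaryIndexName = "GSI1", attributeName = "GSI1PK")
    public String getGsi1pk() { return gsi1pk; }
    public void setGsi1pk(String v) { gsi1pk = v; }
}

// Query the index for one combination of criteria
Request probe = new Request();
probe.setGsi1pk("FOO_APP#DENIED#SOMEONE");

DynamoDBQueryExpression<Request> expression = new DynamoDBQueryExpression<Request>()
        .withIndexName("GSI1")
        .withConsistentRead(false)                // GSIs only support eventually consistent reads
        .withHashKeyValues(probe);

List<Request> matches = mapper.query(Request.class, expression);
The trade-off of this design is that the composite attribute must be kept in sync whenever requestStatus or resolver changes.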
You may not be able to query this schema, because your sort key, requestId, is not part of your criteria; that means your query will fail. For a better schema design, you should choose a sort key that helps you narrow down the result set obtained by querying on the partition key alone.
As a solution, you will have to create a new index as follows:
applicationName -> Partition Key
requestStatus -> Sort Key
resolver
Then you can query with a keyConditionExpression on applicationName and requestStatus, and a filterExpression on resolver.
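A hedged sketch of that query with DynamoDBMapper (the index name is an assumption, and mapper is an existing DynamoDBMapper instance):
Map<String, AttributeValue> values = new HashMap<>();
values.put(":app", new AttributeValue().withS("FOO_APP"));
values.put(":status", new AttributeValue().withS("DENIED"));
values.put(":resolver", new AttributeValue().withS("SOMEONE"));

DynamoDBQueryExpression<Request> expression = new DynamoDBQueryExpression<Request>()
        .withIndexName("applicationName-requestStatus-index")    // assumed GSI name
        .withConsistentRead(false)
        .withKeyConditionExpression("applicationName = :app and requestStatus = :status")
        .withFilterExpression("resolver = :resolver")             // filter applied after the key match
        .withExpressionAttributeValues(values);

List<Request> matches = mapper.query(Request.class, expression);
Note that the filter on resolver does not reduce read capacity consumption; only the key condition narrows what is read.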
I have many documents in a collection and would like to change one of the field names in all the documents. I also want to prefix the value with a constant in all the docs.
Example,
{
"_id" : ObjectId("56e9e6e9083378ba4e5e8832"),
"name" : "Mike"
}
Should be changed to,
{
"_id" : ObjectId("56e9e6e9083378ba4e5e8832"),
"firstName" : "First-Mike"
}
I used the following Java code to rename the field:
final MongoDatabase mongoDb = mongo.getDatabase(database);
final MongoCollection<Document> collection = mongoDb.getCollection("<CollectionName>");
Bson rename = Updates.rename("name", "firstName");
collection.updateMany(new Document(), rename);
But I am not sure how to prefix the value with a constant for all the documents in the collection.
I can iterate over all the documents in the collection and make the change, but I am trying to understand if there is any way to do this without iterating over all the documents, i.e. with a single update.
Thanks
Why not use regular expressions for prefix matching? A single update command can do your job if you use MongoDB's regex:
https://docs.mongodb.org/manual/reference/operator/query/regex/
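For reference, a minimal sketch of that kind of regex match with the Java driver, reusing the collection from the question (this only selects documents whose value starts with a given prefix; it does not by itself change them):
// Finds the documents whose firstName already carries the "First-" prefix
Bson alreadyPrefixed = Filters.regex("firstName", "^First-");
long prefixed = collection.countDocuments(alreadyPrefixed);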
I want to make a Hibernate query to check if a string contains a substring.
Suppose a user class has id, name and info.
info is a String which contains multiple substrings.
For example, info contains strings like "hi I am from Pune".
I want to read all records which contain "Pune" as a substring.
I tried using a like query, but it's not working.
Criteria criteria = session.createCriteria(Post.class);
criteria.add(Restrictions.like("content",contentStringToLook));
users = (List<Post>)criteria.list();
Try modifying the restriction as follows:
criteria.add(Restrictions.like("content","%"+contentStringToLook+"%"));
You can use this:
criteria.add(Restrictions.like("content", contentStringToLook, MatchMode.ANYWHERE));
There's also MatchMode.START, .END, and .EXACT.
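Put together with the session and entity from the question, a minimal sketch would be:
// Matches every Post whose content contains "Pune" anywhere (LIKE '%Pune%')
Criteria criteria = session.createCriteria(Post.class);
criteria.add(Restrictions.like("content", "Pune", MatchMode.ANYWHERE));
List<Post> posts = (List<Post>) criteria.list();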
I've started to fiddle with MongoDB and came up with a question.
Say, I have an object (POJO) with an id field (say, named 'ID') that I would like to represent in JSON and store/load in/from Mongo DB.
As far as I understand, any object always has an _id field (with an underscore, lowercase).
What I would like to do is: during the query, I would like MongoDB to return my JSON with the field ID instead of _id.
In SQL I would use something like
SELECT _id as ID ...
My question is whether it's possible to do this in MongoDB, and if it is, a Java-based example would be really appreciated :)
I understand that it's possible to iterate over the records and substitute _id with ID manually, but I don't want this O(n) loop.
I also don't really want to duplicate the fields and store both "id" and "_id".
So I'm looking for a solution at the query level, or maybe in the Java driver.
Thanks in advance and have a nice day
MongoDB doesn't use SQL; it's more like an object query language over collections.
What you can try is something similar to the code below, using the Mongo Java Driver:
// Sketch of a query-by-example lookup (the legacy driver expects a DBObject)
Pojo pojo = new PojoInstance();
pojo.setId(id);
DBObject query = new BasicDBObject("ID", pojo.getId());
DBCursor cursor = db.getCollection("yourCollection").find(query);
I've ended up using the following approach with the Java driver:
DBCursor cursor = runSomeQuery();
try {
while(cursor.hasNext()) {
DBObject dbObject = cursor.next();
ObjectId id = (ObjectId) dbObject.get("_id");
dbObject.removeField("_id");
dbObject.put("ID", id.toString());
System.out.println(dbObject);
}
} finally {
cursor.close();
}
I was wondering whether this is the best solution or whether there are better options.
Mark
Here's an example of what I am doing in JavaScript. It may be helpful to you. In my case I am removing the _id field and aliasing two deeply nested fields to display simpler names.
db.players.aggregate([
{ $match: { accountId: '12345'}},
{ $project: {
"_id": 0,
"id": "$id",
"masterVersion": "$branches.master.configuration.player.template.version",
"previewVersion": "$branches.preview.configuration.player.template.version"
}
}
])
I hope you find this helpful.
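If you need the same projection from Java, a rough equivalent with the driver's aggregation helpers might be the following (collection name and the ID alias follow the original question; this is an untested sketch):
import static com.mongodb.client.model.Aggregates.project;
import static com.mongodb.client.model.Projections.*;

// Let the server rename _id to ID, so no client-side O(n) rewriting is needed
AggregateIterable<Document> results = collection.aggregate(Arrays.asList(
        project(fields(
                excludeId(),                  // drop the original _id
                computed("ID", "$_id")        // ...and expose it under the name ID
        ))
));

for (Document doc : results) {
    System.out.println(doc.toJson());
}
If the value must be a string rather than an ObjectId, $toString (MongoDB 4.0+) can be applied inside the projection.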