Query Dynamo table with only the secondary global index - java

I'm trying to query a DynamoDB table using a global secondary index, and I'm getting java.lang.IllegalArgumentException: Illegal query expression: No hash key condition is found in the query. All I'm trying to do is to get all items that have a timestamp greater than a value without considering the key. The timestamp is not part of the hash key or range key, so I created a global secondary index for it.
Does anyone have a clue what I might be missing?
Table Definition:
{
AttributeDefinitions:[
{
AttributeName:timestamp,
AttributeType:N
},
{
AttributeName:url,
AttributeType:S
}
],
TableName:SitePageIndexed,
KeySchema:[
{
AttributeName:url,
KeyType:HASH
}
],
TableStatus:ACTIVE,
CreationDateTime: Mon May 12 18:45:57 EDT 2014,
ProvisionedThroughput:{
NumberOfDecreasesToday:0,
ReadCapacityUnits:8,
WriteCapacityUnits:4
},
TableSizeBytes:0,
ItemCount:0,
GlobalSecondaryIndexes:[
{
IndexName:TimestampIndex,
KeySchema:[
{
AttributeName:timestamp,
KeyType:HASH
}
],
Projection:{
ProjectionType:ALL,
},
IndexStatus:ACTIVE,
ProvisionedThroughput:{
NumberOfDecreasesToday:0,
ReadCapacityUnits:8,
WriteCapacityUnits:4
},
IndexSizeBytes:0,
ItemCount:0
}
]
}
Code
Condition condition1 = new Condition().withComparisonOperator(ComparisonOperator.GE).withAttributeValueList(new AttributeValue().withN(Long.toString(start)));
DynamoDBQueryExpression<SitePageIndexed> exp = new DynamoDBQueryExpression<SitePageIndexed>().withRangeKeyCondition("timestamp", condition1);
exp.setScanIndexForward(true);
exp.setLimit(100);
exp.setIndexName("TimestampIndex");
PaginatedQueryList<SitePageIndexed> queryList = client.query(SitePageIndexed.class,exp);

All I'm trying to do is to get all items that have a timestamp greater than a value without considering the key.
This is not how Global Secondary Indexes (GSI) on Amazon DynamoDB work. To query a GSI you must specify a value for its hash key and then you may filter/sort by the range key -- just like you'd do with the primary key. This is exactly what the exception is trying to tell you, and also what you will find on the documentation page for the Query API:
A Query operation directly accesses items from a table using the table primary key, or from an index using the index key. You must provide a specific hash key value.
Think of a GSI as just another key that behaves almost exactly like the primary key (the main differences being that it is updated asynchronously, and you can only perform eventually consistent reads on GSIs).
Please refer to the Amazon DynamoDB Global Secondary Index documentation page for guidelines and best practices when creating GSIs: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
One possible way to achieve what you want would be to have a dummy attribute constrained to a finite, small set of possible values, create a GSI with hash key on that dummy attribute and range key on your timestamp. When querying, you would need to issue one Query API call for each possible value on your dummy hash key attribute, and then consolidate the results on your application. By constraining the dummy attribute to a singleton (i.e., a Set with a single element, i.e., a constant value), you can send only one Query API call and you get your result dataset directly -- but keep in mind that this will cause you problems related to hot partitions and you might have performance issues! Again, refer to the document linked above to learn the best practices and some patterns.
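The dummy-attribute approach described above can be sketched without the SDK. The following is an in-memory stand-in only, not DynamoDB code: `ShardedTimestampQuery`, `SHARD_COUNT`, `shardFor`, `queryShard`, and the `Map`-based "index" are hypothetical names chosen for illustration. Each `queryShard` call stands where one real Query API call against the GSI (hash key = dummy attribute, range key = timestamp) would go.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory sketch of the "dummy hash key + timestamp range key" pattern.
// Nothing here talks to DynamoDB; every name is hypothetical.
final class ShardedTimestampQuery {
    // The small, finite set of values the dummy attribute may take: 0..SHARD_COUNT-1.
    static final int SHARD_COUNT = 4;

    static final class Page {
        final String url;
        final long timestamp;

        Page(String url, long timestamp) {
            this.url = url;
            this.timestamp = timestamp;
        }
    }

    // Stand-in for the GSI: dummy value -> (timestamp -> pages), sorted by timestamp.
    final Map<Integer, NavigableMap<Long, List<Page>>> index = new HashMap<>();

    // Derive the dummy attribute from the item so writes spread over all values.
    static int shardFor(String url) {
        return Math.floorMod(url.hashCode(), SHARD_COUNT);
    }

    void put(Page page) {
        index.computeIfAbsent(shardFor(page.url), s -> new TreeMap<>())
             .computeIfAbsent(page.timestamp, t -> new ArrayList<>())
             .add(page);
    }

    // Plays the role of one Query call: hash key == shard, range key >= start.
    List<Page> queryShard(int shard, long start) {
        List<Page> out = new ArrayList<>();
        NavigableMap<Long, List<Page>> byTime = index.get(shard);
        if (byTime != null) {
            byTime.tailMap(start, true).values().forEach(out::addAll);
        }
        return out;
    }

    // One query per possible dummy value, consolidated in the application.
    List<Page> queryAllShards(long start) {
        List<Page> merged = new ArrayList<>();
        for (int shard = 0; shard < SHARD_COUNT; shard++) {
            merged.addAll(queryShard(shard, start));
        }
        merged.sort(Comparator.comparingLong((Page p) -> p.timestamp));
        return merged;
    }
}
```

Setting `SHARD_COUNT` to 1 corresponds to the singleton case: a single query returns the whole result set, at the cost of concentrating all traffic on one hash key value.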

It is possible to query DynamoDB using only the GSI; you can confirm this in the web interface under Query/Index.
Programmatically, it is done as follows:
DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient(
        new ProfileCredentialsProvider()));
Table table = dynamoDB.getTable("WeatherData");
Index index = table.getIndex("PrecipIndex");
QuerySpec spec = new QuerySpec()
        .withKeyConditionExpression("#d = :v_date and Precipitation = :v_precip")
        .withNameMap(new NameMap()
                .with("#d", "Date"))
        .withValueMap(new ValueMap()
                .withString(":v_date", "2013-08-10")
                .withNumber(":v_precip", 0));
ItemCollection<QueryOutcome> items = index.query(spec);
Iterator<Item> iter = items.iterator();
while (iter.hasNext()) {
    System.out.println(iter.next().toJSONPretty());
}
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSIJavaDocumentAPI.html#GSIJavaDocumentAPI.QueryAnIndex
For doing it with DynamoDBMapper see: How to query a Dynamo DB having a GSI with only hashKeys using DynamoDBMapper

Here is how you can query in Java using only the GSI:
Map<String, AttributeValue> eav = new HashMap<String, AttributeValue>();
eav.put(":val1", new AttributeValue().withS("PROCESSED"));
DynamoDBQueryExpression<Package> queryExpression = new DynamoDBQueryExpression<Package>()
        .withIndexName("<your globalsecondaryindex key name>")
        .withKeyConditionExpression("your_gsi_column_name = :val1")
        .withExpressionAttributeValues(eav)
        .withConsistentRead(false)
        .withLimit(2);
// Use the same mapped class (Package) for the expression and the result page
QueryResultPage<Package> queryPage = dbMapper.queryPage(Package.class, queryExpression);

While this is not the correct answer per se, could you possibly accomplish this with a scan instead of a query? It's much more expensive, but it could be a solution.

Related

Unique constraint on combination of multiple columns within a partition key in DynamoDB

I am new to DynamoDB. As part of a requirement in my current project, I would like to have a unique constraint on the combination of two columns (col1, col2) within a given partition key.
I could have achieved that by making the combination col1+col2 a range key. The problem is that either or both of these columns might get updated, and if I try to update a range key, DynamoDB will throw an exception.
I cannot enforce this at the application level either, as the application is not single-threaded. Nor can I create a separate table with the columns on which I want to impose the unique constraint, as that will not solve the problem because the application is distributed.
I have no other idea how to achieve that.
EDIT:
I am trying to solve with following approach:
DynamoDBSaveExpression saveExpression = new DynamoDBSaveExpression();
Map expected = new HashMap();
expected.put("ID", // <-- this is the partition key
new ExpectedAttributeValue(new AttributeValue(student.getID())).withExists(false));
expected.put("rollNum",
new ExpectedAttributeValue(new AttributeValue(student.getRollNum())).withExists(false));
expected.put("name",
new ExpectedAttributeValue(new AttributeValue(student.getName())).withExists(false));
saveExpression.setExpected(expected);
saveExpression.setConditionalOperator(ConditionalOperator.AND);
rbsDynamoDBClient.getDynamoDBMapper().save(student, saveExpression);
But I am getting the following exception:
Caused by:
com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: One
or more parameter values were invalid: Value cannot be used when
Exists is false for Attribute: ID (Service: AmazonDynamoDBv2; Status
Code: 400; Error Code: ValidationException;
The same error occurs for the other fields, i.e. rollNum and name, too.
Replace
expected.put("ID", new ExpectedAttributeValue(new AttributeValue(student.getID())).withExists(false));
with
expected.put("ID", new ExpectedAttributeValue(false));
An ExpectedAttributeValue describes what you expect to find in the stored item, not what is in your object. When you expect the attribute not to exist, there is nothing to compare against, so you must not provide a value.

DynamoDB: how to query with multiple filter

I have a table and the structure looks like this:
my table structure
Here correlationId is my hashKey.
I can perform simple query using hashKey:
DynamoDBMapper mapper = new DynamoDBMapper(dynamoDB);
Pickup itemRetrieved = mapper.load(Pickup.class, key);
Now I want to query on the basis of two fields, correlationId and partnerId, to get transactionId.
How should I do that?
Here is sample code that queries by hash key and applies a filter on partnerId:
List<Pickup> pickupList = null;
DynamoDBMapper dynamoDBMapper = new DynamoDBMapper(dynamoDBClient);
Pickup pickup = new Pickup();
pickup.setCorrelationId(correlationId);
Map<String, AttributeValue> attributeValues = new HashMap<>();
attributeValues.put(":partnerId", new AttributeValue(partnerId));
DynamoDBQueryExpression<Pickup> queryExpression = new DynamoDBQueryExpression<Pickup>().withHashKeyValues(pickup)
.withFilterExpression("partnerId = :partnerId")
.withExpressionAttributeValues(attributeValues);
pickupList = dynamoDBMapper.query(Pickup.class, queryExpression);
pickupList.stream().forEach(i -> System.out.println(i.toString()));
Your partition key (correlationId) is one of the keys by which you want to retrieve transactionId, but the key does not include partnerId.
Hence, do these 3 steps:
Step 1 - build a global secondary index on partnerId
Step 2 - filter on partition id
Step 3 - get transactionId
Query Filtering
DynamoDB’s Query function retrieves items using a primary key or an index key from a Local or Global Secondary Index. Each query can use Boolean comparison operators to control which items will be returned.
With today’s release, we are extending this model with support for query filtering on non-key attributes. You can now include a QueryFilter as part of a call to the Query function. The filter is applied after the key-based retrieval and before the results are returned to you. Filtering in this manner can reduce the amount of data returned to your application while also simplifying and streamlining your code.
The QueryFilter that you pass to the Query API must include one or more conditions. Each condition references an attribute name and includes one or more attribute values, along with a comparison operator. In addition to the usual Boolean comparison operators, you can also use CONTAINS, NOT_CONTAINS, and BEGINS_WITH for string matching, BETWEEN for range checking, and IN to check for membership in a set.
In addition to the QueryFilter, you can also supply a ConditionalOperator. This logical operator (either AND or OR) is used to connect each of the elements in the QueryFilter.
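To make the order of operations concrete, here is an in-memory sketch (not SDK code) of a key-based retrieval followed by a filter on non-key attributes. `QueryFilterSketch`, `query`, and the `Map`-based items are hypothetical; the `and` flag plays the role of the ConditionalOperator described above, and the attribute names are borrowed from the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// In-memory illustration only; all names are hypothetical.
final class QueryFilterSketch {

    // Step 1 selects items by key; step 2 applies the filter conditions
    // to the selected items before anything is returned to the caller.
    static List<Map<String, String>> query(List<Map<String, String>> table,
                                           String hashKeyName,
                                           String hashKeyValue,
                                           List<Predicate<Map<String, String>>> conditions,
                                           boolean and) {
        return table.stream()
                .filter(item -> hashKeyValue.equals(item.get(hashKeyName)))
                .filter(item -> and
                        ? conditions.stream().allMatch(c -> c.test(item))
                        : conditions.stream().anyMatch(c -> c.test(item)))
                .collect(Collectors.toList());
    }

    static Map<String, String> item(String correlationId, String partnerId, String transactionId) {
        Map<String, String> m = new HashMap<>();
        m.put("correlationId", correlationId);
        m.put("partnerId", partnerId);
        m.put("transactionId", transactionId);
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, String>> table = new ArrayList<>();
        table.add(item("c1", "p1", "t1"));
        table.add(item("c1", "p2", "t2"));
        table.add(item("c2", "p1", "t3"));

        List<Predicate<Map<String, String>>> conditions = new ArrayList<>();
        conditions.add(i -> "p1".equals(i.get("partnerId")));

        // Key condition on correlationId, filter on the non-key attribute partnerId.
        for (Map<String, String> match : query(table, "correlationId", "c1", conditions, true)) {
            System.out.println(match.get("transactionId"));
        }
    }
}
```

Because the filter runs after the key-based retrieval, it narrows what is returned but not which items the key condition has to select.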

DynamoDB API: How can I build an "add JSON attribute if not present" update request?

I am trying to use the new Amazon DynamoDB JSON API to add/overwrite key-value pairs in a JSON attribute called "document". Ideally, I would like simply to structure my write calls to send the KV pairs to add to the attribute, and have Dynamo create the attribute if it does not already exist for the given primary key. However if I try this with just a straightforward UpdateItemSpec:
PrimaryKey primaryKey = new PrimaryKey("key_str", "mapKey");
ValueMap valuesMap = new ValueMap().withLong(":a", 1234L).withLong(":b", 1234L);
UpdateItemSpec updateSpec = new UpdateItemSpec().withPrimaryKey(primaryKey).withUpdateExpression("SET document.value1 = :a, document.value2 = :b");
updateSpec.withValueMap(valuesMap);
table.updateItem(updateSpec);
I get com.amazonaws.AmazonServiceException: The document path provided in the update expression is invalid for update, meaning DynamoDB could not find the given attribute named "document" to which to apply the update.
I managed to approximate this functionality with the following series of calls:
try {
// 1. Attempt UpdateItemSpec as if attribute already exists
} catch (AmazonServiceException e) {
// 2. Confirm the exception indicated the attribute was not present, otherwise rethrow it
// 3. Use a put-if-absent request to initialize an empty JSON map at the attribute "document"
// 4. Rerun the UpdateItemSpec call from the above try block
}
This works, but is less than ideal as it will require 3 calls to DynamoDB every time I add a new primary key to the table. I experimented a bit with the attribute_not_exists function that can be used in Update Expressions, but wasn't able to get it to work in the way I want.
Any Dynamo gurus out there have any ideas on how/whether this can be done?
I received an answer from Amazon Support that it is not actually possible to accomplish this with a single call. They did suggest reducing the number of calls needed when adding the attribute for a new primary key from 3 to 2, by sending the desired JSON map in the put-if-absent request rather than an empty map.
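The two-call flow can be sketched with an in-memory map standing in for the table. This is an illustration under stated assumptions, not SDK code: `UpsertDocumentSketch`, `updateIfPresent`, `putIfAbsent`, and the `calls` counter are hypothetical, and the final retry after a lost race is an added assumption rather than part of the suggestion from support.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the table; each method models one round trip.
final class UpsertDocumentSketch {
    // primary key -> contents of the "document" attribute
    final Map<String, Map<String, Long>> table = new ConcurrentHashMap<>();
    // counts simulated calls to the database
    final AtomicInteger calls = new AtomicInteger();

    // Models the update request: succeeds only if "document" already exists.
    boolean updateIfPresent(String key, Map<String, Long> values) {
        calls.incrementAndGet();
        return table.computeIfPresent(key, (k, doc) -> {
            Map<String, Long> merged = new HashMap<>(doc);
            merged.putAll(values); // add or overwrite the given pairs
            return merged;
        }) != null;
    }

    // Models the put-if-absent request, carrying the desired map instead of an empty one.
    boolean putIfAbsent(String key, Map<String, Long> values) {
        calls.incrementAndGet();
        return table.putIfAbsent(key, new HashMap<>(values)) == null;
    }

    void addOrOverwrite(String key, Map<String, Long> values) {
        if (updateIfPresent(key, values)) {
            return; // existing key: 1 call
        }
        if (putIfAbsent(key, values)) {
            return; // new key: 2 calls
        }
        // Assumption: another writer created the item in between, so update again.
        updateIfPresent(key, values);
    }
}
```

For an existing primary key the flow costs one call; for a new one it costs two, because the put already carries the values that the rerun update would otherwise have written.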

Mongodb avoid duplicate entries

I am a newbie to MongoDB. How can I avoid duplicate entries? In relational tables, we use a primary key to avoid them. How do I specify this in MongoDB using Java?
Use an index with the {unique:true} option.
// everyone's email must be unique:
db.users.createIndex({email:1},{unique:true});
You can also do this across multiple fields. See this section in the docs for more details and examples.
A unique index ensures that the indexed fields do not store duplicate values; i.e. enforces uniqueness for the indexed fields. By default, MongoDB creates a unique index on the _id field during the creation of a collection.
If you wish for null values to be ignored from the unique key, then you have to also make the index sparse (see here), by also adding the sparse option:
// everyone's email must be unique,
// but there can be multiple users with no email field or a null email:
db.users.createIndex({email:1},{unique:true, sparse:true});
If you want to create the index using the MongoDB Java driver, try:
Document keys = new Document("email", 1);
collection.createIndex(keys, new IndexOptions().unique(true));
This can be done using the "_id" field, although this use is discouraged.
Suppose you want the names to be unique: you can put the names in the "_id" field, which, as you might know, is unique for each entry.
BasicDBObject bdbo = new BasicDBObject("_id","amit");
Now no other entry in the collection can have the name "amit". This is one way to achieve what you are asking for.
As of Mongo's v3.0 Java driver, the code to create the index looks like:
public void createUniqueIndex() {
    Document index = new Document("fieldName", 1);
    MongoCollection<Document> collection = client.getDatabase("dbName").getCollection("CollectionName");
    collection.createIndex(index, new IndexOptions().unique(true));
}

// And test to verify it works as expected
@Test
public void testIndex() {
    MongoCollection<Document> collection = client.getDatabase("dbName").getCollection("CollectionName");
    collection.insertOne(new Document("fieldName", "duplicateValue"));
    // this will throw a MongoWriteException
    try {
        // a fresh Document, so the failure comes from the unique index rather than a reused _id
        collection.insertOne(new Document("fieldName", "duplicateValue"));
        fail("Should have thrown a mongo write exception due to duplicate key");
    } catch (MongoWriteException e) {
        assertTrue(e.getMessage().contains("duplicate key"));
    }
}
Theon's solution didn't work for me, but this one did:
BasicDBObject query = new BasicDBObject(<fieldname>, 1);
collection.ensureIndex(query, <index_name>, true);
I am not a Java programmer; however, you can probably convert this over.
MongoDB does have a primary key by default, known as the _id. You can use upsert() or save() on this key to prevent the document from being written twice, like so:
var doc = {'name': 'sam'};
db.users.insert(doc); // doc will get an _id assigned to it
db.users.insert(doc); // Will fail since it already exists
This will immediately stop duplicates. As for multithread-safe inserts under certain conditions: we would need to know more about your conditions in that case.
I should add, however, that the _id index is unique by default.
Using pymongo, it looks like:
mycol.create_index("id", unique=True)
where mycol is the collection in the DB:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]
mycol.create_index("id", unique=True)
mydict = {"name": "xoce", "address": "Highway to hell 666", "id": 1}
x = mycol.insert_one(mydict)
To prevent MongoDB from saving a duplicate email with Mongoose:
UserSchema.path('email').validate(async (email) => {
  const emailcount = await mongoose.models.User.countDocuments({ email })
  return !emailcount
}, 'Email already exists')
I hope this answers your question; it worked for me. Use it in the user model.

MongoDB Composite Key

I'm just getting started with MongoDB and I've noticed that I get a lot of duplicate records for entries that I meant to be unique. I would like to know how to use a composite key for my data, and I'm looking for information on how to create one. Lastly, I am using Java to access MongoDB and Morphia as my ORM layer, so including those in your answers would be awesome.
Morphia: http://code.google.com/p/morphia/
You can use objects for the _id field as well. The _id field is always unique. That way you kind of get a composite primary key:
{ _id : { a : 1, b: 1} }
Just be careful when creating these ids: the order of the keys (a and b in the example) matters. If you swap them around, it is considered a different object.
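On the Java side, one way to guard against accidentally swapping the keys is to build every composite id through a single helper. The sketch below uses only plain `java.util` maps, not a MongoDB driver type; `CompositeIdSketch`, `compositeId`, and `sameFieldOrder` are hypothetical names. It also shows why the problem is easy to miss: `Map.equals` ignores insertion order, so two maps can compare equal in Java while listing their fields in a different order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration; no MongoDB driver classes are used here.
final class CompositeIdSketch {

    // Always insert the fields in the same order, so the same logical key
    // always produces the same field sequence.
    static Map<String, Object> compositeId(Object a, Object b) {
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("a", a);
        id.put("b", b);
        return id;
    }

    // Order-sensitive comparison: true only if both maps list the same
    // entries in the same sequence.
    static boolean sameFieldOrder(Map<String, Object> x, Map<String, Object> y) {
        List<Map.Entry<String, Object>> xs = new ArrayList<>(x.entrySet());
        List<Map.Entry<String, Object>> ys = new ArrayList<>(y.entrySet());
        return xs.equals(ys);
    }

    public static void main(String[] args) {
        Map<String, Object> swapped = new LinkedHashMap<>();
        swapped.put("b", 1);
        swapped.put("a", 1);

        Map<String, Object> id = compositeId(1, 1);
        System.out.println(id.equals(swapped));          // Map.equals ignores field order
        System.out.println(sameFieldOrder(id, swapped)); // this check does not
    }
}
```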
The other possibility is to leave _id alone and create a unique compound index.
db.things.ensureIndex({firstname: 1, lastname: 1}, {unique: true});
//Deprecated since version 3.0.0, is now an alias for db.things.createIndex()
https://docs.mongodb.org/v3.0/reference/method/db.collection.ensureIndex/
You can create Unique Indexes on the fields of the document that you'd want to test uniqueness on. They can be composite as well (called compound key indexes in MongoDB land), as you can see from the documentation. Morphia does have an @Indexed annotation to support indexing at the field level. In addition, with Morphia you can define compound keys at the class level with the @Indexed annotation.
I just noticed that the question is marked as "java", so you'd want to do something like:
final BasicDBObject id = new BasicDBObject("a", aVal)
.append("b", bVal)
.append("c", cVal);
results = coll.find(new BasicDBObject("_id", id));
I use Morphia too, but have found that, while it works, it generates lots of errors as it tries to marshal the composite key. I use the above when querying to avoid these errors.
My original code (which also works):
final ProbId key = new ProbId(srcText, srcLang, destLang);
final QueryImpl<Probabilities> query = ds.createQuery(Probabilities.class)
.field("id").equal(key);
Probabilities probs = (Probabilities) query.get();
My ProbId class is annotated as @Entity(noClassnameStored = true) and inside the Probabilities class, the id field is @Id ProbId id;
I will try to explain with an example:
Create a table Music
Add Artist as a primary key
Now since artist may have many songs we have to figure out a sort key.
The combination of both will be a composite key.
Meaning, the Artist + SongTitle will be unique.
something like this:
{
  "Artist" : {"s" : "David Bowie"},
  "SongTitle" : {"s" : "changes"},
  "AlbumTitle" : {"s" : "Hunky"},
  "Genre" : {"s" : "Rock"}
}
Artist key above is: Partition Key
SongTitle key above is: sort key
The combination of both is always unique or should be unique. Rest are attributes which may vary per record.
Once you have this data structure in place you can easily append and scan as per your custom queries.
Sample Mongo queries for reference:
db.products.insert(<document>)
db.collection.drop()
db.users.find(<query document>)
