Fetch n latest records from Dynamo DB

I have a DynamoDB table with a GSI on GSI_ID and a range key on the lastUpdatedId attribute. I want only one item to be returned: the latest one among the matches. This is the expression I am using:
DynamoDBQueryExpression<Record> exp =
new DynamoDBQueryExpression<Record>()
.withIndexName(GSI_ID)
.withKeyConditionExpression(KEY_EXP)
.withExpressionAttributeValues(attrValues)
.withLimit(1)
// to get the results in descending order of last updated date giving the latest data first
.withScanIndexForward(false)
.withSelect(Select.ALL_ATTRIBUTES)
.withConsistentRead(false);
private DynamoDBMapper dynamoDBMapper;
PaginatedQueryList<Record> res = dynamoDBMapper.query(Record.class, exp);
But it is fetching all the records that are matching the GSI. What am I doing wrong here?
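
A likely explanation, as far as I understand DynamoDBMapper: withLimit caps the number of items per underlying Query request (one page), while the PaginatedQueryList returned by query() keeps requesting further pages lazily as the list is iterated or its size is read. queryPage() returns only a single page instead. The plain-Java model below only illustrates that difference; it does not call the AWS SDK, and PageLimitSketch, fetchPage, iterateAllPages and firstPageOnly are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class PageLimitSketch {

    // Hypothetical stand-in for ONE Query request: at most `limit` items, starting at `offset`.
    static List<String> fetchPage(List<String> newestFirst, int offset, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        int end = Math.min(offset + limit, newestFirst.size());
        return new ArrayList<>(newestFirst.subList(offset, end));
    }

    // Models iterating a lazily paginated list: pages are requested until none are left,
    // so the per-page limit does not cap the total number of items.
    static List<String> iterateAllPages(List<String> newestFirst, int limit) {
        List<String> all = new ArrayList<>();
        int offset = 0;
        while (offset < newestFirst.size()) {
            List<String> page = fetchPage(newestFirst, offset, limit);
            all.addAll(page);
            offset += page.size();
        }
        return all;
    }

    // Models a single-page query: one request, so at most `limit` items come back.
    static List<String> firstPageOnly(List<String> newestFirst, int limit) {
        return fetchPage(newestFirst, 0, limit);
    }

    public static void main(String[] args) {
        // Already sorted newest first, like a query with scanIndexForward = false.
        List<String> newestFirst = List.of("update-3", "update-2", "update-1");
        System.out.println(iterateAllPages(newestFirst, 1)); // [update-3, update-2, update-1]
        System.out.println(firstPageOnly(newestFirst, 1));   // [update-3]
    }
}
```

With the real mapper, the equivalent of firstPageOnly would be to call queryPage(Record.class, exp) and read the results of that one page.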

Related

Cloud firestore get all documents between two ids of documents

I have a collection in which every document ID is an epoch time (e.g. 1613728796). Each of these documents contains up to 50 fields. I want to query the set of documents within a specific time range. How can I query based on the document's ID?
Query query = db.collection("my-collection").whereGreaterThan("uid", "1613728796");

Try this:
Query query = db.collection("my-collection").whereGreaterThan("__name__", "1613728796").whereLessThan("__name__", "1613728796");
Replace the above epoch times with the correct lower and upper bounds.
If that doesn't work, try replacing "__name__" with FieldPath.documentId().
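
One thing to keep in mind: the IDs here are strings, so the bounds are compared as strings rather than as numbers (that is my understanding of how document IDs are ordered). For epoch seconds of equal length the two orders agree, but a shorter ID such as "999" would sort after "1613728796". A minimal plain-Java sketch of that comparison, using a TreeMap as a stand-in for the collection; DocumentIdRangeSketch and idsBetween are hypothetical names and nothing here calls Firestore:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class DocumentIdRangeSketch {

    // Returns the IDs strictly between the two bounds, compared as strings.
    // TreeMap.subMap throws IllegalArgumentException if afterId sorts after beforeId.
    static List<String> idsBetween(TreeMap<String, String> docsById, String afterId, String beforeId) {
        return new ArrayList<>(docsById.subMap(afterId, false, beforeId, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> docsById = new TreeMap<>();
        docsById.put("1613728700", "doc A");
        docsById.put("1613728796", "doc B");
        docsById.put("1613728900", "doc C");

        // Exclusive on both ends, like whereGreaterThan + whereLessThan.
        System.out.println(idsBetween(docsById, "1613728700", "1613728900")); // [1613728796]

        // String order is not numeric order when the lengths differ.
        System.out.println("999".compareTo("1613728796") > 0); // true
    }
}
```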

How do I restrict the Solr search to get only the latest matching record less than a particular date?

Hi, essentially, in an EMPLOYEE table that can have multiple records per employee with different ENTRY_DATE values, I want to restrict the results of the query below to one per employee:
SELECT * FROM EMPLOYEE WHERE EMP_NAME IN (...) AND ENTRY_DATE < TODAY ORDER BY ENTRY_DATE DESC
Currently I fetch all the matching results from the repo and find the latest record in my Java code. That creates a problem when emp_list is large and the total number of results exceeds the maximum result size: in that case a few employees are skipped.
Is there a way to query the latest record directly from the repo in a Solr search?
Any help would be appreciated.

The Solr query below gives the required result:
q=EMP_NAME&group=true&group.field=EMP_NAME&group.limit=1&sort=ENTRY_DATE desc
You can also search for multiple EMP_NAME values:
q="EMP_NAME1" OR "EMP_NAME2"&group=true&group.field=EMP_NAME&group.limit=1&sort=ENTRY_DATE desc
Result Grouping: https://lucene.apache.org/solr/guide/8_1/result-grouping.html
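
For intuition, grouping by EMP_NAME with group.limit=1 and a descending date sort keeps one row per employee: the one with the greatest ENTRY_DATE. The sketch below reproduces that in plain Java on an in-memory list, including the ENTRY_DATE < TODAY condition from the SQL in the question. It does not talk to Solr; the Entry class and latestPerEmployee are hypothetical names.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class LatestPerEmployeeSketch {

    // Hypothetical row type; the fields mirror EMP_NAME and ENTRY_DATE.
    static final class Entry {
        final String empName;
        final LocalDate entryDate;

        Entry(String empName, LocalDate entryDate) {
            this.empName = empName;
            this.entryDate = entryDate;
        }
    }

    // Keeps, per employee, the entry with the latest date strictly before `today`.
    static Map<String, Entry> latestPerEmployee(List<Entry> rows, LocalDate today) {
        return rows.stream()
                .filter(e -> e.entryDate.isBefore(today))
                .collect(Collectors.toMap(
                        e -> e.empName,
                        e -> e,
                        BinaryOperator.maxBy(Comparator.comparing((Entry e) -> e.entryDate))));
    }

    public static void main(String[] args) {
        List<Entry> rows = List.of(
                new Entry("alice", LocalDate.of(2024, 1, 1)),
                new Entry("alice", LocalDate.of(2024, 3, 1)),
                new Entry("bob", LocalDate.of(2024, 2, 1)));
        latestPerEmployee(rows, LocalDate.of(2025, 1, 1))
                .forEach((name, e) -> System.out.println(name + " -> " + e.entryDate));
    }
}
```

Doing this on the server side, as the grouped query does, avoids the skipped employees that appear when the client-side result set is truncated.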

DynamoDB: how to query with multiple filter

I have a table and the structure looks like this:
my table structure
Here correlationId is my hashKey.
I can perform simple query using hashKey:
DynamoDBMapper mapper = new DynamoDBMapper(dynamoDB);
Pickup itemRetrieved = mapper.load(Pickup.class, key);
Now I want to query on the basis of two fields, i.e. correlationId and partnerId, to get transactionId.
How should I do that?

Here is sample code that queries by hash key and adds a filter on partnerId:
List<Pickup> pickupList = null;
DynamoDBMapper dynamoDBMapper = new DynamoDBMapper(dynamoDBClient);
Pickup pickup = new Pickup();
pickup.setCorrelationId(correlationId);
Map<String, AttributeValue> attributeValues = new HashMap<>();
attributeValues.put(":partnerId", new AttributeValue(partnerId));
DynamoDBQueryExpression<Pickup> queryExpression = new DynamoDBQueryExpression<Pickup>().withHashKeyValues(pickup)
.withFilterExpression("partnerId = :partnerId")
.withExpressionAttributeValues(attributeValues);
pickupList = dynamoDBMapper.query(Pickup.class, queryExpression);
pickupList.stream().forEach(i -> System.out.println(i.toString()));

Your partition key (correlationId) is only one of the keys by which you want to retrieve transactionId; partnerId is not part of the key.
Hence, follow these 3 steps:
Step 1 - build a global secondary index on partnerId
Step 2 - filter on partition id
Step 3 - get transactionId

Query Filtering
DynamoDB’s Query function retrieves items using a primary key or an index key from a Local or Global Secondary Index. Each query can use Boolean comparison operators to control which items will be returned.
With today’s release, we are extending this model with support for query filtering on non-key attributes. You can now include a QueryFilter as part of a call to the Query function. The filter is applied after the key-based retrieval and before the results are returned to you. Filtering in this manner can reduce the amount of data returned to your application while also simplifying and streamlining your code.
The QueryFilter that you pass to the Query API must include one or more conditions. Each condition references an attribute name and includes one or more attribute values, along with a comparison operator. In addition to the usual Boolean comparison operators, you can also use CONTAINS, NOT_CONTAINS, and BEGINS_WITH for string matching, BETWEEN for range checking, and IN to check for membership in a set.
In addition to the QueryFilter, you can also supply a ConditionalOperator. This logical operator (either AND or OR) is used to connect each of the elements in the QueryFilter.
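
To make the order of operations concrete, here is a small in-memory model: items are first selected by hash key, and only then are the filter conditions applied and combined with AND or OR. It is a sketch of the semantics described above, not SDK code; QueryFilterSketch, its query method and the local ConditionalOperator enum are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class QueryFilterSketch {

    // Local stand-in for the AND/OR operator that connects the filter conditions.
    enum ConditionalOperator { AND, OR }

    static List<Map<String, String>> query(
            List<Map<String, String>> table,
            String hashKeyName,
            String hashKeyValue,
            List<Predicate<Map<String, String>>> filterConditions,
            ConditionalOperator operator) {
        if (filterConditions.isEmpty()) {
            throw new IllegalArgumentException("a filter needs at least one condition");
        }
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> item : table) {
            // Step 1: key-based retrieval.
            if (!hashKeyValue.equals(item.get(hashKeyName))) {
                continue;
            }
            // Step 2: filter on non-key attributes, before results are returned.
            boolean keep = operator == ConditionalOperator.AND
                    ? filterConditions.stream().allMatch(c -> c.test(item))
                    : filterConditions.stream().anyMatch(c -> c.test(item));
            if (keep) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> table = List.of(
                Map.of("correlationId", "c1", "partnerId", "p1", "transactionId", "t1"),
                Map.of("correlationId", "c1", "partnerId", "p2", "transactionId", "t2"),
                Map.of("correlationId", "c2", "partnerId", "p1", "transactionId", "t3"));
        Predicate<Map<String, String>> partnerIsP1 = item -> "p1".equals(item.get("partnerId"));
        List<Map<String, String>> hits =
                query(table, "correlationId", "c1", List.of(partnerIsP1), ConditionalOperator.AND);
        hits.forEach(item -> System.out.println(item.get("transactionId"))); // t1
    }
}
```

Because the filter runs after the key-based retrieval, it trims what is returned to the application, but the items matching the key are still read first.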

Mongo DB / No duplicates

I have a Mongo collection that keeps state records for devices. Thus, there can be multiple records per device. What I would like to do is create a query through the mongoTemplate that gets the latest record for each device.
Here are the constraints:
Pass in a Set<String> name_ids; name_id is a regular field within the Mongo collection, not the _id and not contained in the _id
get only the latest record for each device with a matching name_id
return List<DeviceStateData> (no duplicates should be found with the same name_id)
example of collection object:
{
_id: "241324123412",
name_id: "flyingMan",
powerState:"ON",
timeStamp: ISODate('')
}
Thanks

You should look at the distinct function.
Here you can find details on using it with Spring.

Query Dynamo table with only the secondary global index

I'm trying to query a DynamoDB table using a global secondary index and I'm getting java.lang.IllegalArgumentException: Illegal query expression: No hash key condition is found in the query. All I'm trying to do is to get all items that have a timestamp greater than a value without considering the key. The timestamp is not part of a hash key or range key, so I created a global index for it.
Does anyone have a clue what I might be missing?
Table Definition:
{
AttributeDefinitions:[
{
AttributeName:timestamp,
AttributeType:N
},
{
AttributeName:url,
AttributeType:S
}
],
TableName:SitePageIndexed,
KeySchema:[
{
AttributeName:url,
KeyType:HASH
}
],
TableStatus:ACTIVE,
CreationDateTime: Mon May 12 18:45:57 EDT 2014,
ProvisionedThroughput:{
NumberOfDecreasesToday:0,
ReadCapacityUnits:8,
WriteCapacityUnits:4
},
TableSizeBytes:0,
ItemCount:0,
GlobalSecondaryIndexes:[
{
IndexName:TimestampIndex,
KeySchema:[
{
AttributeName:timestamp,
KeyType:HASH
}
],
Projection:{
ProjectionType:ALL,
},
IndexStatus:ACTIVE,
ProvisionedThroughput:{
NumberOfDecreasesToday:0,
ReadCapacityUnits:8,
WriteCapacityUnits:4
},
IndexSizeBytes:0,
ItemCount:0
}
]
}
Code
Condition condition1 = new Condition().withComparisonOperator(ComparisonOperator.GE).withAttributeValueList(new AttributeValue().withN(Long.toString(start)));
DynamoDBQueryExpression<SitePageIndexed> exp = new DynamoDBQueryExpression<SitePageIndexed>().withRangeKeyCondition("timestamp", condition1);
exp.setScanIndexForward(true);
exp.setLimit(100);
exp.setIndexName("TimestampIndex");
PaginatedQueryList<SitePageIndexed> queryList = client.query(SitePageIndexed.class,exp);

All I'm trying to do is to get all items that have a timestamp greater than a value without considering the key.
This is not how Global Secondary Indexes (GSI) on Amazon DynamoDB work. To query a GSI you must specify a value for its hash key and then you may filter/sort by the range key -- just like you'd do with the primary key. This is exactly what the exception is trying to tell you, and also what you will find on the documentation page for the Query API:
A Query operation directly accesses items from a table using the table primary key, or from an index using the index key. You must provide a specific hash key value.
Think of a GSI as just another key that behaves almost exactly like the primary key (the main differences being that it is updated asynchronously, and you can only perform eventually consistent reads on GSIs).
Please refer to the Amazon DynamoDB Global Secondary Index documentation page for guidelines and best practices when creating GSIs: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
One possible way to achieve what you want would be to have a dummy attribute constrained to a finite, small set of possible values, create a GSI with hash key on that dummy attribute and range key on your timestamp. When querying, you would need to issue one Query API call for each possible value on your dummy hash key attribute, and then consolidate the results on your application. By constraining the dummy attribute to a singleton (i.e., a Set with a single element, i.e., a constant value), you can send only one Query API call and you get your result dataset directly -- but keep in mind that this will cause you problems related to hot partitions and you might have performance issues! Again, refer to the document linked above to learn the best practices and some patterns.
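
The dummy-attribute approach can be sketched with an in-memory model: each item is assigned one of a small, fixed set of shard values (the dummy hash key), the timestamp acts as the range key, and a read issues one range lookup per shard value and merges the results. This is only an illustration under assumptions: SHARD_COUNT, the hash-based shard assignment and all names in ShardedIndexSketch are hypothetical, and no DynamoDB call is made.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShardedIndexSketch {

    // Assumption: a small, fixed number of dummy hash key values.
    static final int SHARD_COUNT = 4;

    // Model of the GSI: dummy hash key -> (timestamp range key -> urls with that timestamp).
    private final Map<Integer, TreeMap<Long, List<String>>> index = new HashMap<>();

    void put(String url, long timestamp) {
        // The dummy attribute is derived from the url so that writes spread over the shards.
        int shard = Math.floorMod(url.hashCode(), SHARD_COUNT);
        index.computeIfAbsent(shard, s -> new TreeMap<>())
                .computeIfAbsent(timestamp, t -> new ArrayList<>())
                .add(url);
    }

    // One range lookup per shard value (hash = shard, range >= start), merged by timestamp.
    List<String> urlsSince(long start) {
        TreeMap<Long, List<String>> merged = new TreeMap<>();
        for (int shard = 0; shard < SHARD_COUNT; shard++) {
            TreeMap<Long, List<String>> partition = index.get(shard);
            if (partition == null) {
                continue;
            }
            for (Map.Entry<Long, List<String>> e : partition.tailMap(start, true).entrySet()) {
                merged.computeIfAbsent(e.getKey(), t -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        List<String> urls = new ArrayList<>();
        merged.values().forEach(urls::addAll);
        return urls;
    }

    public static void main(String[] args) {
        ShardedIndexSketch idx = new ShardedIndexSketch();
        idx.put("http://example.com/a", 100L);
        idx.put("http://example.com/b", 200L);
        idx.put("http://example.com/c", 300L);
        System.out.println(idx.urlsSince(200L)); // [http://example.com/b, http://example.com/c]
    }
}
```

Setting SHARD_COUNT to 1 corresponds to the singleton variant: a single lookup, at the cost of concentrating all traffic on one hash key value.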

It is possible to query DynamoDB using only the GSI; this can be confirmed in the web interface under Query/Index.
Programmatically, it is done as follows:
DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient(
new ProfileCredentialsProvider()));
Table table = dynamoDB.getTable("WeatherData");
Index index = table.getIndex("PrecipIndex");
QuerySpec spec = new QuerySpec()
.withKeyConditionExpression("#d = :v_date and Precipitation = :v_precip")
.withNameMap(new NameMap()
.with("#d", "Date"))
.withValueMap(new ValueMap()
.withString(":v_date","2013-08-10")
.withNumber(":v_precip",0));
ItemCollection<QueryOutcome> items = index.query(spec);
Iterator<Item> iter = items.iterator();
while (iter.hasNext()) {
System.out.println(iter.next().toJSONPretty());
}
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSIJavaDocumentAPI.html#GSIJavaDocumentAPI.QueryAnIndex
For doing it with DynamoDBMapper see: How to query a Dynamo DB having a GSI with only hashKeys using DynamoDBMapper

Here is how you can query in Java using only a GSI:
Map<String, AttributeValue> eav = new HashMap<String, AttributeValue>();
eav.put(":val1", new AttributeValue().withS("PROCESSED"));
DynamoDBQueryExpression<Package> queryExpression = new DynamoDBQueryExpression<Package>()
.withIndexName("<your globalsecondaryindex key name>")
.withKeyConditionExpression("your_gsi_column_name = :val1")
.withExpressionAttributeValues(eav)
.withConsistentRead(false)
.withLimit(2);
// queryPage returns a single page of results
QueryResultPage<Package> queryPage = dbMapper.queryPage(Package.class, queryExpression);

While this is not the correct answer per se, could you possibly accomplish this with a scan instead of a query? It's much more expensive, but it could be a solution.
