BatchGetResultPageIterable batchResults = dynamoDbEnhancedClient
.batchGetItem(BatchGetItemEnhancedRequest.builder()
.readBatches(ReadBatch.builder(Analysis.class).mappedTableResource(analysisTable)
.addGetItem(GetItemEnhancedRequest.builder()
.key(Key.builder().partitionValue(projectId).build()).build())
.build())
.build());
batchResults.forEach(page -> page.resultsForTable(analysisTable)
.forEach(item -> System.out.println(item.getFileId())));
I used the code above, but I am facing this issue:
software.amazon.awssdk.services.dynamodb.model.DynamoDbException: The provided key element does not match the schema (Service: DynamoDb, Status Code: 400, Request ID: 36V6D9GAGEUA817ODOGV52PF6VVV4KQNAO5AEMVJF66Q9ASUAAJG)
AnalysisTable
=================
PartitionKey | Sort Key | Name
A            | 1.txt    | ABC
A            | 2.txt    | DEF
A            | 3.txt    | GHI
A            | 4.txt    | JKL
class Analysis {
private String projectId;
private String sampleId;
private String sampleName;
private String description;
//setter & getters
}
Since you have both partition and sort keys, I think you are missing the .sortValue() in your Key.builder().
It needs to be Key.builder().partitionValue(projectId).sortValue(your-sort-key).build()
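For example, here is a minimal sketch of the corrected batch request (assuming the sort key attribute holds the file name shown in your table, e.g. "1.txt"):
BatchGetResultPageIterable batchResults = dynamoDbEnhancedClient
        .batchGetItem(BatchGetItemEnhancedRequest.builder()
                .readBatches(ReadBatch.builder(Analysis.class)
                        .mappedTableResource(analysisTable)
                        // a GetItem against a table with a composite key needs both key parts
                        .addGetItem(Key.builder()
                                .partitionValue(projectId)
                                .sortValue("1.txt") // hypothetical sort value; use your real sampleId
                                .build())
                        .build())
                .build());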
If you want everything under a certain partition key, this is how you can do it:
List<DTO> DTOs = pageToList(mappedTable.query(
        QueryEnhancedRequest
                .builder()
                .queryConditional(
                        QueryConditional.keyEqualTo(
                                Key.builder().partitionValue(projectId).build()
                        )
                )
                .build()
));

private List<DTO> pageToList(SdkIterable<Page<DTO>> iterable) {
    List<DTO> rVal = new ArrayList<>();
    iterable.forEach(page -> rVal.addAll(page.items()));
    return rVal;
}
Note that this assumes you are using the latest enhanced dependency:
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>dynamodb-enhanced</artifactId>
</dependency>
The line
context.offsetStorageReader().offset(sourcePartition());
throws an exception on the first poll.
On the next poll there is no exception. Is it possible to fix this without wrapping extra checking around getLatestSourceOffset(), such as adding a field that tracks whether it is the first poll? Or is there no way to avoid it, and we should add the check?
kafka-connect-api version: 0.10.2.0-cp1
2022-06-19 05:52:34,538 ERROR [pool-1-thread-1] (OffsetStorageReaderImpl.java:102) - CRITICAL: Failed to deserialize offset data when getting offsets for task with namespace CryptoPanicSourceConnector. No value for this data will be returned, which may break the task or cause it to skip some data. This could either be due to an error in the connector implementation or incompatible schema.
org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.
at org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:309)
at org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:96)
at org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:54)
at com.delphian.bush.CryptoPanicSourceTask.getLatestSourceOffset(CryptoPanicSourceTask.java:97)
at com.delphian.bush.CryptoPanicSourceTask.poll(CryptoPanicSourceTask.java:61)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:162)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
worker.properties
bootstrap.servers=localhost:29092
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=true
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter.schemas.enable=true
rest.port=8086
rest.host.name=127.0.0.1
offset.storage.file.filename=offsets/standalone.offsets
offset.flush.interval.ms=10000
SourceTask
public List<SourceRecord> poll() throws InterruptedException {
List<SourceRecord> records = new ArrayList<>();
Optional<Long> sourceOffset = getLatestSourceOffset();
CryptoNewsResponse newsResponse = // getNewsFromApi
// Filter which news add to records based on sourceOffset. Shortened for brevity
for (CryptoNews news : filteredNews) {
records.add(generateRecordFromNews(news));
}
return records;
}
private Optional<Long> getLatestSourceOffset() {
Map<String, Object> offset = context.offsetStorageReader().offset(sourcePartition());
if (offset != null) {
Object id = offset.get("id");
if (id != null) {
Long latestOffset = Long.valueOf((String) id);
return Optional.of(latestOffset);
}
}
return Optional.empty();
}
private SourceRecord generateRecordFromNews(CryptoNews cryptoNews) {
return new SourceRecord(
sourcePartition(),
sourceOffset(cryptoNews),
config.getString(TOPIC_CONFIG),
null,
CryptoNewsSchema.NEWS_KEY_SCHEMA,
buildRecordKey(cryptoNews),
CryptoNewsSchema.NEWS_SCHEMA,
buildRecordValue(cryptoNews),
Instant.now().toEpochMilli()
);
}
private Map<String, String> sourceOffset(CryptoNews cryptoNews) {
Map<String, String> map = new HashMap<>();
map.put(CryptoNewsSchema.ID_FIELD, cryptoNews.getId());
return map;
}
UPDATE
I don't use Avro or Protobuf.
My news schema:
public static final Schema NEWS_SCHEMA = SchemaBuilder.struct()
.name(SCHEMA_NAME)
.version(FIRST_VERSION)
.field(NewsSourceSchema.SCHEMA_NAME, SOURCE_SCHEMA)
.field(CurrencySchema.SCHEMA_NAME, SchemaBuilder.array(CURRENCY_SCHEMA).optional())
.field(KIND_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(DOMAIN_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(TITLE_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(PUBLISHED_AT_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(SLUG_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(ID_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(URL_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.field(CREATED_AT_FIELD, Schema.OPTIONAL_STRING_SCHEMA)
.build();
public Struct toConnectData(CryptoNews cryptoNews) {
Struct struct = new Struct(CryptoNewsSchema.NEWS_SCHEMA)
.put(NewsSourceSchema.SCHEMA_NAME, NewsSourceConverter.INSTANCE.toConnectData(cryptoNews.getSource()))
.put(CryptoNewsSchema.KIND_FIELD, cryptoNews.getKind())
.put(CryptoNewsSchema.DOMAIN_FIELD, cryptoNews.getDomain())
.put(CryptoNewsSchema.TITLE_FIELD, cryptoNews.getTitle())
.put(CryptoNewsSchema.PUBLISHED_AT_FIELD, cryptoNews.getPublishedAt())
.put(CryptoNewsSchema.SLUG_FIELD, cryptoNews.getSlug())
.put(CryptoNewsSchema.ID_FIELD, cryptoNews.getId())
.put(CryptoNewsSchema.URL_FIELD, cryptoNews.getUrl())
.put(CryptoNewsSchema.CREATED_AT_FIELD, cryptoNews.getCreatedAt());
List<Currency> currencies = Optional.ofNullable(cryptoNews.getCurrencies()).orElse(new ArrayList<>());
final List<Struct> items = currencies.stream()
.map(CONVERTER::toConnectData)
.collect(Collectors.toList());
struct.put(CurrencySchema.SCHEMA_NAME, items);
return struct;
}
UPDATE 2
connector.properties
name=CryptoPanicSourceConnector
tasks.max=1
connector.class=com.delphian.bush.CryptoPanicSourceConnector
topic=crypto-news
Startup command:
connect-standalone config/worker.properties config/custom-connector.properties
When using plain JSON data with Connect, you may see this error message: org.apache.kafka.connect.errors.DataException: JsonDeserializer with schemas.enable requires "schema" and "payload" fields and may not contain additional fields.
You need to set the converter's schemas.enable parameter to false when working with plain JSON that has no schema:
bootstrap.servers=localhost:29092
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
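Since the failure in your stack trace comes from OffsetStorageReaderImpl, the offsets file is deserialized with the internal converters, so you may also need to disable schemas there. This goes beyond the settings above and is an assumption about standalone mode, so adjust it to your setup:
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
# If the offsets file was written with the old settings, it may also need to be
# removed so it can be rewritten in the new format, e.g. offsets/standalone.offsets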
I want to extract a String from an API response and assign it to an object. When I print the value it is displayed, but it never gets assigned to the object. Please check the sample code below.
List<DtvPrePacksDataResponse> dtvPrePacksDataResponses =
channelGroupByGenre.entrySet().stream()
.map(e -> {
DtvPrePacksDataResponse dtvPrePacksDataResponse = new DtvPrePacksDataResponse();
dtvPrePacksDataResponse.setCatId(counter.incrementAndGet());
dtvPrePacksDataResponse.setCategoryName(e.getKey());
dtvPrePacksDataResponse.setChannelCount(e.getValue().size());
String channelCat = e.getValue().get(0).getProductCategoryId();
String channelSubcat = e.getValue().get(0).getProductSubCategoryId();
this.getChannelCatIcon(channelCat, channelSubcat)
.subscribe(
s -> this.catIconSet(dtvPrePacksDataResponse, s)
//s -> dtvPrePacksDataResponse.setCategoryIcon(s)
/*value -> System.out.println(value),
error -> error.printStackTrace(),
() -> System.out.println("completed without a value")*/
);
return dtvPrePacksDataResponse;
}).collect(Collectors.toList());
Retrieve the image from the API response
private Mono<String> getChannelCatIcon(String channelCat, String channelSubcat) {
return productSubcategoryQueryClientInterface.getProductSubcategoryInfo(
channelCat,
channelSubcat
)
.collectList()
.map(productSubcategories -> {
ProductSubcategory productSubcategory = productSubcategories.stream()
.filter(productSubcategory1 -> productSubcategory1.getProductCategoryId().equals(channelCat) && productSubcategory1.getProductSubCategoryId().equals(channelSubcat))
.findFirst()
.orElseThrow(() -> new RuntimeException(INVALID_LANGUAGE_OBJECT));
ListSubcategoryUrl listSubcategoryUrl = productSubcategory.getListSubcategoryUrl().stream()
.filter(listSubcategoryUrl1 -> listSubcategoryUrl1.getProductUrlCode().equals("SUBCATEGORY_IMAGE") && listSubcategoryUrl1.getType().equals("IMAGE"))
.findFirst()
.orElseThrow(() -> new RuntimeException(INVALID_LANGUAGE_OBJECT));
return listSubcategoryUrl.getUrl();
}).cache();
}
Set the image on the object
private void catIconSet(DtvPrePacksDataResponse dtvPrePacksDataResponse, String url){
System.out.println("img url " + url);
dtvPrePacksDataResponse.setCategoryIcon(url);
}
Entity class
public class DtvPrePacksDataResponse {
private int catId;
private String categoryName;
private String categoryIcon;
private int channelCount;
private List<DtvPrePacksDataListResponse> channelList;
}
Using block() gives an error. I tried flatMap() as well, which also gives an error. I checked the questions below and tried their suggestions, but with no success:
How to get String from Mono<String> in reactive java
How to extract string from Mono<String> in reactor core
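The icon never shows up on the collected objects because subscribe() only registers a callback; the stream pipeline has usually already returned the list by the time the Mono emits (which is also why the println inside catIconSet does print). One option, as a minimal sketch assuming the caller can work with a Mono<List<DtvPrePacksDataResponse>> instead of a plain List, is to stay inside the reactive chain and only emit each object once its icon has arrived:
Mono<List<DtvPrePacksDataResponse>> responses = Flux.fromIterable(channelGroupByGenre.entrySet())
        .flatMap(e -> {
            DtvPrePacksDataResponse dto = new DtvPrePacksDataResponse();
            dto.setCatId(counter.incrementAndGet());
            dto.setCategoryName(e.getKey());
            dto.setChannelCount(e.getValue().size());
            String channelCat = e.getValue().get(0).getProductCategoryId();
            String channelSubcat = e.getValue().get(0).getProductSubCategoryId();
            // wait for the icon URL, set it, and only then emit the populated object
            return getChannelCatIcon(channelCat, channelSubcat)
                    .doOnNext(dto::setCategoryIcon)
                    .thenReturn(dto);
        })
        .collectList();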
I have already tried a bunch of different ways and none of them works.
(First of all, I'm using the following setup, and it works with other methods, like create/delete user, create group, etc.)
public void startKeycoak(String username, String password) {
Keycloak kc = KeycloakBuilder.builder()
.serverUrl(uri)
.realm(realmName)
.username(username)
.password(password)
.clientId(client)
.resteasyClient(
new ResteasyClientBuilder()
.connectionPoolSize(10).build())
.build();
this.kc = kc;
}
The problem starts here:
public void deleteGroup(String groupName) {
GroupRepresentation groupRepresentation = kc.realm(realmName)
.groups()
.groups()
.stream()
.filter(group -> group.getName().equals(groupName)).collect(Collectors.toList()).get(0);
// kc.realm(realmName).groups().group(existingGroups.getName()).remove(); -> Not Working
// boolean a = kc.realm(realmName).groups().groups().remove(groupRepresentation); -> Not Workings - returns a false
}
public void updateGroup(String newName, String oldName) {
GroupRepresentation groupRepresentation = kc.realm(realmName)
.groups()
.groups()
.stream()
.filter(group -> group.getName().equals(oldName)).collect(Collectors.toList()).get(0);
//groupRepresentation.setName(newName); -> 1 - Not working
//kc.realm(realmName).groups().groups().stream().filter(g -> { -> 2 - Not Working
//g.setName(oldName);
//return false;
//});
}
Like I said before, it's working with a lot of methods except those two.
Try to delete it with the group representation's id; that works:
kc.realm(realmName).groups().group(groupRepresentation.getId()).remove();
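The update case works the same way. Here is a sketch, assuming the standard keycloak-admin-client GroupResource API: look the group up, change the representation locally, and push it back through group(id).update(...):
public void updateGroup(String newName, String oldName) {
    GroupRepresentation group = kc.realm(realmName)
            .groups()
            .groups()
            .stream()
            .filter(g -> g.getName().equals(oldName))
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("Group not found: " + oldName));
    group.setName(newName);
    // changing the local representation alone does nothing; send it back to the server
    kc.realm(realmName).groups().group(group.getId()).update(group);
}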
I'm trying to make the BigQuery table creation happen live, before the insert process itself. Here is the code of the PTransform that I'm using -> Link
I would like to apply this transform to Pub/Sub messages that will be inserted into a BQ table later.
Phase 1. Getting pubsub messages:
PCollection<PubsubMessage> messages =
pipeline.apply(
"ReadPubSubSubscription",
PubsubIO.readMessagesWithAttributes()
.fromSubscription(options.getInputSubscription()));
Phase 2. Convert all pubsub messages to TableRow:
PCollectionTuple convertedTableRows =
messages
.apply("ConvertMessageToTableRow", new PubsubMessageToTableRow(options));
Phase 3. Here is the problem: I need to check whether the table exists and upload the result to BQ:
### here is the schema for our BQ table
public static final Schema schema1 =
Schema.of(
Field.of("name", StandardSQLTypeName.STRING),
Field.of("post_abbr", StandardSQLTypeName.STRING));
### here is the method that we are using to extract the table name from pubsub attributes
static class PubSubAttributeExtractor implements SerializableFunction<ValueInSingleWindow<TableRow>, String> {
private final String attribute;
public PubSubAttributeExtractor(String attribute) {
this.attribute = attribute;
}
@Override
public String apply(ValueInSingleWindow<TableRow> input) {
TableRow row = input.getValue();
String tableName = (String) row.get("name");
return "my-project:myDS.pubsub_" + tableName;
}
}
### here is the part that doesn't work
WriteResult writeResult = convertedTableRows.get(TRANSFORM_OUT)
.apply(new BigQueryAutoCreateTable(
new PubSubAttributeExtractor("event_name"),schema1));
.apply(
"WriteSuccessfulRecords",
BigQueryIO.writeTableRows()
.withoutValidation()
.withCreateDisposition(CreateDisposition.CREATE_NEVER)
.withWriteDisposition(WriteDisposition.WRITE_APPEND)
.withExtendedErrorInfo()
.withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
.withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
.to(new ProbPartitionDestinations(options.getOutputTableSpec())
)
);
Error logs:
cannot find symbol
symbol: method apply(java.lang.String,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write<com.google.api.services.bigquery.model.TableRow>)
location: interface org.apache.beam.sdk.values.POutput
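The compiler error means the result of the first apply(...) is typed as POutput, and POutput has no apply(String, PTransform) method (there is also a stray ; after that first apply). Below is a minimal sketch of a shape that compiles, assuming BigQueryAutoCreateTable is declared as (or changed to) a PTransform<PCollection<TableRow>, PCollection<TableRow>> that passes its rows through after ensuring the table exists:
PCollection<TableRow> rowsWithTables = convertedTableRows.get(TRANSFORM_OUT)
        .apply("EnsureTablesExist",
                new BigQueryAutoCreateTable(
                        new PubSubAttributeExtractor("event_name"), schema1));

WriteResult writeResult = rowsWithTables.apply(
        "WriteSuccessfulRecords",
        BigQueryIO.writeTableRows()
                .withoutValidation()
                .withCreateDisposition(CreateDisposition.CREATE_NEVER)
                .withWriteDisposition(WriteDisposition.WRITE_APPEND)
                .withExtendedErrorInfo()
                .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
                .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
                .to(new ProbPartitionDestinations(options.getOutputTableSpec())));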
I'm using DynamoDB with the Java SDK, but I'm having some issues querying nested documents. I've included simplified code below. If I remove the filter expression, everything gets returned. With the filter expression, nothing is returned. I've also tried using withQueryFilterEntry (which I'd prefer to use) and I get the same results. Any help is appreciated. Most of the documentation and forums online seem to use an older version of the Java SDK than the one I'm using.
Here's the JSON
{
  "conf": { "type": "some" },
  "desc": "else"
}
Here's the query
DynamoDBQueryExpression<JobDO> queryExpression = new DynamoDBQueryExpression<JobDO>();
queryExpression.withFilterExpression("conf.Type = :type")
        .addExpressionAttributeValuesEntry(":type", new AttributeValue(type));
return dbMapper.query(getItemType(), queryExpression);
Is it a naming issue? (your sample json has "type" but the query is using "Type")
e.g. the following is working for me using DynamoDB Local:
public static void main(String [] args) {
AmazonDynamoDBClient client = new AmazonDynamoDBClient(new BasicAWSCredentials("akey1", "skey1"));
client.setEndpoint("http://localhost:8000");
DynamoDBMapper mapper = new DynamoDBMapper(client);
client.createTable(new CreateTableRequest()
.withTableName("nested-data-test")
.withAttributeDefinitions(new AttributeDefinition().withAttributeName("desc").withAttributeType("S"))
.withKeySchema(new KeySchemaElement().withKeyType("HASH").withAttributeName("desc"))
.withProvisionedThroughput(new ProvisionedThroughput().withReadCapacityUnits(1L).withWriteCapacityUnits(1L)));
NestedData u = new NestedData();
u.setDesc("else");
Map<String, String> c = new HashMap<String, String>();
c.put("type", "some");
u.setConf(c);
mapper.save(u);
DynamoDBQueryExpression<NestedData> queryExpression = new DynamoDBQueryExpression<NestedData>();
queryExpression.withHashKeyValues(u);
queryExpression.withFilterExpression("conf.#t = :type")
.addExpressionAttributeNamesEntry("#t", "type") // returns nothing if use "Type"
.addExpressionAttributeValuesEntry(":type", new AttributeValue("some"));
for(NestedData u2 : mapper.query(NestedData.class, queryExpression)) {
System.out.println(u2.getDesc()); // "else"
}
}
NestedData.java:
@DynamoDBTable(tableName = "nested-data-test")
public class NestedData {

    private String desc;
    private Map<String, String> conf;

    @DynamoDBHashKey
    public String getDesc() { return desc; }
    public void setDesc(String desc) { this.desc = desc; }

    @DynamoDBAttribute
    public Map<String, String> getConf() { return conf; }
    public void setConf(Map<String, String> conf) { this.conf = conf; }
}