stateStore.delete(key) in Kafka is not working - java

I have what I thought would be a simple state store use case. We loop through a state store every 10s and try to send each entry to a partner; if we receive a 404, we try again at the next interval.
If we receive a 200, we delete the entry from the state store.
In my test (1 entry in the state store) I first let it run a few loops where we receive 404, just to verify that the retry works. When I switch my mock endpoint to return 200, I can see through the logs that both
stateStore.delete(key) and stateStore.flush() are called. I even confirm after stateStore.delete(key) that stateStore.get(key) returns a null value (tombstone).
However, the next time the punctuator runs (10s later), the object is still in the state store and the entire block runs again. It keeps looping like this without ever deleting the entry from the state store.
@Override
public void punctuate(long timestamp) {
    log.info("PeriodicRetryPunctuator started: " + timestamp);
    try (KeyValueIterator<String, TestEventObject> iter = stateStore.all()) {
        while (iter.hasNext()) {
            KeyValue<String, TestEventObject> keyValue = iter.next();
            String key = keyValue.key;
            TestEventObject event = keyValue.value;
            try {
                log.info("Event: " + event);
                // Sends event over HTTP. Will throw HttpResponseException if 404 is received
                eventService.processEvent(event);
                stateStore.delete(key);
                stateStore.flush();
                // Check that the state store returns null
                log.info("Check: " + stateStore.get(key));
            } catch (HttpResponseException hre) {
                log.info("Periodic retry received 404. Retrying at next interval");
            } catch (Exception e) {
                e.printStackTrace();
                log.error("Exception with periodic retry: {}", e.getMessage());
            }
        }
    }
}

Update:
It seems to be Confluent's encryption libraries that cause these issues. I've done a fairly extensive A/B test, and the problem only occurs with Confluent encryption enabled; without it I never see this issue.

Unable to DM (send direct message) to individual user (users) using slack api in Java

I am trying to send a message to an individual user using the Slack API in Java, but I am not able to. I tried various approaches, but none helped. I am able to send messages to channels, but not to a user directly.
Here is my code:
public void postToUsers() {
    var client = Slack.getInstance().methods();
    try {
        ChatPostMessageResponse response = Slack.getInstance().methods().chatPostMessage(r -> r
                .token("xoxb-2850123307073-2837743345451-e3Y8y4cahtzeLAKIbjUHMMuC")
                .channel("U02ABCGV9V")
                .text("hello"));
        System.out.println("::::" + response.getMessage());
    } catch (NoSuchElementException noSuchElementException) {
        logger.error("No record for slack notification exists for id: " + id);
    } catch (IOException | SlackApiException e) {
        logger.error("error: {}", e.getMessage(), e);
        e.printStackTrace();
    }
}
Instead of posting to the user whose user ID I set as the channel above, I see the message under the app name in Slack. I am not sure what's happening. Here is the response I get:
Message(type=message, subtype=null, team=T02R09Q9125, channel=null, user=U02QMMV6JD9, username=null, text=hello, blocks=null, attachments=null, ts=1639724593.000400, threadTs=null, intro=false, starred=false, wibblr=false, pinnedTo=null, reactions=null, botId=B02QF0PEFQE, botLink=null, displayAsBot=false, botProfile=BotProfile(id=B02QF0PEFQE, deleted=false, name=slack-notification-service, updated=1639470131, appId=A02Q72AKEFR, icons=BotProfile.Icons(image36=https://a.slack-edge.com/80588/img/plugins/app/bot_36.png, image48=https://a.slack-edge.com/80588/img/plugins/app/bot_48.png, image72=https://a.slack-edge.com/80588/img/plugins/app/service_72.png), teamId=T02R09Q9125), icons=null, file=null, files=null, upload=false, parentUserId=null, inviter=null, clientMsgId=null, comment=null, topic=null, purpose=null, edited=null, unfurlLinks=false, unfurlMedia=false, threadBroadcast=false, replies=null, replyCount=null, replyUsers=null, replyUsersCount=null, latestReply=null, subscribed=false, xFiles=null, lastRead=null, root=null, itemType=null, item=null)
The channel comes back as null.
These are the scopes I have granted to see if it works, but it didn't help.
One would first have to create the conversation:
https://api.slack.com/methods/conversations.create
Then translate the conversation name to a conversationId (if it is not already known), and it can be addressed in a direct message. See "picking the right conversation". The API has further methods: https://api.slack.com/docs/conversations-api#methods - even though the raw API isn't the Java client, it clearly shows what is possible, regardless of the client.
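As a rough sketch of that flow with the official Slack Java SDK (com.slack.api): open (or look up) the one-to-one DM conversation with conversations.open, then post to the channel ID it returns. The token lookup is a placeholder, the user ID is the one from the question, and the bot needs the im:write and chat:write scopes.
import com.slack.api.Slack;
import com.slack.api.methods.MethodsClient;
import com.slack.api.methods.SlackApiException;
import com.slack.api.methods.response.chat.ChatPostMessageResponse;
import com.slack.api.methods.response.conversations.ConversationsOpenResponse;

import java.io.IOException;
import java.util.Collections;

public class DirectMessageSketch {
    public static void main(String[] args) throws IOException, SlackApiException {
        String botToken = System.getenv("SLACK_BOT_TOKEN"); // placeholder: your bot token
        String userId = "U02ABCGV9V";                       // the target user's ID

        MethodsClient methods = Slack.getInstance().methods(botToken);

        // Open (or look up) the one-to-one DM conversation with the user.
        ConversationsOpenResponse open = methods.conversationsOpen(r -> r
                .users(Collections.singletonList(userId)));
        String dmChannelId = open.getChannel().getId();

        // Post to the DM channel ID rather than passing the user ID as the channel.
        ChatPostMessageResponse post = methods.chatPostMessage(r -> r
                .channel(dmChannelId)
                .text("hello"));
        System.out.println("ok=" + post.isOk() + ", channel=" + post.getChannel());
    }
}
The message then shows up in the user's direct-message list rather than under the app name.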

Azure ServiceBusSessionReceiverAsyncClient - Mono instead of Flux

I have a Spring Boot app where I receive a single message from an Azure Service Bus queue session.
The code is:
@Autowired
ServiceBusSessionReceiverAsyncClient apiMessageQueueIntegrator;
.
.
.
Mono<ServiceBusReceiverAsyncClient> receiverMono = apiMessageQueueIntegrator.acceptSession(sessionid);
Disposable subscription = Flux.usingWhen(receiverMono,
        receiver -> receiver.receiveMessages(),
        receiver -> Mono.fromRunnable(() -> receiver.close()))
    .subscribe(message -> {
        // Process message.
        logger.info(String.format("Message received from queue. Session id: %s. Contents: %s%n",
                message.getSessionId(), message.getBody()));
        receivedMessage.setReceivedMessage(message);
        timeoutCheck.countDown();
    }, error -> {
        logger.info("Queue error occurred: " + error);
    });
As I am receiving only one message from the session, I use a CountDownLatch(1) to dispose of the subscription when I have received the message.
The documentation of the library says that it is possible to use Mono.usingWhen instead of Flux.usingWhen if I only expect one message, but I cannot find any examples of this anywhere, and I have not been able to figure out how to rewrite this code to do this.
How would the pasted code look if I were to use Mono.usingWhen instead?
Thank you conniey. Posting your suggestion as an answer to help other community members.
By default receiveMessages() is a Flux because we imagine the messages from a session to be "infinitely long". In your case, you only want the first message in the stream, so we use the next() operator.
The usage of the countdown latch is probably not the best approach. In the sample, we had one there so that the program didn't end before the messages were received. .subscribe() is not a blocking call; it sets up the handlers and moves on to the next line of code.
Mono<ServiceBusReceiverAsyncClient> receiverMono = sessionReceiver.acceptSession("greetings-id");
Mono<ServiceBusReceivedMessage> singleMessageMono = Mono.usingWhen(receiverMono,
        receiver -> {
            // Anything you wish to do with the receiver.
            // In this case we only want the first message, so we use the "next" operator.
            // This returns a Mono.
            return receiver.receiveMessages().next();
        },
        receiver -> Mono.fromRunnable(() -> receiver.close()));

try {
    // Turns this into a blocking call. .block() waits indefinitely, so we have a timeout.
    ServiceBusReceivedMessage message = singleMessageMono.block(Duration.ofSeconds(30));
    if (message != null) {
        // Process message.
    }
} catch (Exception error) {
    System.err.println("Error occurred: " + error);
}
You can refer to the GitHub issue: ServiceBusSessionReceiverAsyncClient - Mono instead of Flux.

How to ensure messages reach kafka broker?

I have a message producer on my local machine and a broker on a remote host (AWS).
After sending a message from the producer, I wait and run the console consumer on the remote host, and I see excessive logs but not the value from the producer.
The producer flushes the data after calling the send method.
Everything is configured correctly.
How can I check that the broker received the message from the producer, and whether the producer received an acknowledgement?
The send method asynchronously sends the message to the topic and returns a Future of RecordMetadata:
java.util.concurrent.Future<RecordMetadata> send(ProducerRecord<K,V> record)
Asynchronously sends a record to a topic.
After the flush call, check that the Future has completed by calling the isDone method (for example, Future.isDone() == true).
Invoking this method makes all buffered records immediately available to send (even if linger.ms is greater than 0) and blocks on the completion of the requests associated with these records. The post-condition of flush() is that any previously sent record will have completed (e.g. Future.isDone() == true). A request is considered completed when it is successfully acknowledged according to the acks configuration you have specified or else it results in an error.
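As a minimal sketch of that check (assuming an already-configured KafkaProducer<String, String> named producer and a topic named "the-topic"), the Future returned by send can be inspected after flush returns:
ProducerRecord<String, String> record = new ProducerRecord<>("the-topic", "key", "value");
Future<RecordMetadata> future = producer.send(record);

// flush() blocks until every buffered record has completed (acknowledged per acks, or failed).
producer.flush();

// Post-condition of flush(): the Future of any previously sent record is done.
System.out.println("Completed: " + future.isDone());
try {
    RecordMetadata metadata = future.get(); // surfaces send failures as ExecutionException
    System.out.println("Partition: " + metadata.partition() + ", offset: " + metadata.offset());
} catch (InterruptedException | ExecutionException e) {
    e.printStackTrace();
}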
The RecordMetadata contains the offset and the partition:
public int partition()
    The partition the record was sent to
public long offset()
    The offset of the record, or -1 if hasOffset() returns false
Alternatively, you can use a Callback to confirm whether the message was sent to the topic:
Fully non-blocking usage can make use of the Callback parameter to provide a callback that will be invoked when the request is complete.
Here is a clear example from the docs:
ProducerRecord<byte[], byte[]> record = new ProducerRecord<byte[], byte[]>("the-topic", key, value);
producer.send(record,
        new Callback() {
            public void onCompletion(RecordMetadata metadata, Exception e) {
                if (e != null) {
                    e.printStackTrace();
                } else {
                    System.out.println("The offset of the record we just sent is: " + metadata.offset());
                }
            }
        });
You can also call get() on the Future returned by send(), which blocks until the RecordMetadata is available:
ProducerRecord<String, String> record =
        new ProducerRecord<>("SampleTopic", "SampleKey", "SampleValue");
try {
    producer.send(record).get();
} catch (Exception e) {
    e.printStackTrace();
}
Use exactly-once delivery and you won't need to worry about whether your message arrived or not: https://www.baeldung.com/kafka-exactly-once, https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
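As a rough sketch of the producer side of that (the broker address and serializers below are placeholders, and full exactly-once processing also involves transactions and consumer settings), enabling the idempotent producer and full acknowledgements looks roughly like this:
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092"); // placeholder: your remote broker
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

// Idempotent producer: retries cannot introduce duplicates within a partition.
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
// Wait for all in-sync replicas to acknowledge each record.
props.put(ProducerConfig.ACKS_CONFIG, "all");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);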

CompletionStage.thenCompose not executing serially

I'm trying to use Java 8 CompletionStages to execute 2 asynchronous methods serially, so that the second is not executed if the first fails. But when I call thenCompose, the function passed in seems to start before the previous stage is complete (i.e. the two functions erroneously execute in parallel). Here is the code:
public CompletionStage<Graph> create(Payload payload) {
    CompletionStage<BlobInfo> fileFuture = createFile(payload);
    CompletionStage<Entity> metadataFuture = createMetadata(payload);
    return fileFuture
            .thenCompose(ignore -> metadataFuture)
            .thenApply(entity -> buildFromEntity(objectMapper, entity));
}

public CompletionStage<BlobInfo> createFile(Payload payload) {
    return CompletableFuture.supplyAsync(() -> {
        try {
            return storage.create(
                    BlobInfo.newBuilder(payload.bucket, payload.name).build(),
                    payload.data.getBytes());
        } catch (StorageException e) {
            LOG.error("Failed to write to storage: " + e);
            throw new RequestHandlerException(StatusCode.SERVER_ERROR,
                    "Failed to write to storage.");
        }
    });
}

public CompletionStage<Entity> createMetadata(Payload payload) {
    return CompletableFuture.supplyAsync(() -> createMetadataSync(payload));
}

private Entity createMetadataSync(Payload payload) {
    Key key = keyFactory.newKey(payload.id);
    Entity.Builder entityBuilder = GraphPayload.buildEntityFromGraph(payload, key);
    Entity entity = entityBuilder.build();
    LOG.error("Metadata.createSync");
    try {
        datastore.add(entity);
    } catch (DatastoreException e) {
        LOG.error("Failed to write initial metadata: " + e);
        throw new RequestHandlerException(StatusCode.SERVER_ERROR,
                "Failed to write initial metadata.");
    }
    return entity;
}
OUTPUT:
16:57:47.530 [ForkJoinPool.commonPool-worker-3] ERROR com.spotify.nfgraphstore.store.FileStore - CreateFile
16:57:47.530 [ForkJoinPool.commonPool-worker-2] ERROR com.spotify.nfgraphstore.store.MetadataStore - Metadata.createSync
16:57:47.530 [ForkJoinPool.commonPool-worker-3] ERROR com.spotify.nfgraphstore.store.FileStore - Failed to write initial graph to storage: com.google.cloud.storage.StorageException: X
The logged output demonstrates that Metadata.createSync is executed before the Storage exception gets thrown. This conclusion is also borne out by a test (not shown) which is supposed to show zero interactions with the metadata DB if the write to the file storage DB fails. That test sometimes fails, suggesting a race condition.
So I'm left thinking thenCompose does not guarantee serial execution. But everything I've read in the Java docs suggests execution should be serial: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletionStage.html#thenCompose-java.util.function.Function-
Does anyone know why execution is not guaranteed to be serial, or can you recommend other functions that might work more as I intended?
The call to createMetadata launches the task immediately, because it is not called as part of the lambda expression passed to thenCompose.
Perhaps you meant to do this:
.thenCompose(ignore -> createMetadata(payload))
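Applied to the create method from the question, that deferral could look like this (a sketch keeping the original helper names):
public CompletionStage<Graph> create(Payload payload) {
    // createMetadata is now invoked inside the lambda, so it only runs
    // after createFile's stage has completed successfully.
    return createFile(payload)
            .thenCompose(ignore -> createMetadata(payload))
            .thenApply(entity -> buildFromEntity(objectMapper, entity));
}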

Parallel processing using collection of CompletableFuture supplyAsync then collecting results

// Unit of logic I want to run in parallel
public PagesDTO convertOCRStreamToDTO(String pageId, Integer pageSequence) throws Exception {
    LOG.info("Get OCR begin for pageId [{}] thread name {}", pageId, Thread.currentThread().getName());
    OcrContent ocrContent = getOcrContent(pageId);
    OcrDTO ocrData = populateOCRData(ocrContent.getInputStream());
    PagesDTO pageDTO = new PagesDTO(pageId, pageSequence.toString(), ocrData);
    return pageDTO;
}
Logic to execute convertOCRStreamToDTO(..) in parallel and then collect the results when each task's execution is done:
List<PagesDTO> pageDTOList = new ArrayList<>();
// javadoc: Creates a work-stealing thread pool using all available processors as its target parallelism level.
ExecutorService newWorkStealingPool = Executors.newWorkStealingPool();
Instant start = Instant.now();
List<CompletableFuture<PagesDTO>> pendingTasks = new ArrayList<>();
List<CompletableFuture<PagesDTO>> completedTasks = new ArrayList<>();
CompletableFuture<PagesDTO> task = null;
for (InputPageDTO dcInputPageDTO : dcReqDTO.getPages()) {
    String pageId = dcInputPageDTO.getPageId();
    task = CompletableFuture
            .supplyAsync(() -> {
                try {
                    return convertOCRStreamToDTO(pageId, pageSequence.getAndIncrement());
                } catch (HttpHostConnectException | ConnectTimeoutException e) {
                    LOG.error("Error connecting to Redis for pageId [{}]", pageId, e);
                    CaptureException e1 = new CaptureException(Error.getErrorCodes().get(ErrorCodeConstants.REDIS_CONNECTION_FAILURE),
                            "Connecting to Redis failed while getting OCR for pageId [" + pageId + "] " + e.getMessage(), CaptureErrorComponent.REDIS_CACHE, e);
                    exceptionMap.put(pageId, e1);
                } catch (CaptureException e) {
                    LOG.error("Error in Document Classification Engine Service while getting OCR for pageId [{}]", pageId, e);
                    exceptionMap.put(pageId, e);
                } catch (Exception e) {
                    LOG.error("Error getting OCR content for the pageId [{}]", pageId, e);
                    CaptureException e1 = new CaptureException(Error.getErrorCodes().get(ErrorCodeConstants.TECHNICAL_FAILURE),
                            "Error while getting ocr content for pageId : [" + pageId + "] " + e.getMessage(), CaptureErrorComponent.REDIS_CACHE, e);
                    exceptionMap.put(pageId, e1);
                }
                return null;
            }, newWorkStealingPool);
    // collect all async tasks
    pendingTasks.add(task);
}
// TODO: How to avoid the unnecessary looping happening here just for the sake of waiting for the future tasks to complete?
// TODO: Looking for the best solution
while (pendingTasks.size() > 0) {
    for (CompletableFuture<PagesDTO> futureTask : pendingTasks) {
        if (futureTask != null && futureTask.isDone()) {
            completedTasks.add(futureTask);
            pageDTOList.add(futureTask.get());
        }
    }
    pendingTasks.removeAll(completedTasks);
}
// Throw the exception caught while converting the OCR stream to DTO - for any of the pageIds
for (InputPageDTO dcInputPageDTO : dcReqDTO.getPages()) {
    if (exceptionMap.containsKey(dcInputPageDTO.getPageId())) {
        CaptureException e = exceptionMap.get(dcInputPageDTO.getPageId());
        throw e;
    }
}
LOG.info("Parallel processing time taken for {} pages = {}", dcReqDTO.getPages().size(),
        org.springframework.util.StringUtils.deleteAny(Duration.between(start, Instant.now()).toString().toLowerCase(), "pt-"));
Please look at the TODO items in my code above; I have two concerns I am looking for advice on:
1) I want to avoid the unnecessary looping happening in the while loop above. What is the best way to wait for all tasks to complete their async execution and then collect the results? Does anybody have advice?
2) The ExecutorService instance is created at my service bean class level, thinking that it will be re-used for every request, instead of creating it local to the method and shutting it down in a finally block. Am I doing the right thing here, or is there any correction to my thought process?
Simply remove the while and the if and you are good:
for (CompletableFuture<PagesDTO> futureTask : pendingTasks) {
    completedTasks.add(futureTask);
    pageDTOList.add(futureTask.get());
}
get() (as well as join()) will wait for the future to complete before returning a value. Also, there is no need to test for null since your list will never contain any.
You should however probably change the way you handle exceptions. CompletableFuture has a specific mechanism for handling them and rethrowing them when calling get()/join(). You might simply want to wrap your checked exceptions in CompletionException.
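A rough sketch of that approach, reusing the names from the question (convertOCRStreamToDTO, newWorkStealingPool, dcReqDTO, pageSequence) and treating everything else as placeholders: the supplier rethrows checked exceptions wrapped in CompletionException, and the caller collects results with join() and unwraps the cause on failure.
List<CompletableFuture<PagesDTO>> tasks = new ArrayList<>();
for (InputPageDTO page : dcReqDTO.getPages()) {
    String pageId = page.getPageId();
    tasks.add(CompletableFuture.supplyAsync(() -> {
        try {
            return convertOCRStreamToDTO(pageId, pageSequence.getAndIncrement());
        } catch (Exception e) {
            // Wrap checked exceptions so they propagate through the future.
            throw new CompletionException(e);
        }
    }, newWorkStealingPool));
}

try {
    // join() waits for each future and rethrows a CompletionException if it failed.
    List<PagesDTO> pageDTOList = tasks.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    // use pageDTOList ...
} catch (CompletionException ce) {
    Throwable cause = ce.getCause(); // the original CaptureException, IOException, etc.
    LOG.error("OCR conversion failed", cause);
}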
