Avoid duplicate messages being sent to Kafka in case of application crash - java

This REST endpoint is called every hour. How can I avoid duplicate data being sent to Kafka?
http://localhost:8080/sendData
The REST endpoint call does the following:
Calls a procedure to get the data (let's say the procedure returns 100 messages).
70 messages are sent to Kafka, and then suddenly my application crashes.
To send the remaining 30 messages, I call the procedure again, which again returns 100 messages.
So the first 70 messages are duplicated, which is what I need to avoid.
One solution is to have a flag (e.g. IS_SENT_TO_KAFKA) in the database for each message (row),
and to update this flag to IS_SENT_TO_KAFKA='Y' if the message was delivered to Kafka and IS_SENT_TO_KAFKA='N' if it was not.
But this solution also has one flaw: what if the application crashes while updating the flag? Then we would still end up with one duplicate message.
kafkaTemplate.send(topicName, message);
callback onSuccess() -> saveInDB('Y')   // IS_SENT_TO_KAFKA='Y'
callback onFailure() -> saveInDB('N')   // IS_SENT_TO_KAFKA='N'
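A rough sketch of that callback idea, assuming Spring Kafka 2.x (where send() returns a ListenableFuture) and a hypothetical MessageDao with an updateFlag(rowId, flag) method standing in for the real persistence code; none of these names come from the original setup:

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.util.concurrent.ListenableFutureCallback;

interface MessageDao {
    // hypothetical DAO: updates IS_SENT_TO_KAFKA for the given row
    void updateFlag(long rowId, String flag);
}

public class MessageSender {

    private final KafkaTemplate<String, String> kafkaTemplate;
    private final MessageDao messageDao;

    public MessageSender(KafkaTemplate<String, String> kafkaTemplate, MessageDao messageDao) {
        this.kafkaTemplate = kafkaTemplate;
        this.messageDao = messageDao;
    }

    public void send(String topicName, long rowId, String message) {
        kafkaTemplate.send(topicName, message)
            .addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
                @Override
                public void onSuccess(SendResult<String, String> result) {
                    messageDao.updateFlag(rowId, "Y"); // IS_SENT_TO_KAFKA='Y': skip this row next run
                }

                @Override
                public void onFailure(Throwable ex) {
                    messageDao.updateFlag(rowId, "N"); // IS_SENT_TO_KAFKA='N': retry this row next run
                }
            });
    }
}

As noted above, a crash between the send and the flag update can still leave one duplicate, so this only narrows the window rather than closing it.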

Related

Processing just one service bus topic message at a time

I am subscribed to an Azure Service Bus topic created by an external department. The way I want the code to work is as follows:
Trigger an HTTP endpoint that starts the processorClient and listens to the topic.
Fetch one message.
Perform the required actions on that message.
Close the processorClient connection.
Repeat.
I am using the ServiceBusProcessorClient class as shown in the documentation: Receive messages from a subscription.
Is there a way to use this code so that it fetches only one message from the topic at a time before calling processorClient.close()?
I have tried setting maxConcurrentCalls and prefetchCount(0).
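For reference, a rough sketch of the processor setup being described, assuming the azure-messaging-servicebus library; the connection string, topic, and subscription names are placeholders, and whether stopping the client inside the handler is the right way to limit it to one message is exactly what is being asked:

import com.azure.messaging.servicebus.ServiceBusClientBuilder;
import com.azure.messaging.servicebus.ServiceBusProcessorClient;

public class SingleMessageProcessor {

    public static void main(String[] args) {
        ServiceBusProcessorClient processorClient = new ServiceBusClientBuilder()
            .connectionString("<service-bus-connection-string>") // placeholder
            .processor()
            .topicName("<topic-name>")                           // placeholder
            .subscriptionName("<subscription-name>")             // placeholder
            .prefetchCount(0)       // setting tried in the question: no prefetching of extra messages
            .maxConcurrentCalls(1)  // setting tried in the question: one message handled at a time
            .disableAutoComplete()  // settle messages explicitly in the handler
            .processMessage(context -> {
                // do the required actions for this single message
                System.out.println("Received: " + context.getMessage().getBody());
                context.complete();
            })
            .processError(context -> System.err.println("Error: " + context.getException()))
            .buildProcessorClient();

        processorClient.start();
        // ... the open question is how to tie close() to "exactly one message has been processed"
        processorClient.close();
    }
}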

Message getting redelivered to RabbitMQ consumer setup using spring cloud stream

We have a Spring Boot service in which we are using delayed messaging with the setup below:
The initial queue (Queue 1) that gets the message has a TTL set; the queue also has a dead-letter exchange configured with a specific dead-letter routing key.
Another queue (Queue 2) is bound to the DLX of the previous queue using that dead-letter routing key.
A consumer listens to the messages on Queue 2.
The delayed messaging seems to work as expected, but I am seeing an issue with messages getting redelivered in certain scenarios.
If I set a breakpoint in my consumer and hold the message for some time just after reading it, then once the current message has been processed the consumer gets another message with the following properties:
Redelivered property set to true.
Property deliveryAttempt set to 1.
Only the first message has an x-death header; the redelivered messages do not seem to have it.
The message is delivered 3 times, once for each time I pause the consumer at the breakpoint after reading a redelivered message.
My understanding was that the acknowledgment mode is AUTO by default, so once the consumer has read the message it should not be redelivered?
I have tried using the maxAttempts=1 property but it does not seem to help.
I am using Spring Cloud Stream to create the consumers and the queues.
I used to run into this issue when message processing in the consumer failed (an exception was thrown). In that case, if you have a DLQ configured, make sure to add the following configuration as well, so the failed message is routed to the DLQ rather than back to the original listening queue:
rabbit:
  autoBindDlq: true
Otherwise, if you don't set up a DLQ, set autoBindDlq to false.
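For context, a minimal sketch of where that property typically sits in a Spring Cloud Stream Rabbit binder configuration; the binding name (input), destination, and group below are placeholders, not taken from the question:

spring:
  cloud:
    stream:
      bindings:
        input:
          destination: queue2-exchange   # placeholder destination
          group: my-group                # placeholder consumer group
      rabbit:
        bindings:
          input:
            consumer:
              autoBindDlq: true          # bind a DLQ automatically and route failed messages to it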

Google PubSub Java (Scala) Client Gets Excessive Amount of Resent Messages

I have a scenario where I load a subscription with around 1100 messages. I then start a Spark job which pulls messages from this subscription with these settings:
MaxOutstandingElementCount: 5
MaxAckExtensionPeriod: 60 min
AckDeadlineSeconds: 600
The first message to get processed starts a cache generation which takes about 30 minutes to complete. Any other messages arriving while this is going on are simply "returned" with no ack or nack. After that, a given message takes between 1 min and 30 mins to process. With an ack extension period of 60 min, I would never expect to see resending of messages.
The behaviour I am seeing is that while the initial cache is being generated, every 10 minutes 5 new messages are grabbed by the client and returned with no ack or nack by my code. This is unexpected. I would expect the deadline of the original 5 messages to be extended up to an hour.
Furthermore, after having processed and acked about 500 of the messages, I would expect around 600 left in the subscription, but I see almost the original 1100. These turn out to be resent duplicates, as I can see from the logging in my code. This is also very unexpected.
This is a screenshot from the Google Cloud console after around 500 messages have been processed and acked (ignore the first "hump"; that was an aborted test run):
Am I missing something?
Here is the setup code:
val subscriptionName = ProjectSubscriptionName.of(ConfigurationValues.ProjectId,
  ConfigurationValues.PubSubSubscription)

val topic = ProjectTopicName.of(ConfigurationValues.ProjectId,
  ConfigurationValues.PubSubSubscriptionTopic)

val pushConfig = PushConfig.newBuilder.build

val ackDeadlineSeconds = 600
subscriptionAdminClient.createSubscription(
  subscriptionName,
  topic,
  pushConfig,
  ackDeadlineSeconds)

val flowControlSettings = FlowControlSettings.newBuilder()
  .setMaxOutstandingElementCount(5L)
  .build()

// create a subscriber bound to the asynchronous message receiver
val subscriber = Subscriber
  .newBuilder(subscriptionName, new EtlMessageReceiver(spark))
  .setFlowControlSettings(flowControlSettings)
  .setMaxAckExtensionPeriod(Duration.ofMinutes(60))
  .build

subscriber.startAsync.awaitRunning()
Here is the code in the receiver which runs when a message arrives while the cache is being generated:
if (!BIQConnector.cacheGenerationDone) {
  Utilities.logLine(
    s"PubSub message for work item $uniqueWorkItemId ignored as cache is still being generated.")
  return
}
And finally when a message has been processed:
consumer.ack()
Utilities.logLine(s"PubSub message ${message.getMessageId} for $tableName acknowledged.")
// Write back to ETL Manager
Utilities.logLine(
s"Writing result message back to topic ${etlResultTopic} for table $tableName, $tableDetailsForLog.")
sendPubSubResult(importTableName, validTableName, importTimestamp, 2, etlResultTopic, stageJobData,
tableDetailsForLog, "Success", isDeleted)
Is your Spark job using a Pub/Sub client library to pull messages? These libraries should indeed keep extending your message deadlines up to the MaxAckExtensionPeriod you specified.
If your job is using a Pub/Sub client library, this is unexpected behavior. You should contact Google Cloud support with your project name, subscription name, client library version, and a sample of the message IDs from the messages you are "returning" without acking. They will be able to investigate further into why you're receiving these resent messages.

How to progress through event gateway using Activiti REST API

I have a process where, at some point, two different kinds of messages can occur, and if neither appears after a certain time, the workflow times out.
Based on the documentation, I have modelled the process using an event gateway:
To progress my Activiti workflow, I am using the Activiti REST API. However, I cannot find in the documentation how to send a message to the gateway in order to continue to either Message 1 or Message 2. I tried triggering the message on all execution IDs linked to my process ID, but to no avail.
What is the right REST API call to progress in this workflow?
Thanks for your support.
Edit 1:
It seems that the event gateway is subscribed to only one event.
It reacts to:
POST http://localhost:8082/activiti-rest/service/runtime/executions/20178
{"action":"messageEventReceived","messageName":"Message 1"}
and continues the process down the Message 1 path. However, with Message 2 defined exactly the same way (but with another message name), it returns a "subscription not found" error:
Execution with id '20178' does not have a subscription to a message event with name 'Message 2'
For an event-based gateway (https://www.activiti.org/userguide/#bpmnEventbasedGateway), the intermediate message/signal catching events are mutually exclusive: the process will follow only one path from the gateway, depending on which message is received. In your case, you have already fired Message 1, so the execution continues down the Message 1 path and the other message subscriptions are deleted. That is why you are getting the error.
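As a side note, if you need to check which execution actually holds the subscription for a given message before firing it, the executions collection can, as far as I know, be filtered by message subscription name; the host, port, and message name below mirror the ones used above, and the process instance id is a placeholder:

GET http://localhost:8082/activiti-rest/service/runtime/executions?messageEventSubscriptionName=Message%201&processInstanceId=<process-instance-id>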

Ensure Ordering of Few Messages in given Set of messages

Suppose I have 51 messages produced by a mobile app.
I want the 51st message to hit the application server only after all the other 50 are processed, but I don't need ordering for those 50 messages.
They can hit in any order (and should be processed in parallel).
Currently I am using Kafka as the message broker.
Major restriction: I cannot put any callback mechanism on the mobile app to send the 51st message only after the 50th has been received.
Any ideas around this, or links/pointers?
If you can send the 51 messages in order, synchronously (waiting for each send to be acknowledged before the next), and use the partitioner to route all messages of the sequence to the same partition (perhaps by making the user id the message key), you will be assured of reading those messages in order.
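A minimal sketch of that idea with the plain Java Kafka producer; the broker address, topic name (app-messages), and user id are placeholders, and blocking on send(...).get() makes each send synchronous so the records are written in order:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class OrderedSender {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");           // placeholder broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");                                    // wait for the broker to confirm each write

        String topic = "app-messages";                               // hypothetical topic name
        String userId = "user-42";                                   // key: all 51 messages share it -> same partition

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 1; i <= 51; i++) {
                ProducerRecord<String, String> record =
                    new ProducerRecord<>(topic, userId, "message-" + i);
                // blocking on get() makes the send synchronous, preserving order within the partition
                RecordMetadata meta = producer.send(record).get();
                System.out.printf("Wrote message %d to partition %d offset %d%n",
                                  i, meta.partition(), meta.offset());
            }
        }
    }
}

Because all 51 messages share the same key they land on a single partition, which is what gives the ordering guarantee, at the cost of that sequence going through one partition.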
