Ensure Ordering of a Few Messages in a Given Set of Messages - java

Suppose I have 51 messages produced by a mobile app.
I want the 51st message to hit the application server only after all 50 others have been processed, but I don't need ordering for the other 50 messages.
They can hit in any order (and should be processed in parallel).
Currently I am using Kafka as the message broker.
Major restriction: I cannot put any callback mechanism on the mobile app to send the 51st message only after the 50th message has been received.
Any ideas around this, or links/pointers?

If you can send the 51 messages in order (synchronously, waiting for each acknowledgement) and use the partitioner to send all of them in sequence to the same partition (for example, by making the user ID the message key), you will be assured of reading those messages in order.
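A minimal sketch of that approach, assuming a hypothetical topic name ("mobile-events") and user ID; giving every record the same key makes Kafka's default partitioner route them all to one partition, and calling get() on each send blocks until the broker acknowledges, keeping the sends in order:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderedSender {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("acks", "all");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String userId = "user-42"; // hypothetical key: same key => same partition
            for (int i = 1; i <= 51; i++) {
                // get() blocks until the broker acknowledges this record
                producer.send(new ProducerRecord<>("mobile-events", userId, "msg-" + i)).get();
            }
        }
    }
}

A consumer reading that partition then sees message 51 only after the first 50, since a single partition preserves order.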

Related

Avoid Duplicate Messages Being Sent to Kafka in Case of Application Crash

This REST endpoint is called every hour. How can I avoid duplicate data being sent to Kafka?
http://localhost:8080/sendData
The REST endpoint call does the below 4 operations:
1. Calls a procedure to get data (let's say 100 messages are returned by the procedure).
2. 70 messages are sent to Kafka, then suddenly my application crashes.
3. To send the remaining 30 messages, I call the procedure again, which again returns 100 messages.
4. So the 70 messages are duplicated, which is what needs to be avoided.
One solution is to have a flag (e.g. IS_SENT_TO_KAFKA) in the database for each message (row),
and to update this flag to IS_SENT_TO_KAFKA='Y' if the message was delivered to Kafka and IS_SENT_TO_KAFKA='N' if it was not.
But the above solution also has one flaw: what if the application crashes while updating the flag? Then we would still have one duplicate message.
kafkaTemplate.send(topicName, message);
callback onSuccess() --- saveInDB('Y');  // IS_SENT_TO_KAFKA='Y'
callback onFailure() --- saveInDB('N');  // IS_SENT_TO_KAFKA='N'
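A sketch of that flag update wired to the send callback, using Spring Kafka 2.x's ListenableFuture API; saveInDB and rowId are hypothetical stand-ins, and, as noted above, a crash between the broker acknowledgement and the database update can still leave one duplicate:

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.util.concurrent.ListenableFuture;
import org.springframework.util.concurrent.ListenableFutureCallback;

public class SendWithFlag {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public SendWithFlag(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void send(String topicName, String message, long rowId) {
        ListenableFuture<SendResult<String, String>> future =
                kafkaTemplate.send(topicName, message);
        future.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
            @Override
            public void onSuccess(SendResult<String, String> result) {
                saveInDB(rowId, "Y"); // IS_SENT_TO_KAFKA='Y'
            }

            @Override
            public void onFailure(Throwable ex) {
                saveInDB(rowId, "N"); // IS_SENT_TO_KAFKA='N'
            }
        });
    }

    // hypothetical helper: UPDATE ... SET IS_SENT_TO_KAFKA = ? WHERE id = ?
    private void saveInDB(long rowId, String flag) {
    }
}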

Google PubSub Java (Scala) Client Gets Excessive Amount of Resent Messages

I have a scenario where I load a subscription with around 1100 messages. I then start a Spark job which pulls messages from this subscription with these settings:
MaxOutstandingElementCount: 5
MaxAckExtensionPeriod: 60 min
AckDeadlineSeconds: 600
The first message to get processed starts a cache generation which takes about 30 minutes to complete. Any other messages arriving while this is going on are simply "returned" with no ack or nack. After that, a given message takes between 1 min and 30 mins to process. With an ack extension period of 60 min, I would never expect to see resending of messages.
The behaviour I am seeing is that while the initial cache is being generated, every 10 minutes 5 new messages are grabbed by the client and returned with no ack or nack by my code. This is unexpected. I would expect the deadline of the original 5 messages to be extended up to an hour.
Furthermore, after having processed and acked about 500 of the messages, I would expect around 600 left in the subscription, but I see almost the original 1100. These turn out to be resent duplicates, as I log these in my code. This is also very unexpected.
This is a screenshot from the Google console after around 500 messages have been processed and acked (ignore the first "hump"; that was an aborted test run).
Am I missing something?
Here is the setup code:
val name = ProjectSubscriptionName.of(ConfigurationValues.ProjectId,
  ConfigurationValues.PubSubSubscription)
val topic = ProjectTopicName.of(ConfigurationValues.ProjectId,
  ConfigurationValues.PubSubSubscriptionTopic)
val pushConfig = PushConfig.newBuilder.build
val ackDeadlineSeconds = 600
subscriptionAdminClient.createSubscription(
  name,
  topic,
  pushConfig,
  ackDeadlineSeconds)

val flowControlSettings = FlowControlSettings.newBuilder()
  .setMaxOutstandingElementCount(5L)
  .build()

// create a subscriber bound to the asynchronous message receiver
val subscriber = Subscriber
  .newBuilder(name, new EtlMessageReceiver(spark))
  .setFlowControlSettings(flowControlSettings)
  .setMaxAckExtensionPeriod(Duration.ofMinutes(60))
  .build
subscriber.startAsync.awaitRunning()
Here is the code in the receiver which runs when a message arrives while the cache is being generated:
if (!BIQConnector.cacheGenerationDone) {
  Utilities.logLine(
    s"PubSub message for work item $uniqueWorkItemId ignored as cache is still being generated.")
  return
}
And finally when a message has been processed:
consumer.ack()
Utilities.logLine(s"PubSub message ${message.getMessageId} for $tableName acknowledged.")

// Write back to ETL Manager
Utilities.logLine(
  s"Writing result message back to topic ${etlResultTopic} for table $tableName, $tableDetailsForLog.")
sendPubSubResult(importTableName, validTableName, importTimestamp, 2, etlResultTopic, stageJobData,
  tableDetailsForLog, "Success", isDeleted)
Is your Spark job using a Pub/Sub client library to pull messages? These libraries should indeed keep extending your message deadlines up to the MaxAckExtensionPeriod you specified.
If your job is using a Pub/Sub client library, this is unexpected behavior. You should contact Google Cloud support with your project name, subscription name, client library version, and a sample of the message IDs from the messages you are "returning" without acking. They will be able to investigate further into why you're receiving these resent messages.
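For reference, the client's receiver contract expects each delivered message to be acked or nacked; below is a minimal Java sketch of that contract (the class and flag names here are assumed, not the asker's code), where nacking a message the job cannot process yet returns it for prompt redelivery instead of leaving it leased:

import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.pubsub.v1.PubsubMessage;

public class EtlMessageReceiverSketch implements MessageReceiver {

    private volatile boolean cacheReady = false; // assumed application state

    @Override
    public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
        if (!cacheReady) {
            consumer.nack(); // hand the message back immediately
            return;
        }
        // ... process the message ...
        consumer.ack();
    }
}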

Get messages that was issued before MQTT subscriber was started [duplicate]

I'm new to MQTT. There is a simple range of numbers which I want to print. I have created 2 files, where the 1st file sends data to the 2nd file. The scripts are as follows:
sender.py
import paho.mqtt.client as mqtt
client = mqtt.Client()
client.connect("192.168.1.169", 1883, 60)
for i in range(1, 100):
    client.publish("TestTopic", i)
    print(i)
client.disconnect()
receiver.py:
import paho.mqtt.client as mqtt
def on_connect(client, userdata, flags, rc):
    print("Connected with result code " + str(rc))
    client.subscribe("TestTopic")  # must match the topic the sender publishes to

def on_message(client, userdata, msg):
    # print(msg.topic + " " + str(msg.payload))
    print("message received ", str(msg.payload.decode("utf-8")))
    print("message topic=", msg.topic)
    print("message qos=", msg.qos)
    print("message retain flag=", msg.retain)

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("192.168.1.169", 1883, 60)
client.loop_forever()
I'm able to print the data if the receiver file is already running, but I have a problem printing it if I start the sender file first and then start the receiver file. The main question is: does MQTT follow a queueing mechanism or not? If yes, then when I run the sender file, all its data should be queued, and when I then run the receiver file, it should get printed. But it's not working that way. I went through lots of documents but wasn't able to find any relevant info. Recently I found clean_session; if someone has knowledge about this, please tell me. If you have any questions about my code or anything else, please let me know.
Thanks
MQTT is a pub/sub protocol, not a message queuing system.
This means that, under normal circumstances, if there is no subscriber running when a message is published, it will not be delivered.
It is possible to get the broker to queue messages for a specific subscriber, but this requires the subscriber to have connected before the message is published and to have subscribed with a QoS greater than 0. Then, as long as it reconnects after the publish with the clean session flag set to false and the same client ID, the broker will deliver the missed messages.
Retained messages are something different. If a message is published with the retained flag set to true, then the broker will deliver this single message to every subscriber when they subscribe to the matching topic. There can only ever be one retained message for a given topic.
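A minimal sketch of that queued-delivery setup using the Eclipse Paho Java client (broker address and client ID are placeholders): the subscriber connects with cleanSession=false, a fixed client ID, and QoS 1; note the sender must also publish with a QoS greater than 0 for the broker to queue messages while the subscriber is offline:

import org.eclipse.paho.client.mqttv3.IMqttDeliveryToken;
import org.eclipse.paho.client.mqttv3.MqttCallback;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class DurableReceiver {
    public static void main(String[] args) throws Exception {
        // Fixed client ID: the broker keys the offline message queue to it
        MqttClient client = new MqttClient("tcp://192.168.1.169:1883", "receiver-1");

        MqttConnectOptions options = new MqttConnectOptions();
        options.setCleanSession(false); // keep the session (and queued messages) across reconnects

        client.setCallback(new MqttCallback() {
            public void connectionLost(Throwable cause) { }
            public void messageArrived(String topic, MqttMessage message) {
                System.out.println("message received " + new String(message.getPayload()));
            }
            public void deliveryComplete(IMqttDeliveryToken token) { }
        });

        client.connect(options);
        client.subscribe("TestTopic", 1); // QoS 1: broker queues messages while we are offline
    }
}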

COD received has correlationID and body as null

We are sending a message to a WebSphere MQ queue. While sending the message we set the REPLY TO QUEUE NAME and the JMSCorrelationID. We also set the USER IDENTIFIER. The code snippet is as follows.
Message msg = session.createTextMessage((String) message);
Destination codeDestination = session.createQueue("queue://" + replyToQueueMgr + "/" + replyToQueueName);
msg.setJMSReplyTo(codeDestination);
msg.setIntProperty(JmsConstants.JMS_IBM_REPORT_COD, MQC.MQRO_COD);
msg.setJMSCorrelationID(msgCorrelId);
msg.setStringProperty(JmsConstants.JMS_IBM_MQMD_USERIDENTIFIER, "abc");
producer.send(msg);
Please note that we have ensured that all the fields we set are not null. Also, the user abc is a valid user, because if it were not, the CODs would go to the DEAD LETTER QUEUE, and there are no messages in it. Still, after the message is picked up we get a COD which has a null JMSCorrelationID. In the COD processor we are listening on the replyToQueueName.
String correlationID = (String)eventContext.getMessage().getInboundProperty("JMSCorrelationID");
On checking, the above correlation ID is null. Also, the message payload is of {null_payload} type, i.e. Mule's NullPayload class. I know that the body would be null because we set MQC.MQRO_COD, but I don't understand how the correlationId got wiped out.
Please advise whether there is any configuration at the WebSphere MQ end which could cause such behaviour, or whether there is something missing in the way we are setting the header properties.
UPDATE
The queue that we are sending the message (with COD information) to is an alias to a TOPIC. There are 2 subscribers to this TOPIC, and we observed instances where we received multiple CODs when both subscribers picked up the messages. Is there any way to ensure that the TOPIC sends a single COD after all the subscribers have picked up the message? Could this queue manager setup be the cause of the COD with null?
User Identifier
When a message is published, each subscriber gets a copy of the message with a unique message ID and with the identity context fields in the MQMD (UserID, AccountingToken, ApplIdentityData) all set to the subscriber's context. So it doesn't matter what you set in the MQMD UserID of the message you publish; all the copies will have the subscriber's user ID in them. This user ID will, by definition, exist where the subscriber is, so the CODs will be able to be put.
Correlation ID with Pub/Sub
You can ensure the correlation ID from the publisher is sent all the way through to the subscriber, by ensuring the subscription is made using MQSO_SET_CORREL_ID and setting the MQSO SubCorrelId to MQCI_NONE.
One COD for multiple messages
Since there are multiple independent messages, each with the COD report option set, you will get multiple report messages. There is no setting to combine these; however, you could write an intermediate application to combine them if your main application wants only one.
Passing Correl ID back in a Report Message
By default, the report option will send back the message ID in the Correl ID of the report message. If you want the Correl ID passed back instead, you should use MQRO_PASS_CORREL_ID.
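For illustration, a sketch of how that would look in the question's JMS snippet, reusing msg, producer, and msgCorrelId from above; JMS_IBM_REPORT_PASS_CORREL_ID and MQRO_PASS_CORREL_ID are the IBM MQ classes-for-JMS constant names as I understand them, so verify them against your client version:

// Request a COD report (as in the question) and additionally ask the
// queue manager to pass our correlation ID back in the report message
msg.setIntProperty(JmsConstants.JMS_IBM_REPORT_COD, MQC.MQRO_COD);
msg.setIntProperty(JmsConstants.JMS_IBM_REPORT_PASS_CORREL_ID, MQC.MQRO_PASS_CORREL_ID);
msg.setJMSCorrelationID(msgCorrelId);
producer.send(msg);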
Further Reading
Report Option
The issue is not with the MQ configuration but with the Mule endpoint configuration. The CODs which were sent with a null payload were actually sent by my own Mule application's JmsReplyToHandler. There is some default configuration in Mule which seems to be causing this behavior.
Analysis
The application sends a message to the queue, which is an alias to a TOPIC with two subscribers.
Once both subscribers consume the message, we get 2 CODs, as expected.
These CODs are consumed by my Mule application, and after processing, the Mule application again sends a COD to the same queue with a null correlation ID.
UPDATE: Mule Fix to avoid default ReplyTo
For the fix, you need to override the getReplyToHandler method of the Mule JmsConnector as follows (the enclosing method signature is reconstructed here from the Mule 3 API):

@Override
public ReplyToHandler getReplyToHandler(ImmutableEndpoint endpoint) {
    if (disableReplyTo) {
        return new DisableJmsReplyToHandler(this, getDefaultResponseTransformers(endpoint));
    } else {
        return super.getReplyToHandler(endpoint);
    }
}
Set the property disableReplyTo to true so that the above code returns the DisableJmsReplyToHandler instead of the default one.

RabbitMQ - Get total count of messages enqueued

I have a Java client which monitors a RabbitMQ queue. I am able to get the count of messages currently in the queue with this code:
@Resource
RabbitAdmin rabbitAdmin;

..........

DeclareOk declareOk = rabbitAdmin.getRabbitTemplate().execute(new ChannelCallback<DeclareOk>() {
    public DeclareOk doInRabbit(Channel channel) throws Exception {
        return channel.queueDeclarePassive("test.pending");
    }
});
return declareOk.getMessageCount();
I want to get some additional details like:
The message body of currently enqueued items.
The total number of messages that were enqueued in the queue since the queue was created.
Is there any way to retrieve this data in the Java client?
With the AMQP protocol (including RabbitMQ's implementation), you can't get such info with a 100% guarantee.
The closest number to the messages count is the count returned with queue.declare-ok (AMQP.Queue.DeclareOk in the Java AMQP client library).
While the messages count you receive with queue.declare-ok may match the exact number of messages enqueued, you can't rely on it, as it doesn't count messages awaiting acknowledgement or messages published to the queue during a transaction that is not yet committed.
It really depends on what kind of precision you need.
As to the enqueued message bodies, you may want to manually extract all messages in the queue, view their bodies, and put them back in the queue. This is the only way to do what you want.
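A hedged sketch of that extract-and-requeue inspection with the RabbitMQ Java client (host and queue name reuse the question's values): basicGet with autoAck=false holds each message unacknowledged while its body is printed, and a single basicNack with multiple=true and requeue=true then puts them all back; note the requeued messages come back flagged as redelivered, and their order may change:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.GetResponse;

public class PeekQueue {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker host
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            long lastTag = -1;
            GetResponse response;
            // autoAck=false: messages stay unacked (and are not re-fetched) while we hold them
            while ((response = channel.basicGet("test.pending", false)) != null) {
                lastTag = response.getEnvelope().getDeliveryTag();
                System.out.println(new String(response.getBody(), "UTF-8"));
            }
            if (lastTag >= 0) {
                // multiple=true, requeue=true: put every held message back on the queue
                channel.basicNack(lastTag, true, true);
            }
        }
    }
}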
You can get some information about message counts with the Management Plugin, the RabbitMQ Management HTTP API, and the rabbitmqctl util (see list_queues, list_channels).
You can't get the total count of messages published since the queue was created, and I think nobody has implemented such stats, since they would be of little use (FYI, with a message flow averaging 10k per second you would not even reach uint64 in a few thousand years).
AMQP.Queue.DeclareOk dok = channel.queueDeclare(QUEUE_NAME, true, false, false, queueArgs);
dok.getMessageCount();
To access queue details via the HTTP API:
http://public-domain-name:15672/api/queues/%2f/queue_name
To access queue details via the command line from a localhost prompt:
curl -i -u guest_uname:guest_password http://localhost:15672/api/queues/%2f/queue_name
where %2f is the default vhost "/".
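For completeness, the same HTTP API call from Java, a sketch using the placeholder host, credentials, and queue name from the curl example; the returned JSON includes fields such as messages and message_stats:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class QueueDetails {
    public static void main(String[] args) throws Exception {
        // %2f is the default vhost "/", URL-encoded
        URL url = new URL("http://localhost:15672/api/queues/%2f/queue_name");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        String auth = Base64.getEncoder()
                .encodeToString("guest_uname:guest_password".getBytes("UTF-8"));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}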
