Get messages that was issued before MQTT subscriber was started [duplicate] - java

I'm new in MQTT there is a simple range of numbers which I want to print I have created 2 files in which the 1st file whose send data to the 2nd file and the script is like that:
sender.py
import paho.mqtt.client as mqtt
client = mqtt.Client()
client.connect("192.168.1.169", 1883, 60)
for i in range(1,100):
client.publish("TestTopic", i)
print(i)
client.disconnect()
receiver.py:
import paho.mqtt.client as mqtt
def on_connect(client, userdata, flags, rc):
print("Connected with result code "+str(rc))
client.subscribe("house/bulbs/bulb1")
def on_message(client, userdata, msg):
# print(msg.topic+" "+str(msg.payload))
print("message received ", str(msg.payload.decode("utf-8")))
print("message topic=", msg.topic)
print("message qos=", msg.qos)
print("message retain flag=", msg.retain)
client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("192.168.1.169", 1883, 60)
client.loop_forever()
I'm able to print the data if the receiver file is active but I have a problem in printing it if I started the sender file and then I started the receiver file ,main question is does MQTT follows the queueing Mechanism or not if yes then ....if I'm running the sender file then its all data should be in queue and after that when I'm run the other file which is receiver then I should get printed.. but its not working in the same way please help me I went lots of documents but i'm able to find any relevant info.. recently I found clean_session if someone have knowledge about this please tell me ....have any questions related my code or anything please let me know
thanks

MQTT is a pub/sub protocol, not a message queuing system.
This means under normal circumstances if there is no subscriber running when a message is published then it will not be delivered.
It is possible to get the broker to queue messages for a specific subscriber, but this requires the subscriber to have been connected before the message is published and to have subscribed with a QOS of greater than 0. Then as long as it reconnects with the clean session flag set to false and the same client id after the publish then the broker will deliver the missed messages.
Retained messages are something different. If a message is published with the retained flag set to true then the broker will deliver this single message to every subscriber when they subscribe to the matching topic. There can only ever be 1 retained message for a given topic.

Related

i.grpc.internal.AbstractClientStream - Received data on closed stream meaning

I have a Spring boot application (v2.2.10.RELEASE) that subscribes to multiple topics in pubSub and pulls async data and sends it to somewhere else. I am not using SpringGCP, just native google libraries
this is my subscriber setting:
// Instantiate an asynchronous message receiver.
MessageReceiver receiver =
(PubsubMessage message, AckReplyConsumer consumer) -> {
messages.add(message);
consumer.ack();
};
Subscriber subscriber = Subscriber.newBuilder(subscriptionName, receiver)
.setParallelPullCount(2)
.setFlowControlSettings(flowControlSettings)
.setCredentialsProvider(credentialsProvider)
.setExecutorProvider(executorProvider)
//.setChannelProvider()
.build();
With high traffic and big messages (2 - 4 kb) I encounter this info message:
[grpc-default-worker-ELG-1-1] INFO i.grpc.internal.AbstractClientStream - Received data on closed stream
first of all, I don't fully understand what that means? all that I noticed was that when this happens the delivered duplicated messages increase. so I assumed it meant that pubSub tried to reach the subscriber with some messages but the subscriber for some reason was not ready so pubSub will try to deliver the messages again. and hence more duplicates, is that right?
would this problem be solved using the TransportChannelProvider in subscribers? my understanding of the poorly written documentation, that this will create a new channel for delivery when the current in-use channel is closed, hence get rid of the previous log message.
if yes, how do I define the channel target string? and where can I find A NameResolver-compliant URI for the mangagedChannel. the snippet I mean is this:
private TransportChannelProvider getChannelProvider() {
ManagedChannel channel = ManagedChannelBuilder.forTarget(target).usePlaintext(true).build();
return FixedTransportChannelProvider.create(GrpcTransportChannel.create(channel));
}
I am pretty new to GCP so sorry if my question is not coherent enough
Using a custom TransportChannelProvider won't solve this type of issue. This is more likely an issue deeper down in the stack, e.g., at the gRPC level. There have been some open issues for this type of error [1, 2].
With regard to why it is causing duplicates, it is possible that the messages are getting delivered via a stream that is already closed (which aligns with the error message) because they were trapped in a lower-level buffer at the gRPC layer and therefore ended up being duplicates of messages that were subsequently delivered and processed via another stream. This could be a version of the issue discussed in the documentation around large backlogs of small messages. There was a fix for this issue in v1.109.0 of the Java client library, so if you are using a version older than that, it is worth updating.
If duplicates continue to be an issue, it would be best to reach out to support with the name of your subscription and the message IDs of some of the duplicate messages so that they can look at the delivery patterns for those messages and further diagnose if these redeliveries are unexpected.

Google PubSub Java (Scala) Client Gets Excessive Amount of Resent Messages

I have a scenario where I load a subscription with around 1100 messages. I then start a Spark job which pulls messages from this subscription with these settings:
MaxOutstandingElementCount: 5
MaxAckExtensionPeriod: 60 min
AckDeadlineSeconds: 600
The first message to get processed starts a cache generation which takes about 30 minutes to complete. Any other messages arriving while this is going on are simply "returned" with no ack or nack. After that, a given message takes between 1 min and 30 mins to process. With an ack extension period of 60 min, I would never expect to see resending of messages.
The behaviour I am seeing is that while the initial cache is being generated, every 10 minutes 5 new messages are grabbed by the client and returned with no ack or nack by my code. This is unexpected. I would expect the deadline of the original 5 messages to be extended up to an hour.
Furthermore, after having processed and acked about 500 of the messages, I would expect around 600 left in the subscription, but I see almost the original 1100. These turn out to be resent duplicates, as I log these in my code. This is also very unexpected.
This is a screenshot from google console after around 500 messages have been processed and acked (ignore the first "hump", that was an aborted test run):
Am I missing something?
Here is the setup code:
val name = ProjectSubscriptionName.of(ConfigurationValues.ProjectId,
ConfigurationValues.PubSubSubscription)
val topic = ProjectTopicName.of(ConfigurationValues.ProjectId,
ConfigurationValues.PubSubSubscriptionTopic)
val pushConfig = PushConfig.newBuilder.build
val ackDeadlineSeconds = 600
subscriptionAdminClient.createSubscription(
name,
topic,
pushConfig,
ackDeadlineSeconds)
val flowControlSettings = FlowControlSettings.newBuilder()
.setMaxOutstandingElementCount(5L)
.build();
// create a subscriber bound to the asynchronous message receiver
val subscriber = Subscriber
.newBuilder(subscriptionName, new EtlMessageReceiver(spark))
.setFlowControlSettings(flowControlSettings)
.setMaxAckExtensionPeriod(Duration.ofMinutes(60))
.build
subscriber.startAsync.awaitRunning()
Here is the code in the receiver which runs when a message arrives while the cache is being generated:
if(!BIQConnector.cacheGenerationDone){
Utilities.logLine(
s"PubSub message for work item $uniqueWorkItemId ignored as cache is still being generated.")
return
}
And finally when a message has been processed:
consumer.ack()
Utilities.logLine(s"PubSub message ${message.getMessageId} for $tableName acknowledged.")
// Write back to ETL Manager
Utilities.logLine(
s"Writing result message back to topic ${etlResultTopic} for table $tableName, $tableDetailsForLog.")
sendPubSubResult(importTableName, validTableName, importTimestamp, 2, etlResultTopic, stageJobData,
tableDetailsForLog, "Success", isDeleted)
Is your Spark job using a Pub/Sub client library to pull messages? These libraries should indeed keep extending your message deadlines up to the MaxAckExtensionPeriod you specified.
If your job is using a Pub/Sub client library, this is unexpected behavior. You should contact Google Cloud support with your project name, subscription name, client library version, and a sample of the message IDs from the messages you are "returning" without acking. They will be able to investigate further into why you're receiving these resent messages.

Mqtt Client: get Retained Message after Subscribing

I am using the latest Paho version via Maven.
<dependency>
<groupId>org.eclipse.paho</groupId>
<artifactId>org.eclipse.paho.client.mqttv3</artifactId>
<version>1.2.2</version>
</dependency>
I create client using
MqttClient client = new MqttClient("tcp://localhost", MqttClient.generateClientId());
MqttConnectOptions options = new MqttConnectOptions();
options.setMaxInflight(1000);
options.setAutomaticReconnect(true);
Then I subscribe to a topic as follows:
client.setCallback(new Callback());
client.connect();
client.subscribe(topic);
Another mqtt client publishes a message on that topic with
MqttMessage message = new MqttMessage(byteStream);
message.setRetained(true);
With the retain flag I would expect that as soon as I subscribe, my callback is invoked. Unfortunately, the subscription callback is NOT called if the message is sent before the subscription is executed.
How do I get the retained value?
I think you are using QOS=0.
It is possible a retained message not saved with QOS=0 and retained_flag=true.
More details:
Reference link:
SECTION (3.3.1.3 RETAIN):
If the RETAIN flag is set to 1, in a PUBLISH Packet sent by a Client to a Server, the Server MUST store the Application Message and its QoS, so that it can be delivered to future subscribers whose subscriptions match its topic name [MQTT-3.3.1-5]. When a new subscription is established, the last retained message, if any, on each matching topic name MUST be sent to the subscriber [MQTT-3.3.1-6].
If the Server receives a QoS 0 message with the RETAIN flag set to 1 it MUST discard any message previously retained for that topic. It SHOULD store the new QoS 0 message as the new retained message for that topic, but MAY choose to discard it at any time - if this happens there will be no retained message for that topic [MQTT-3.3.1-7]. See Section 4.1 for more information on storing state.
Summary:
You can use QOS>0 to solve your problem.
Unfortunately, the subscription callback is NOT called if the message is sent before the subscription is executed. How do I get the retained value?
In this case, the publisher (one client) sends out message, immediately disconnects from MQTT broker (server), then the subscriber (another client) connects to the server with the same topic, Without last will message, it is not possible that the published message will be delivered to your subscriber.
There would be options in paho to enable last will message by setting :
will flag
will retain flag
will topic
will message (will payload)
retain flag in PUBLISH control packet (not the same as will retain flag)
Set up all of them when the publisher sends out the message with a topic, the sent message will be retained on MQTT broker even after the publisher closes network connection. At a later time when any subscriber (another client) connects to the broker with the same topic, the retained message will be sent from the broker to the subscriber.
Also please note that QoS field of PUBLISH control packet is for ensuring delivery is complete (at different level) ONLY between MQTT publisher and MQTT broker(server) , NOT between MQTT publisher and subscriber (the 2 clients).

Making sure a message published on a topic exchange is received by at least one consumer

TLDR; In the context of a topic exchange and queues created on the fly by the consumers, how to have a message redelivered / the producer notified when no consumer consumes the message?
I have the following components:
a main service, producing files. Each file has a certain category (e.g. pictures.profile, pictures.gallery)
a set of workers, consuming files and producing a textual output from them (e.g. the size of the file)
I currently have a single RabbitMQ topic exchange.
The producer sends messages to the exchange with routing_key = file_category.
Each consumer creates a queue and binds the exchange to this queue for a set of routing keys (e.g. pictures.* and videos.trending).
When a consumer has processed a file, it pushes the result in a processing_results queue.
Now - this works properly, but it still has a major issue. Currently, if the publisher sends a message with a routing key that no consumer is bound to, the message will be lost. This is because even if the queue created by the consumers is durable, it is destroyed as soon as the consumer disconnects since it is unique to this consumer.
Consumer code (python):
channel.exchange_declare(exchange=exchange_name, type='topic', durable=True)
result = channel.queue_declare(exclusive = True, durable=True)
queue_name = result.method.queue
topics = [ "pictures.*", "videos.trending" ]
for topic in topics:
channel.queue_bind(exchange=exchange_name, queue=queue_name, routing_key=topic)
channel.basic_consume(my_handler, queue=queue_name)
channel.start_consuming()
Loosing a message in this condition is not acceptable in my use case.
Attempted solution
However, "loosing" a message becomes acceptable if the producer is notified that no consumer received the message (in this case it can just resend it later). I figured out the mandatory field could help, since the specification of AMQP states:
This flag tells the server how to react if the message cannot be routed to a queue. If this flag is set, the server will return an unroutable message with a Return method.
This is indeed working - in the producer, I am able to register a ReturnListener :
rabbitMq.confirmSelect();
rabbitMq.addReturnListener( (int replyCode, String replyText, String exchange, String routingKey, AMQP.BasicProperties properties, byte[] body) -> {
log.info("A message was returned by the broker");
});
rabbitMq.basicPublish(exchangeName, "pictures.profile", true /* mandatory */, MessageProperties.PERSISTENT_TEXT_PLAIN, messageBytes);
This will as expected print A message was returned by the broker if a message is sent with a routing key no consumer is bound to.
Now, I also want to know when the message was correctly received by a consumer. So I tried registering a ConfirmListener as well:
rabbitMq.addConfirmListener(new ConfirmListener() {
void handleAck(long deliveryTag, boolean multiple) throws IOException {
log.info("ACK message {}, multiple = ", deliveryTag, multiple);
}
void handleNack(long deliveryTag, boolean multiple) throws IOException {
log.info("NACK message {}, multiple = ", deliveryTag, multiple);
}
});
The issue here is that the ACK is sent by the broker, not by the consumer itself. So when the producer sends a message with a routing key K:
If a consumer is bound to this routing key, the broker just sends an ACK
Otherwise, the broker sends a basic.return followed by a ACK
Cf the docs:
For unroutable messages, the broker will issue a confirm once the exchange verifies a message won't route to any queue (returns an empty list of queues). If the message is also published as mandatory, the basic.return is sent to the client before basic.ack. The same is true for negative acknowledgements (basic.nack).
So while my problem is theoretically solvable using this, it would make the logic of knowing if a message was correctly consumed very complicated (especially in the context of multi threading, persistence in a database, etc.):
send a message
on receive ACK:
if no basic.return was received for this message
the message was correctly consumed
else
the message wasn't correctly consumed
on receive basic.return
the message wasn't correctly consumed
Possible other solutions
Have a queue for each file category, i.e. the queues pictures_profile, pictures_gallery, etc. Not good since it removes a lot of flexibility for the consumers
Have a "response timeout" logic in the producer. The producer sends a message. It expects an "answer" for this message in the processing_results queue. A solution would be to resend the message if it hasn't been answered to after X seconds. I don't like it though, it would create some additional tricky logic in the producer.
Produce the messages with a TTL of 0, and have the producer listen on a dead-letter exchange. This is the official suggested solution to replace the 'immediate' flag removed in RabbitMQ 3.0 (see paragraph Removal of "immediate" flag). According to the docs of the dead letter exchanges, a dead letter exchange can only be configured per-queue. So it wouldn't work here
[edit] A last solution I see is to have every consumer create a durable queue that isn't destroyed when he disconnects, and have it listen on it. Example: consumer1 creates queue-consumer-1 that is bound to the message of myExchange having a routing key abcd. The issue I foresee is that it implies to find an unique identifier for every consumer application instance (e.g. hostname of the machine it runs on).
I would love to have some inputs on that - thanks!
Related to:
RabbitMQ: persistent message with Topic exchange (not applicable here since queues are created "on the fly")
Make sure the broker holds messages until at least one consumer gets it
RabbitMQ Topic Exchange with persisted queue
[edit] Solution
I ended up implementing something that uses a basic.return, as mentioned earlier. It is actually not so tricky to implement, you just have to make sure that your method producing the messages and the method handling the basic returns are synchronized (or have a shared lock if not in the same class), otherwise you can end up with interleaved execution flows that will mess up your business logic.
I believe that an alternate exchange would be the best fit for your use case for the part regarding the identification of not routed messages.
Whenever an exchange with a configured AE cannot route a message to any queue, it publishes the message to the specified AE instead.
Basically upon creation of the "main" exchange, you configure an alternate exchange for it.
For the referenced alternate exchange, I tend to go with a fanout, then create a queue (notroutedq) binded to it.
This means any message that is not published to at least one of the queues bound to your "main" exchange will end up in the notroutedq
Now regarding your statement:
because even if the queue created by the consumers is durable, it is destroyed as soon as the consumer disconnects since it is unique to this consumer.
Seems that you have configured your queues with auto-delete set to true.
If so, in case of disconnect, as you stated, the queue is destroyed and the messages still present on the queue are lost, case not covered by the alternate exchange configuration.
It's not clear from your use case description whether you'd expect in some cases for a message to end up in more than one queue, seemed more a case of one queue per type of processing expected (while keeping the grouping flexible). If indeed the queue split is related to type of processing, I do not see the benefit of setting the queue with auto-delete, expect maybe not having to do any cleanup maintenance when you want to change the bindings.
Assuming you can go with durable queues, then a dead letter exchange (would again go with fanout) with a binding to a dlq would cover the missing cases.
not routed covered by alternate exchange
correct processing already handled by your processing_result queue
problematic processing or too long to be processed covered by the dead letter exchange, in which case the additional headers added upon dead lettering the message can even help to identify the type of actions to take

RabbitMQ - Get total count of messages enqueued

I have a Java client which monitors RabbitMQ queue. I am able to get the count of messages currently in queue with this code
#Resource
RabbitAdmin rabbitAdmin;
..........
DeclareOk declareOk = rabbitAdmin.getRabbitTemplate().execute(new ChannelCallback<DeclareOk>() {
public DeclareOk doInRabbit(Channel channel) throws Exception {
return channel.queueDeclarePassive("test.pending");
}
});
return declareOk.getMessageCount();
I want to get some more additional details like -
Message body of currently enqueued items.
Total number of messages that was enqueued in the queue since the queue was created.
Is there any way to retrieve these data in Java client?
With AMQP protocol (including RabbitMQ implementation) you can't get such info with 100% guarantee.
The closest number to messages count is messages count returned with queue.declare-ok (AMQP.Queue.DeclareOk in java AMQP client library).
Whilst messages count you receive with queue.declare-ok may match exact messages number enqueues, you can't rely on it as it doesn't count messages which waiting acknowledges or published to queue during transaction but not committed yet.
It really depends what kind of precission do you need.
As to enqueued messages body, you may want to manually extract all messages in queue, view their body and put them back to queue. This is the only way to do what you want.
You can get some information about messages count with Management Plugin, RabbitMQ Management HTTP API and rabbitmqctl util (see list_queues, list_channels).
You can't get total published messages count since queue was created and I think nobody implement such stats while it useless (FYI, with messages flow in average 10k per second you will not even reach uint64 in a few thousand years).
AMQP.Queue.DeclareOk dok = channel.queueDeclare(QUEUE_NAME, true, false, false, queueArgs);
dok.getMessageCount();
To access queue details via http api,
http://public-domain-name:15672/api/queues/%2f/queue_name
To access queue details via command from localhost cli promt,
curl -i -u guest_uname:guest_password http://localhost:15672/api/queues/%2f/queue_name
Where,
%2f is default vhost "/"

Categories