We're using Pub/Sub in production and are seeing more VMs handling Pub/Sub messages than we would expect. I've run simple overnight tests with Pub/Sub, and it appears the rate-limiting mechanism doesn't behave as smoothly as we expected.
Here is the test:
Publish a number of messages into a topic with a pull subscription.
In this experiment, there are about 2.7k messages (started at approximately 9 PM).
Configure one async client using the StreamingPull connection and FlowControl set to 2.
Simulate that handling each incoming message takes about 5 seconds by moving the execution into a timer and acknowledging the message only when the timer finishes.
Expected results:
Messages from Pub/Sub are consumed at a steady rate: two messages at a time, roughly every 5 seconds. A small delay between acking a message and a new message being pulled is expected due to network and processing overhead.
Actual result: Pub/Sub starts throttling, or something like it, with a huge delay; no messages arrive during that time. The delay depends on the number of unacked messages in the subscription.
This behavior doesn't seem to be explained in the FlowControl docs.
Here is the consumer (client) code:
var concurrentFlowsNumber = config.getLong(CONFIG_NUMBER_OF_THREADS);
var flowSettings = FlowControlSettings.newBuilder()
        .setMaxOutstandingElementCount(concurrentFlowsNumber)
        .setLimitExceededBehavior(FlowController.LimitExceededBehavior.Block)
        .build();
var subscriber = Subscriber.newBuilder(subscriptionName, receiver)
        .setCredentialsProvider(() -> serviceAccountCredentials)
        .setFlowControlSettings(flowSettings)
        .build();
subscriber.addListener(
        new Subscriber.Listener() {
            @Override
            public void failed(ApiService.State from, Throwable failure) {
                logger.error(failure);
            }
        },
        MoreExecutors.directExecutor());
var apiService = subscriber.startAsync();
apiService.addListener(new ApiService.Listener() {
    @Override
    public void running() {
        logger.info("Pubsub started");
    }

    @Override
    public void failed(ApiService.State from, Throwable failure) {
        logger.error("Pubsub failed on step: {}", from);
    }
}, Runnable::run);
And the message handler is:
private static void handlePubSubMessage(PubsubMessage message, AckReplyConsumer consumer) {
    new Timer().schedule(new TimerTask() {
        @Override
        public void run() {
            consumer.ack();
        }
    }, 3000L + rand.nextInt(5000));
}
So, does anyone have any idea how to make the clients (many VMs) consume messages with a concurrency limit (up to 4 concurrent messages per client) without running into these timeouts?
P.S. These questions are similar, but not the same:
Google pubsub flow control
pubsub Dynamic rate limiting
Cloud pubsub slow poll rate
Since you have a backlog building up, you might be running into this issue: https://cloud.google.com/pubsub/docs/pull#streamingpull_dealing_with_large_backlogs_of_small_messages
Your undelivered messages will get buffered between the Pub/Sub service and the client library. Messages might get stuck in a single client's buffer, or get redelivered to the same client if the ackDeadline was exceeded.
You can experiment with using the synchronous pull as suggested.
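For completeness, a minimal synchronous-pull sketch using the Java client's SubscriberStub might look like the following. The project and subscription names are placeholders, and the batch size of 2 mirrors the flow-control limit from the question; this is a sketch, not a drop-in replacement for the code above.

import com.google.cloud.pubsub.v1.stub.GrpcSubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStubSettings;
import com.google.pubsub.v1.AcknowledgeRequest;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PullRequest;
import com.google.pubsub.v1.PullResponse;
import com.google.pubsub.v1.ReceivedMessage;
import java.util.ArrayList;
import java.util.List;

// Sketch: pull a bounded batch, process it, then ack, so the client never
// buffers more messages than it can handle. Names are placeholders.
SubscriberStubSettings settings = SubscriberStubSettings.newBuilder().build();
try (SubscriberStub subscriber = GrpcSubscriberStub.create(settings)) {
    String subscription = ProjectSubscriptionName.format("my-project", "my-subscription");
    PullRequest pullRequest = PullRequest.newBuilder()
            .setMaxMessages(2) // at most 2 messages per pull, mirroring the flow-control limit
            .setSubscription(subscription)
            .build();
    PullResponse pullResponse = subscriber.pullCallable().call(pullRequest);
    List<String> ackIds = new ArrayList<>();
    for (ReceivedMessage message : pullResponse.getReceivedMessagesList()) {
        // ... handle the message (the ~5-second processing goes here) ...
        ackIds.add(message.getAckId());
    }
    if (!ackIds.isEmpty()) {
        subscriber.acknowledgeCallable().call(AcknowledgeRequest.newBuilder()
                .setSubscription(subscription)
                .addAllAckIds(ackIds)
                .build());
    }
}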
Related
I am using async pull to pull messages from a Pub/Sub topic, do some processing, and send the messages to an ActiveMQ topic.
With my current Pub/Sub configuration I have to ack() the messages upon receipt. That, however, does not suit my use case, as I need to ack() messages ONLY after they are successfully processed and sent to the other topic. This means (per my understanding) ack()ing the messages outside the MessageReceiver.
I tried to save each message and its AckReplyConsumer so I could call ack() later, but this does not work as expected, and not all messages are correctly ack()ed.
So I want to know if this is possible at all, and if yes, how.
My subscriber configuration:
public Subscriber getSubscriber(CompositeConfigurationElement compositeConfigurationElement, Queue<CustomPupSubMessage> messages) throws IOException {
ProjectSubscriptionName subscriptionName = ProjectSubscriptionName.of(compositeConfigurationElement.getPubsub().getProjectid(),
compositeConfigurationElement.getSubscriber().getSubscriptionId());
ExecutorProvider executorProvider =
InstantiatingExecutorProvider.newBuilder().setExecutorThreadCount(2).build();
// Instantiate an asynchronous message receiver.
MessageReceiver receiver =
(PubsubMessage message, AckReplyConsumer consumer) -> {
messages.add(CustomPupSubMessage.builder().message(message).consumer(consumer).build());
};
// The subscriber will pause the message stream and stop receiving more messages from the
// server if any one of the conditions is met.
FlowControlSettings flowControlSettings =
FlowControlSettings.newBuilder()
// Maximum number of outstanding messages (e.g. 1,000). Must be >0. It controls how
// many messages the subscriber receives before pausing the message stream.
.setMaxOutstandingElementCount(compositeConfigurationElement.getSubscriber().getOutstandingElementCount())
// 100 MiB. Must be >0. It controls the maximum size of messages the subscriber
// receives before pausing the message stream.
.setMaxOutstandingRequestBytes(100L * 1024L * 1024L)
.build();
//read credentials
InputStream input = new FileInputStream(compositeConfigurationElement.getPubsub().getSecret());
CredentialsProvider credentialsProvider = FixedCredentialsProvider.create(ServiceAccountCredentials.fromStream(input));
Subscriber subscriber = Subscriber.newBuilder(subscriptionName, receiver)
.setParallelPullCount(compositeConfigurationElement.getSubscriber().getSubscriptionParallelThreads())
.setFlowControlSettings(flowControlSettings)
.setCredentialsProvider(credentialsProvider)
.setExecutorProvider(executorProvider)
.build();
return subscriber;
}
My processing part:
jmsConnection.start();
for (int i = 0; i < patchSize; i++) {
var message = messages.poll();
if (message != null) {
byte[] payload = message.getMessage().getData().toByteArray();
jmsMessage = jmsSession.createBytesMessage();
jmsMessage.writeBytes(payload);
jmsMessage.setJMSMessageID(message.getMessage().getMessageId());
producer.send(jmsMessage);
list.add(message.getConsumer());
} else break;
}
jmsSession.commit();
jmsSession.close();
jmsConnection.close();
// if upload is successful then ack the messages
log.info("sent " + list.size() + " in direction " + dest);
list.forEach(consumer -> consumer.ack());
There is nothing that requires messages to be acked within the MessageReceiver callback and you should be able to acknowledge messages asynchronously. There are a few things to keep in mind and look for:
Check to ensure that you are calling ack before the ack deadline expires. By default, the Java client library extends the ack deadline for up to 1 hour, so if your processing takes less time than that, you should be okay.
If your subscriber is often flow controlled, consider reducing the value you pass into setParallelPullCount to 1. The flow control settings you pass in apply to each stream individually, not divided among them, so if each stream can receive the full value passed in and your processing is slow enough, you could exceed the 1-hour deadline in the client library without even having received the message yet, causing duplicate delivery. You really only need a larger setParallelPullCount value if you can process messages much faster than a single stream can deliver them.
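As a sketch, both of those knobs live on the Subscriber builder. The snippet below assumes the subscriptionName, receiver, and flowControlSettings from the question, and the one-hour value mirrors the client library's default maximum ack extension:

// Sketch only: a single pull stream plus an explicit max ack-extension period.
Subscriber subscriber =
    Subscriber.newBuilder(subscriptionName, receiver)
        .setParallelPullCount(1) // flow control settings apply per stream
        .setFlowControlSettings(flowControlSettings)
        .setMaxAckExtensionPeriod(org.threeten.bp.Duration.ofHours(1))
        .build();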
Ensure that your client library version is at least 1.109.0. There were some improvements made to the way flow control was done in that version.
Note that Pub/Sub has at-least-once delivery semantics, meaning messages can be redelivered even if ack is called properly. Also, not acknowledging or nacking a single message could result in the redelivery of all messages that were published together in a single batch. See the "Message Redelivery & Duplication Rate" section of "Fine-tuning Pub/Sub performance with batch and flow control settings."
If all of that still doesn't fix the issue, then it would be best to try to create a small, self-contained example that reproduces the issue and open up a bug in the GitHub repo.
I'm using Spring Boot 2.2.0.RELEASE, and when I send data to Amazon SQS I see a delay of 300-500 ms. Maybe I'm doing something wrong. My code looks like the following:
public class MySender {
    private final QueueMessagingTemplate queueMessagingTemplate;
    private final AmazonSQSAsync amazonSqsAsync;

    public MySender(AmazonSQSAsync amazonSqsAsync) {
        this.amazonSqsAsync = amazonSqsAsync;
        this.queueMessagingTemplate = new QueueMessagingTemplate(amazonSqsAsync);
    }

    public void send(String queue, String msg) {
        ...
        Message<String> message = MessageBuilder.withPayload(msg)
                .setHeader("MyHeader", "val")
                .build();
        long startSending = System.currentTimeMillis();
        queueMessagingTemplate.send(queue, message);
        System.out.println("Sending time: " + (System.currentTimeMillis() - startSending));
        // and this is where I get a sending time of 300-500 ms
    }
}
How can I reduce this delay?
Amazon SQS queues can deliver very high throughput. If you are not using a FIFO queue, there is no limit on the number of messages. The delay in sending messages to an SQS queue from your Spring Boot application depends on a number of other factors, such as region and network bandwidth.
If, for example, you are running your Spring Boot application on your local machine in India while your SQS queue is in a region in Europe, there will be latency.
To improve performance, try deploying your Spring Boot application on an EC2 instance in the same region as your SQS queue.
You can also try using a VPC endpoint to connect to the SQS queue, which sends messages over the AWS private network and should give you good performance.
Another option is to use a Lambda function to send messages to the SQS queue, which can give you maximum performance.
Also validate the configuration of your SQS queue: check the values set for Default visibility timeout, Receive message wait time, and Delivery delay.
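To check those values programmatically, something like the following sketch works with the AWS SDK v1; the queue URL is a placeholder:

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.GetQueueAttributesRequest;

public class QueueConfigCheck {
    // Prints the queue settings mentioned above; the queue URL is a placeholder.
    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        sqs.getQueueAttributes(
                new GetQueueAttributesRequest("https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue")
                        .withAttributeNames("VisibilityTimeout",
                                "ReceiveMessageWaitTimeSeconds",
                                "DelaySeconds"))
           .getAttributes()
           .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}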
I'm using the Google Pub/Sub Java SDK to subscribe to a topic. What I want to do is the following:
Start listening to a topic for X seconds (let's assume 25 seconds)
If a message is received then stop listening and process the message (this can take a few minutes)
After processing the message continue listening for a topic again for 25 seconds
If no message is received within 25 seconds then stop definitively listening
I can't seem to find anything about this in the documentation. Maybe it's just not possible?
Here's how I start the subscriber:
// Create a subscriber bound to the asynchronous message receiver
subscriber = Subscriber.newBuilder(projectSubscriptionName, new PubSubRoeMessageReceiver()).build();
// Start subscriber
subscriber.startAsync().awaitRunning();
// Allow the subscriber to run indefinitely unless an unrecoverable error occurs.
subscriber.awaitTerminated();
And this is what my message receiver looks like:
public class PubSubRoeMessageReceiver implements MessageReceiver {
@Override
public void receiveMessage(PubsubMessage pubsubMessage, AckReplyConsumer ackReplyConsumer) {
// Acknowledge message
System.out.println("Acknowledge message");
ackReplyConsumer.ack();
// TODO: stop the subscriber
// TODO: run task X
// TODO: start the subscriber
}
}
Any ideas?
Using Cloud Pub/Sub in this way is an anti-pattern and would cause issues. If you ack the message immediately after receiving it, but before processing it, what do you do if the subscriber crashes for some reason? Pub/Sub won't redeliver the message, so it might never be processed.
Therefore, you probably want to wait to ack until after the message is processed. But then you couldn't shut down the subscriber in between, because the fact that the message is outstanding would be lost: the ack deadline would expire and the message would get redelivered.
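A minimal sketch of that ack-after-processing pattern (processMessage is a hypothetical handler, not part of the original code):

// Sketch only: ack after processing succeeds; processMessage is hypothetical.
MessageReceiver receiver =
    (PubsubMessage message, AckReplyConsumer consumer) -> {
      try {
        processMessage(message); // long-running work happens before the ack
        consumer.ack();          // ack only on success
      } catch (Exception e) {
        consumer.nack();         // let Pub/Sub redeliver the message
      }
    };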
If you want to ensure the client only receives one message at a time, you could use the FlowControlSettings on the client. If you set MaxOutstandingElementCount to 1, then only one message will be delivered to receiveMessage at a time:
subscriber = Subscriber.newBuilder(projectSubscriptionName, new PubSubRoeMessageReceiver())
.setFlowControlSettings(FlowControlSettings.newBuilder()
.setMaxOutstandingRequestBytes(10L * 1024L * 1024L) // 10MB messages allowed.
.setMaxOutstandingElementCount(1L) // Only 1 outstanding message at a time.
.build())
.build();
Keep in mind that if you have a large backlog of small messages at the time you start up the subscriber and you intend to start up multiple subscribers, you may run into inefficient load balancing as explained in the documentation.
I'm publishing Pub/Sub messages from the App Engine flexible environment with the Java client library like this:
Publisher publisher = Publisher
.newBuilder(ProjectTopicName.of(Utils.getApplicationId(), "test-topic"))
.setBatchingSettings(
BatchingSettings.newBuilder()
.setIsEnabled(false)
.build())
.build();
publisher.publish(PubsubMessage.newBuilder()
.setData(ByteString.copyFromUtf8(message))
.putAttributes("timestamp", String.valueOf(System.currentTimeMillis()))
.build());
I'm subscribing to the topic in Dataflow and logging how long it takes for a message to reach Dataflow from App Engine flexible:
pipeline
.apply(PubsubIO.readMessagesWithAttributes().fromSubscription(Utils.buildPubsubSubscription(Constants.PROJECT_NAME, "test-topic")))
.apply(ParDo.of(new DoFn<PubsubMessage, PubsubMessage>() {
@ProcessElement
public void processElement(ProcessContext c) {
long timestamp = System.currentTimeMillis() - Long.parseLong(c.element().getAttribute("timestamp"));
System.out.println("Time: " + timestamp);
}
}));
pipeline.run();
When I publish messages at a rate of a few messages per second, the logs show that the time needed for a message to reach Dataflow is between 100 ms and 1.5 seconds.
But when the rate is about 100 messages per second, the time is consistently between 100 ms and 200 ms, which seems totally adequate.
Can someone explain this behavior? It seems as if turning off publisher batching does not work.
Pub/Sub is designed for high-throughput messaging in both subscription cases.
Pull subscriptions work best when there is a large volume of messages; it's the kind of subscription you would use when message-processing throughput is your priority. Note especially that synchronous pull doesn't handle messages as soon as they are published, and it pulls and handles a fixed number of messages at a time (more messages, more pulls). A better option is asynchronous pull, which uses a long-running message listener and acknowledges one message at a time [1].
On the other hand, push subscriptions use a slow-start algorithm: the number of messages sent is doubled with each successful delivery until it reaches its constraints (more messages, more and faster deliveries).
[1] https://cloud.google.com/pubsub/docs/pull#asynchronous-pull
I have a simple MQTT listener that subscribes to a topic with a callback:
MqttClient client = new MqttClient(mqttHost, MqttClient.generateClientId());
client.connect();
client.subscribe("test", QUALITY_OF_SERVICE_2, new IMqttMessageListener() {
    @Override
    public void messageArrived(final String s, final MqttMessage mqttMessage) {
        System.out.println("Received" + mqttMessage.toString());
        // Code that blocks the thread
        lock.lock();
        // do something
        lock.unlock();
    }
});
Let's say I publish 1000 messages to the topic test, but running the above listener on Tomcat displays fewer than 1000 console outputs, showing that the receiver thread is not getting all the sent messages.
Without the lock() code, the listener works as expected and receives all messages.
You should not be doing long running/blocking tasks in the messageArrived handler, as this is called on the main network loop of the client.
If you have long running/blocking tasks to do with a message you should create a local queue and process the messages from that queue with either a single local thread if message order is important, or a pool of threads if you want to handle the incoming messages as quickly as possible.
Java has a built in set of core classes for building queues and starting threads to consume messages from those queues. Look at the classes in the java.util.concurrent package.
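For example, a minimal sketch of that approach with a local queue and a single worker thread (queue capacity is arbitrary, and this assumes a Paho client version where IMqttMessageListener can be written as a lambda):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: hand each message off to a local queue so the client's
// network loop is never blocked, and do the slow work on a worker thread.
BlockingQueue<MqttMessage> queue = new LinkedBlockingQueue<>(1000);

client.subscribe("test", QUALITY_OF_SERVICE_2, (topic, mqttMessage) -> {
    queue.put(mqttMessage); // fast handoff; blocks only if the queue is full
});

// A single worker preserves message order; use a thread pool instead if
// ordering doesn't matter and you want more throughput.
ExecutorService worker = Executors.newSingleThreadExecutor();
worker.submit(() -> {
    try {
        while (true) {
            MqttMessage msg = queue.take(); // blocks until a message arrives
            // do the long-running/blocking work here, e.g. lock.lock() ... lock.unlock()
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // allow clean shutdown
    }
});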