How to increase performance when sending messages to an AWS SQS queue - Java

I'm using Spring Boot 2.2.0.RELEASE, and when I send data to Amazon SQS I see a delay of 300-500 ms. Maybe I'm doing something wrong. My code looks like the following:
public class MySender {

    private final QueueMessagingTemplate queueMessagingTemplate;
    private final AmazonSQSAsync amazonSqsAsync;

    public MySender(AmazonSQSAsync amazonSqsAsync) {
        this.amazonSqsAsync = amazonSqsAsync;
        this.queueMessagingTemplate = new QueueMessagingTemplate(amazonSqsAsync);
    }

    public void send(String queue, String msg) {
        ...
        Message<String> message = MessageBuilder.withPayload(msg)
                .setHeader("MyHeader", "val")
                .build();

        long startSending = System.currentTimeMillis();
        queueMessagingTemplate.send(queue, message);
        System.out.println("Sending time: " + (System.currentTimeMillis() - startSending));
        // and this prints a sending time of 300 - 500 ms
    }
}
How can I reduce this delay?

Amazon SQS queues can deliver very high throughput. If you are not using a FIFO queue, there is no limit on the number of messages per second. The delay in sending messages to an SQS queue from your Spring Boot application depends on a number of other factors, such as region and network bandwidth.
If you are running your Spring Boot application on your local machine in India and your SQS queue is in a European region, for example, there will be noticeable latency.
To improve performance, try deploying your Spring Boot application on an EC2 instance in the same region as your SQS queue.
You can also try using a VPC endpoint to connect to the SQS queue; it routes traffic over the AWS private network and generally gives better performance.
A better option may be to use a Lambda function to send messages to the SQS queue, which can give you maximum performance.
Also validate the configuration of your SQS queue. Check the values set for Default visibility timeout, Receive message wait time, and Delivery delay.
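If most of the 300-500 ms is simply the HTTPS round trip to the SQS endpoint, another option is to keep that round trip off the calling thread. Below is a minimal sketch using the AmazonSQSAsync client the question already injects; the queueUrl field and the callback bodies are illustrative assumptions, not part of the original code:

import com.amazonaws.handlers.AsyncHandler;
import com.amazonaws.services.sqs.AmazonSQSAsync;
import com.amazonaws.services.sqs.model.SendMessageRequest;
import com.amazonaws.services.sqs.model.SendMessageResult;

public class AsyncSender {

    private final AmazonSQSAsync amazonSqsAsync;
    private final String queueUrl; // assumption: the full queue URL, not the queue name

    public AsyncSender(AmazonSQSAsync amazonSqsAsync, String queueUrl) {
        this.amazonSqsAsync = amazonSqsAsync;
        this.queueUrl = queueUrl;
    }

    public void send(String msg) {
        // sendMessageAsync returns immediately; the HTTP round trip runs on the SDK's executor
        amazonSqsAsync.sendMessageAsync(
                new SendMessageRequest(queueUrl, msg),
                new AsyncHandler<SendMessageRequest, SendMessageResult>() {
                    @Override
                    public void onError(Exception e) {
                        // log / retry here; the caller is never blocked
                    }

                    @Override
                    public void onSuccess(SendMessageRequest request, SendMessageResult result) {
                        // the message id is available in result.getMessageId()
                    }
                });
    }
}

This does not make the SQS call itself faster, but the caller no longer waits out the network latency.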

Related

Ack Pub/Sub message outside of the MessageReceiver

I am using async pull to pull messages from a Pub/Sub topic, do some processing, and send messages to an ActiveMQ topic.
With the current configuration of Pub/Sub I have to ack() the messages upon receipt. This, however, does not suit my use case, as I need to ack() messages ONLY after they are successfully processed and sent to the other topic. This means (per my understanding) ack()ing the messages outside the MessageReceiver.
I tried to save each message and its AckReplyConsumer so I could call it later and ack() the messages; this, however, does not work as expected, and not all messages are correctly ack()ed.
So I want to know if this is possible at all, and if yes, how.
My subscriber config:
public Subscriber getSubscriber(CompositeConfigurationElement compositeConfigurationElement, Queue<CustomPupSubMessage> messages) throws IOException {
    ProjectSubscriptionName subscriptionName = ProjectSubscriptionName.of(compositeConfigurationElement.getPubsub().getProjectid(),
            compositeConfigurationElement.getSubscriber().getSubscriptionId());
    ExecutorProvider executorProvider =
            InstantiatingExecutorProvider.newBuilder().setExecutorThreadCount(2).build();

    // Instantiate an asynchronous message receiver that only buffers the message and its
    // AckReplyConsumer; the ack happens later, outside this callback.
    MessageReceiver receiver =
            (PubsubMessage message, AckReplyConsumer consumer) -> {
                messages.add(CustomPupSubMessage.builder().message(message).consumer(consumer).build());
            };

    // The subscriber will pause the message stream and stop receiving more messages from the
    // server if any one of the conditions is met.
    FlowControlSettings flowControlSettings =
            FlowControlSettings.newBuilder()
                    // Maximum number of outstanding messages (from config). Must be > 0. It controls
                    // how many messages the subscriber receives before pausing the message stream.
                    .setMaxOutstandingElementCount(compositeConfigurationElement.getSubscriber().getOutstandingElementCount())
                    // 100 MiB. Must be > 0. It controls the maximum size of messages the subscriber
                    // receives before pausing the message stream.
                    .setMaxOutstandingRequestBytes(100L * 1024L * 1024L)
                    .build();

    // read credentials
    InputStream input = new FileInputStream(compositeConfigurationElement.getPubsub().getSecret());
    CredentialsProvider credentialsProvider = FixedCredentialsProvider.create(ServiceAccountCredentials.fromStream(input));

    Subscriber subscriber = Subscriber.newBuilder(subscriptionName, receiver)
            .setParallelPullCount(compositeConfigurationElement.getSubscriber().getSubscriptionParallelThreads())
            .setFlowControlSettings(flowControlSettings)
            .setCredentialsProvider(credentialsProvider)
            .setExecutorProvider(executorProvider)
            .build();

    return subscriber;
}
My processing part:
jmsConnection.start();
for (int i = 0; i < patchSize; i++) {
    var message = messages.poll();
    if (message != null) {
        byte[] payload = message.getMessage().getData().toByteArray();
        jmsMessage = jmsSession.createBytesMessage();
        jmsMessage.writeBytes(payload);
        jmsMessage.setJMSMessageID(message.getMessage().getMessageId());
        producer.send(jmsMessage);
        // remember the consumer so the message can be acked after the JMS commit
        list.add(message.getConsumer());
    } else break;
}
jmsSession.commit();
jmsSession.close();
jmsConnection.close();

// if the upload was successful, ack the messages
log.info("sent " + list.size() + " in direction " + dest);
list.forEach(consumer -> consumer.ack());
There is nothing that requires messages to be acked within the MessageReceiver callback and you should be able to acknowledge messages asynchronously. There are a few things to keep in mind and look for:
Check to ensure that you are calling ack before the ack deadline expires. By default, the Java client library does extend the ack deadline for up to 1 hour, so if you are taking less time than that to process, you should be okay.
If your subscriber is often flow controlled, consider reducing the value you pass to setParallelPullCount to 1. The flow control settings you pass in are applied to each stream, not divided among them, so if each stream is able to receive the full value passed in and your processing is slow enough, you could be exceeding the 1-hour deadline in the client library without having even received the message yet, causing duplicate delivery. You really only need to set setParallelPullCount to a larger value if you are able to process messages much faster than a single stream can deliver them.
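As a rough sketch of how the ack-deadline and parallel-pull suggestions could look on the builder from the question (it reuses subscriptionName, receiver, flowControlSettings, credentialsProvider, and executorProvider from getSubscriber; the single stream and the one-hour maximum extension are illustrative values, not required settings):

import com.google.cloud.pubsub.v1.Subscriber;
import org.threeten.bp.Duration;

Subscriber subscriber = Subscriber.newBuilder(subscriptionName, receiver)
        // a single StreamingPull stream is usually enough unless processing is very fast
        .setParallelPullCount(1)
        // let the client library keep extending the ack deadline while you process
        // (illustrative value; the library's default maximum is in the same range)
        .setMaxAckExtensionPeriod(Duration.ofHours(1))
        .setFlowControlSettings(flowControlSettings)
        .setCredentialsProvider(credentialsProvider)
        .setExecutorProvider(executorProvider)
        .build();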
Ensure that your client library version is at least 1.109.0. There were some improvements made to the way flow control was done in that version.
Note that Pub/Sub has at-least-once delivery semantics, meaning messages can be redelivered even if ack is called properly. Also note that not acknowledging or nacking a single message could result in the redelivery of all messages that were published together in a single batch. See the "Message Redelivery & Duplication Rate" section of "Fine-tuning Pub/Sub performance with batch and flow control settings."
If all of that still doesn't fix the issue, then it would be best to try to create a small, self-contained example that reproduces the issue and open up a bug in the GitHub repo.

Kafka adminClient throws TimeoutException

I have a health thread that checks the state of my Kafka cluster every 5 seconds from my worker application. Every now and then, however, I get a TimeoutException:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
I have tools to externally monitor my cluster as well (Cruise Control, Grafana) and none of them points to any problems in the cluster. Also, my worker application is constantly consuming messages and none seem to fail.
Why do I occasionally get this timeout? If the broker is not down, then I am thinking something in my configs is off. I set the timeout to 5 seconds, which seems like more than enough.
My AdminClient configs:
@Bean
public AdminClient adminClient() {
    return KafkaAdminClient.create(adminClientConfigs());
}

public Map<String, Object> adminClientConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, serverAddress);
    props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, 5000);
    return props;
}
How I check the cluster (I then run logic on the broker list):
@Autowired
private AdminClient adminClient;

private void addCluster() throws ExecutionException, InterruptedException {
    adminClient.describeCluster().nodes().get().forEach(node -> brokers.add(node.host()));
}
Two things:
The default request timeout is 30 seconds. By setting it to a smaller value you increase the risk of timeouts for a slow request. If one request out of 1000 (0.1%) takes more than 5 seconds, then, because you query every few seconds, you'll see several failures every day.
To investigate why some calls take longer, you can do several things:
Check the Kafka client logs. describeCluster() may need to initiate a new connection to the cluster. In that case, the client will also have to send an ApiVersionsRequest and, depending on your config, may establish a TLS connection and/or perform SASL authentication. If any of these happen, it should be clear in the client logs. (You may need to bump the log level a bit to see all of this.)
Check the broker request metrics. describeCluster() translates into a MetadataRequest sent to a broker. You can track the time requests take to be processed. See the metrics described in the docs; in your case, especially: kafka.network:type=RequestMetrics,name=*,request=Metadata
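For illustration, here is a hedged sketch of the first point: leave request.timeout.ms at its default and treat an occasional slow describeCluster() as a soft failure rather than an error. The class wrapper, the serverAddress value, and the brokers list are assumptions standing in for the question's fields; the per-call DescribeClusterOptions timeout is optional:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterOptions;

// Hypothetical health-check variant: request.timeout.ms stays at its 30 s default,
// and only this call is bounded; a timeout is treated as "unknown", not "down".
public class ClusterHealthCheck {

    private final String serverAddress = "localhost:9092"; // assumption
    private final List<String> brokers = new CopyOnWriteArrayList<>();
    private final AdminClient adminClient = AdminClient.create(adminClientConfigs());

    private Map<String, Object> adminClientConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, serverAddress);
        // no REQUEST_TIMEOUT_MS_CONFIG override: the default absorbs slow outliers
        return props;
    }

    public void addCluster() throws InterruptedException {
        try {
            adminClient.describeCluster(new DescribeClusterOptions().timeoutMs(5000))
                    .nodes().get()
                    .forEach(node -> brokers.add(node.host()));
        } catch (ExecutionException e) {
            // an occasional slow MetadataRequest is not necessarily a cluster problem
            System.err.println("describeCluster timed out; will retry on next check: " + e.getMessage());
        }
    }
}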

Google PubSub async rate limitation doesn't work as expected

We're using Pub/Sub in prod and seeing a problem: there are more VMs handling Pub/Sub messages than we would expect to need.
I've run simple tests using Pub/Sub overnight, and it appears that the rate-limiting mechanism does not behave as smoothly as we expected.
Here is the test:
Publish some number of messages into a topic with a pull subscription. In the experiment, there are about 2.7k messages (started at approximately 9 pm).
Configure one async client using the StreamingPull connection and FlowControl set to 2.
Simulate that handling every incoming message takes about 5 seconds by moving the execution into a timer and acknowledging the message only when the timer fires.
Expected results:
Messages from Pub/Sub are consumed at a steady rate, 2 messages at a time roughly every 5 seconds. A small gap between acking a message and the next message being pulled is expected due to network and processing overhead.
Actual result: Pub/Sub starts throttling, or something like that, with a huge pause. No messages arrive during that time. The length of the pause depends on the amount of unacked messages in the subscription.
It doesn't seem clear from the FlowControl docs.
Here is the code of the consumer (client):
var concurrentFlowsNumber = config.getLong(CONFIG_NUMBER_OF_THREADS);

var flowSettings = FlowControlSettings.newBuilder()
        .setMaxOutstandingElementCount(concurrentFlowsNumber)
        .setLimitExceededBehavior(FlowController.LimitExceededBehavior.Block)
        .build();

var subscriber = Subscriber.newBuilder(subscriptionName, receiver)
        .setCredentialsProvider(() -> serviceAccountCredentials)
        .setFlowControlSettings(flowSettings)
        .build();

subscriber.addListener(
        new Subscriber.Listener() {
            @Override
            public void failed(ApiService.State from, Throwable failure) {
                logger.error(failure);
            }
        },
        MoreExecutors.directExecutor());

var apiService = subscriber.startAsync();
apiService.addListener(new ApiService.Listener() {
    @Override
    public void running() {
        logger.info("Pubsub started");
    }

    @Override
    public void failed(ApiService.State from, Throwable failure) {
        logger.error("Pubsub failed on step: {}", from);
    }
}, Runnable::run);
And the message handler is:
private static void handlePubSubMessage(PubsubMessage message, AckReplyConsumer consumer) {
    new Timer().schedule(new TimerTask() {
        @Override
        public void run() {
            consumer.ack();
        }
    }, (long) 3000 + rand.nextInt(5000));
}
So, does anyone have any idea how to make the clients (many VMs) consume messages with concurrent-handling limitations (up to 4 concurrent messages) without running into these long pauses?
P.S. These questions are similar, but not the same:
Google pubsub flow control
pubsub Dynamic rate limiting
Cloud pubsub slow poll rate
Since you have a backlog building up, you might be running into this issue: https://cloud.google.com/pubsub/docs/pull#streamingpull_dealing_with_large_backlogs_of_small_messages
Your undelivered messages will get buffered between the Pub/Sub service and the client library. Messages might get stuck in a single client's buffer, or get redelivered to the same client if the ackDeadline was exceeded.
You can experiment with using synchronous pull, as suggested there.
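If you want to try that, here is a minimal synchronous-pull sketch based on the documented SubscriberStub API. The project and subscription names are placeholders, it assumes application-default credentials, and the 5-second sleep just mimics the handling time from the test:

import com.google.cloud.pubsub.v1.stub.GrpcSubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStubSettings;
import com.google.pubsub.v1.AcknowledgeRequest;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PullRequest;
import com.google.pubsub.v1.PullResponse;
import com.google.pubsub.v1.ReceivedMessage;

public class SyncPullSketch {
    public static void main(String[] args) throws Exception {
        // placeholders: replace with your own project and subscription ids
        String subscription = ProjectSubscriptionName.format("my-project", "my-subscription");

        SubscriberStubSettings settings = SubscriberStubSettings.newBuilder().build();
        try (SubscriberStub subscriber = GrpcSubscriberStub.create(settings)) {
            // pull at most 2 messages, matching the flow-control limit from the question
            PullRequest pullRequest = PullRequest.newBuilder()
                    .setSubscription(subscription)
                    .setMaxMessages(2)
                    .build();
            PullResponse pullResponse = subscriber.pullCallable().call(pullRequest);

            for (ReceivedMessage received : pullResponse.getReceivedMessagesList()) {
                // simulate the 5-second handling, then ack each message individually
                Thread.sleep(5000);
                AcknowledgeRequest ackRequest = AcknowledgeRequest.newBuilder()
                        .setSubscription(subscription)
                        .addAckIds(received.getAckId())
                        .build();
                subscriber.acknowledgeCallable().call(ackRequest);
            }
        }
    }
}

With synchronous pull you decide explicitly when to ask for the next batch, so nothing is buffered in the client beyond what you requested.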

GCP Pubsub high latency on low message/sec

I'm publishing Pub/Sub messages from the App Engine flexible environment with the Java client library, like this:
Publisher publisher = Publisher
        .newBuilder(ProjectTopicName.of(Utils.getApplicationId(), "test-topic"))
        .setBatchingSettings(
                BatchingSettings.newBuilder()
                        .setIsEnabled(false)
                        .build())
        .build();

publisher.publish(PubsubMessage.newBuilder()
        .setData(ByteString.copyFromUtf8(message))
        .putAttributes("timestamp", String.valueOf(System.currentTimeMillis()))
        .build());
I'm subscribing to the topic in Dataflow and logging how long it takes for a message to reach Dataflow from App Engine flexible:
pipeline
        .apply(PubsubIO.readMessagesWithAttributes().fromSubscription(Utils.buildPubsubSubscription(Constants.PROJECT_NAME, "test-topic")))
        .apply(ParDo.of(new DoFn<PubsubMessage, PubsubMessage>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
                long timestamp = System.currentTimeMillis() - Long.parseLong(c.element().getAttribute("timestamp"));
                System.out.println("Time: " + timestamp);
            }
        }));

pipeline.run();
When I'm publishing messages at a rate of a few messages per second, the logs show that the time needed for a message to reach Dataflow is between 100 ms and 1.5 seconds.
But when the rate is about 100 messages per second, the time is consistently between 100 ms and 200 ms, which seems totally adequate.
Can someone explain this behavior? It seems as if turning off publisher batching does not work.
Pub/Sub is designed for high-throughput messaging in both subscription cases.
Pull subscriptions work best when there is a large volume of messages; it's the kind of subscription you would use when throughput of message processing is your priority. Note especially that synchronous pull doesn't handle messages as soon as they are published, and can choose to pull and handle a fixed number of messages (more messages, more pulls). A better option would be asynchronous pull, which uses a long-running message listener and acknowledges one message at a time [1].
On the other hand, push subscriptions use a slow-start algorithm: the number of messages sent is doubled with each successful delivery until it reaches its constraints (more messages, more, and faster, deliveries).
[1] https://cloud.google.com/pubsub/docs/pull#asynchronous-pull

Does Spring Integration RabbitTemplate publish to persistent queue by default?

I have a scheduled task that performs the following bit of code:
try {
    rabbitTemplate.convertAndSend("TEST");
    if (!isOn()) {
        turnOn();
    }
}
catch (AmqpException e) {
    if (isOn()) {
        turnOff();
    }
}
Everything works just fine. It sends this message to the default "AMQP default" exchange. I do not have a consumer on the other end to consume these messages because I am just ensuring that the server is still alive. Will these messages accumulate over time and cause a memory leak?
Thanks!
K
Do you have a RabbitMQ user interface?
You should be able to see the queues that are being created and whether they are persistent or not. Last time I checked, the default behaviour of Spring AMQP is to create persistent queues.
Have a look at the RabbitMQ Management Plugin: http://www.rabbitmq.com/management.html
Using the RabbitMQ Management Plugin, you can also consume messages that you've published via your code.
Regarding what happens with the messages: they will keep piling up until RabbitMQ hits its limits, and then it will no longer accept messages until you purge the queue or consume them. With the default RabbitMQ settings, I was able to send about 4 million simple text messages to a queue before it started blocking.
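If you keep publishing health-check messages with no consumer, one way to stop them from accumulating is to declare the target queue with a TTL and a length limit. A hedged sketch using Spring AMQP's QueueBuilder; the queue name and the limits are illustrative, not values from the question:

import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HealthCheckQueueConfig {

    // hypothetical queue for the heartbeat messages; name and limits are examples
    @Bean
    public Queue healthCheckQueue() {
        return QueueBuilder.durable("health.check")
                // discard messages older than 60 seconds
                .withArgument("x-message-ttl", 60000)
                // keep at most 100 messages; RabbitMQ drops the oldest when the cap is hit
                .withArgument("x-max-length", 100)
                .build();
    }
}

With x-max-length in place, the heartbeat traffic can never grow unbounded even if nothing ever consumes it.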
