Chronicle queue POC returned unexpected latency - java

One of our systems has a microservice architecture using Apache Kafka as a service bus.
Low latency is a very important factor, but reliability and consistency (exactly once) are even more important.
When we performed some load tests we noticed significant performance degradation, and all investigations pointed to big increases in producer and consumer latencies on the Kafka topics. No matter how much configuration we changed or how many resources we added, we could not get rid of the symptoms.
At the moment our need is to process 10 transactions per second (TPS) and the load test exercises 20 TPS, but as the system evolves and adds more functionality we know we'll reach a stage where the need will be 500 TPS, so we started worrying whether we can achieve this with Kafka.
As a proof of concept I tried to switch one of our microservices to use a Chronicle Queue instead of a Kafka topic. It was easy to migrate following the Avro example from the Chronicle-Queue-Demo GitHub repo:
public class MessageAppender {

    private static final String MESSAGES = "/tmp/messages";

    private final AvroHelper avroHelper;
    private final ExcerptAppender messageAppender;

    public MessageAppender() {
        avroHelper = new AvroHelper();
        messageAppender = SingleChronicleQueueBuilder.binary(MESSAGES).build().acquireAppender();
    }

    @SneakyThrows
    public long append(Message message) {
        try (var documentContext = messageAppender.writingDocument()) {
            var paymentRecord = avroHelper.getGenericRecord();
            paymentRecord.put("id", message.getId());
            paymentRecord.put("workflow", message.getWorkflow());
            paymentRecord.put("workflowStep", message.getWorkflowStep());
            paymentRecord.put("securityClaims", message.getSecurityClaims());
            paymentRecord.put("payload", message.getPayload());
            paymentRecord.put("headers", message.getHeaders());
            paymentRecord.put("status", message.getStatus());
            avroHelper.writeToOutputStream(paymentRecord, documentContext.wire().bytes().outputStream());
            return messageAppender.lastIndexAppended();
        }
    }
}
After configuring that appender we ran a loop to produce 100_000 messages to a Chronicle queue. Every message has the same size and the final size of the file was 621 MB. It took 22 minutes, 20 seconds and 613 milliseconds (~1341 seconds) to write all the messages, an average of about 75 messages/second.
This was definitely not what we hoped for, and so far from the latencies advertised in the Chronicle documentation that it made me believe my approach was not the correct one. I admit that our messages are not small, at about 6.36 KB/message, but I have no doubt storing them in a database would be faster, so I still think I am not doing it right.
It is important that our messages are processed one by one.
Thank you in advance for your inputs and/or suggestions.

Hand building the Avro object each time seems a bit of a code smell to me.
Can you create a predefined message -> avro serializer and use that to feed the queue?
Or, just for testing, create one Avro object outside the loop and feed that one object into the queue many times. That way you can see if it is the building or the queuing which is the bottleneck.
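A minimal sketch of that isolation test, reusing the AvroHelper and writeToOutputStream from the question (the path, field values and message count are illustrative):

// Build the Avro record once, outside the loop, so only the queue append is measured.
var avroHelper = new AvroHelper();
var avroRecord = avroHelper.getGenericRecord();
avroRecord.put("id", "test-id"); // fill the remaining fields once with representative data

try (var queue = SingleChronicleQueueBuilder.binary("/tmp/messages-poc").build()) {
    var appender = queue.acquireAppender();
    long start = System.nanoTime();
    for (int i = 0; i < 100_000; i++) {
        try (var dc = appender.writingDocument()) {
            avroHelper.writeToOutputStream(avroRecord, dc.wire().bytes().outputStream());
        }
    }
    System.out.printf("100k appends: %d ms%n", (System.nanoTime() - start) / 1_000_000);
}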
More general advice:
Maybe attach a profiler and see if you are making an excessive number of object allocations, which is particularly bad if they are getting promoted to higher generations.
See if they are your objects or Chronicle Queue ones.
Is your code maxing out your RAM or CPU (or network)?

Related

Spring Boot kafkaTemplate consumer message load and processing message

In my application I am using the Spring Boot kafkaTemplate for consuming messages. I am new to Kafka with Spring Boot. I have added a consumer as below:
@KafkaListener(topics = "#{'${app.kafka.consumer.topic}'.split(',')}")
public void receivedMessage(ConsumerRecord<String, String> cr, @Payload String message) {
    log.info("Message received from topic {} ", cr.topic());
    // TODO
}
On a topic we will receive about 200K messages per second. Once I receive a message it is sent to another method for processing, which filters the messages based on certain criteria and then publishes the filtered messages to another topic.
My question is: will the above @KafkaListener method handle this load, or do I need any special treatment like threading or concurrency?
It depends entirely on your workload, the number of cores in your CPU, etc, etc. In general, increasing the concurrency will provide more throughput (as long as you have at least that many partitions on the topic).
But, your downstream code must be completely thread-safe.
Even then, there may be other bottlenecks in your code (DB etc).
If all you are doing is computation, transformation, and publishing to another topic, without doing any other I/O, increasing the concurrency should definitely help.
The only real solution is experimentation and, if you don't get the throughput you need, profile your application.
You can't learn such skills without actually doing it.
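For reference, a hedged sketch of how concurrency is usually raised in Spring Kafka, via the listener container factory; the bean names and the concurrency value are illustrative, not from the question:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        var factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
        factory.setConsumerFactory(consumerFactory);
        // One consumer thread per container; only effective up to the number of partitions on the topic.
        factory.setConcurrency(6);
        return factory;
    }
}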

Slow message retrieval from SQS

Given a Java AWS Lambda with the following code:
private static final String QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/<ACCT_NUMBER>/<QUEUE_NAME>";
private static final AmazonSQS client = AmazonSQSClientBuilder.standard().build();
private static final int MAX_SQS_MESSAGES = 10;
And:
private List<Message> getMessages() {
    return client.receiveMessage(new ReceiveMessageRequest().withQueueUrl(QUEUE_URL)
            .withMaxNumberOfMessages(MAX_SQS_MESSAGES)
            .withWaitTimeSeconds(1))
            .getMessages();
}
I experience rather "long" SQS retrieval times (considering the specified 1 second base for long polling), as sample evidence from logs:
Got 3 SQS msgs: 1985ms
Got 8 SQS msgs: 1887ms
Got 9 SQS msgs: 2438ms
Got 5 SQS msgs: 1748ms
Are those times within normal operation, or could I be doing something wrong, or is there something I could improve?
Maven dependency:
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-sqs</artifactId>
    <version>1.11.488</version>
</dependency>
Those are indeed very long delays, and something is wrong. With a non-empty queue, you should be able to get typical reads in the 5-500ms range (lower with more messages available). Even if your queue is empty, the request times should be topping out around a maximum of 1s based on your usage of withWaitTimeSeconds in the request.
There are a number of steps you can take to narrow down the problem:
Make sure the queue and the lambda are in the same region - I mention this first as I've seen so many latency issues caused by cross-region calls in AWS.
Make sure you have accurate request metrics. I don't see how you are measuring the metrics timing in your code, but I do see how you're constructing your client.
Create an implementation of RequestHandler2 that implements afterError and afterResponse, and examines the details of request.getAWSRequestMetrics()
Add that request handler to the client via clientBuilder.withRequestHandlers(RequestHandler2... handlers)
This will give you accurate details of how the request is spending its time, perhaps reveal some obvious problems, and may also point to a problem outside of the call to SQS (a minimal sketch follows these steps).
Make sure you are reusing your client (and not creating a fresh one each time) - consider logging each time your client is created. Under the hood of the client there's a lot of setup, and if it's using a fresh client every time, there might be a lot of time wasted there.
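A minimal sketch of the metrics handler described in those steps (AWS SDK v1); the log format is illustrative:

import com.amazonaws.Request;
import com.amazonaws.Response;
import com.amazonaws.handlers.RequestHandler2;

public class TimingRequestHandler extends RequestHandler2 {

    @Override
    public void afterResponse(Request<?> request, Response<?> response) {
        // TimingInfo breaks the call down into phases (signing, HTTP round trip, etc.)
        System.out.println(request.getServiceName() + " ok: " + request.getAWSRequestMetrics().getTimingInfo());
    }

    @Override
    public void afterError(Request<?> request, Response<?> response, Exception e) {
        System.out.println(request.getServiceName() + " failed: " + request.getAWSRequestMetrics().getTimingInfo());
    }
}

Registered once on the shared client:

private static final AmazonSQS client = AmazonSQSClientBuilder.standard()
        .withRequestHandlers(new TimingRequestHandler())
        .build();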

Mechanism for reading all messages from a queue in one request

I need a solution for the following scenario which is similar to a queue:
I want to write messages to a queue continuously. My message is very big, containing a lot of data so I do want to make as few requests as possible.
So my queue will contain a lot of messages at some point.
My consumer will read from the queue every hour (not whenever a new message is written), and it will read all the messages from the queue.
The problem is that I need a way to read ALL the messages from the queue using only one call (I also want the consumer to make as few requests to the queue as possible).
A close solution would be ActiveMQ but the problem is that you can only read one message at a time and I need to read them all in one request.
So my question is: would there be other ways of doing this more efficiently? The actual thing that I need is to persist, in some way, messages created continuously by some application and then consume them (and also delete them) with the same application all at once, every hour.
The reason I thought a queue would fit is that as the messages are consumed they are also deleted, but I need to consume them all at once.
I think there are some important things to keep in mind as you're searching for a solution:
In what way do you need to be "more efficient" (e.g. time, monetary cost, computing resources, etc.)?
It's incredibly hard to prove that there are, in fact, no other "more efficient" ways to solve a particular problem, as that would require one to test all possible solutions. What you really need to know is, given your specific use-case, what solution is good enough. This, of course, requires knowing specifically what kind of performance numbers you need and the constraints on acquiring those numbers (e.g. time, monetary cost, computing resources, etc.).
Modern message broker clients (e.g. those shipped with either ActiveMQ 5.x or ActiveMQ Artemis) don't make a network round-trip for every message they consume as that would be extremely inefficient. Rather, they fetch blocks of messages in configurable sizes (e.g. prefetchSize for ActiveMQ 5.x, and consumerWindowSize for ActiveMQ Artemis). Those messages are stored locally in a buffer of sorts and fed to the client application when the relevant API calls are made to receive a message.
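For example, with ActiveMQ 5.x the prefetch can be set on the connection URI (the broker address and size here are illustrative):

import javax.jms.ConnectionFactory;
import org.apache.activemq.ActiveMQConnectionFactory;

// The broker pushes up to 1000 messages into each consumer's local buffer ahead of receive() calls.
ConnectionFactory factory =
        new ActiveMQConnectionFactory("tcp://broker:61616?jms.prefetchPolicy.queuePrefetch=1000");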
Making "as few requests as possible" is rarely a way to increase performance. Modern message brokers scale well with concurrent consumers. Consuming all the messages with a single consumer drastically limits the message throughput as compared to spinning up multiple threads which each have their own consumer. Rather than limiting the number of consumer requests you should almost certainly be maximizing them until you reach a point of diminishing returns.

Limit the throughput of a Reactor Flux reading a Mongodb collection

I am using Spring 5, specifically the Reactor project, to read information from a huge Mongo collection into a Kafka topic. Unfortunately, the production of Kafka messages is much faster than the program that consumes them, so I need to implement some backpressure mechanism.
Suppose I want a throughput of 100 messages every second. Googling a little, I decided to combine the buffer(int maxSize) method with a Flux that emits a tick at a predefined interval, zipping the two.
// Create a clock that emits an event every second
final Flux<Long> clock = Flux.interval(Duration.ofMillis(1000L));

// Create a buffered producer
final Flux<ProducerRecord<String, Data>> outbound =
        repository.findAll()
                .map(this::buildData)
                .map(this::createKafkaMessage)
                .buffer(100)
                // Limiting the emission in time interval
                .zipWith(clock, (msgs, tick) -> msgs)
                .flatMap(Flux::fromIterable);

// Subscribe a Kafka sender
kafkaSender.createOutbound()
        .send(outbound)
        .then()
        .block();
Is there a smarter way to do this? I mean, it seems to me a little bit complex (the zip part, overall).
Yes, you can use the delayElements(Duration.ofSeconds(1)) operator directly on the buffered Flux, without needing to zipWith a clock. The Reactor project keeps adding enhancements with each release, so it's worth staying up to date. Hope this was helpful!
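In other words, based on the pipeline from the question, the zip/clock pair collapses into a single operator:

final Flux<ProducerRecord<String, Data>> outbound =
        repository.findAll()
                .map(this::buildData)
                .map(this::createKafkaMessage)
                .buffer(100)
                // Each buffer of 100 is delayed by one second, giving roughly 100 messages/second.
                .delayElements(Duration.ofSeconds(1))
                .flatMap(Flux::fromIterable);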

JMS (ActiveMQ) Performance

I have a Java application with a number of components communicating via JMS (ActiveMQ). Currently the application and the JMS Hub are on the same server although we eventually plan to split out the components for scalability. Currently we are having significant issues with performance, all seemingly around JMS, most notably, and the focus of this question is the amount of time it is taking to publish a message to a topic.
We have around 50 dynamically created topics used for communication between the components of the application. One component reads records from a table and processes them one at a time, the processing involves creating a JMS Object message and publishing it to one of the topics. This processing could not keep up with the rate at which records were being written to the source table ~23/sec, so we changed the processing to create the JMS Object message and add it to a queue. A new thread was created which read from this queue and published the message to the appropriate topic. Obviously this does not speed the processing up but it did allow us to see how far behind we were getting by looking at the size of the queue.
At the start of the day no messages are going through the whole system, this quickly ramps up from 1560000 (433/sec) messages through the hub in the first hour to 2100000 (582/sec) in the 3rd hour and then staying at that level. At the start of the first hour the message publishing from the component reading records from the database table keeps up however, by the end of that hour there are 2000 messages in the queue waiting to be sent and by the 3rd hour the queue has 9000 messages in it.
Below are the appropriate sections of the code which send the JMS messages; any advice on what we are doing wrong or how we can improve this performance is much appreciated. Looking at stats on the web, JMS should be able to easily handle ~1000-2000 large messages/sec or ~10000 small messages/sec. Our messages are around 500 bytes each, so I imagine they sit somewhere in the middle of that scale.
Code for getting the publisher:
private JmsSessionPublisher getJmsSessionPublisher(String topicName) throws JMSException {
    if (!this.topicPublishers.containsKey(topicName)) {
        TopicSession pubSession = (ActiveMQTopicSession) topicConnection.createTopicSession(false, Session.AUTO_ACKNOWLEDGE);
        ActiveMQTopic topic = getTopic(topicName, pubSession);
        // Create a JMS publisher and subscriber
        TopicPublisher publisher = pubSession.createPublisher(topic);
        this.topicPublishers.put(topicName, new JmsSessionPublisher(pubSession, publisher));
    }
    return this.topicPublishers.get(topicName);
}
Sending the message:
JmsSessionPublisher jmsSessionPublisher = getJmsSessionPublisher(topicName);
ObjectMessage objMessage = jmsSessionPublisher.getSession().createObjectMessage(messageObj);
objMessage.setJMSCorrelationID(correlationID);
objMessage.setJMSTimestamp(System.currentTimeMillis());
jmsSessionPublisher.getPublisher().publish(objMessage, false, 4, 0);
Code which adds messages to the queue:
List<EventQueue> events = eventQueueDao.getNonProcessedEvents();
for (EventQueue eventRow : events) {
    IEvent event = eventRow.getEvent();
    AbstractEventFactory.EventType eventType = AbstractEventFactory.EventType.valueOf(event.getEventType());
    String topic = event.getTopicName() + topicSuffix;
    EventMsgPayload eventMsg = AbstractEventFactory.getFactory(eventType).getEventMsgPayload(event);
    synchronized (queue) {
        queue.add(new QueueElement(eventRow.getEventId(), topic, eventMsg));
        queue.notify();
    }
}
Code in the thread removing items from the queue:
jmsSessionFactory.publishMessageToTopic(e.getTopic(), e.getEventMsg(), Integer.toString(e.getEventMsg().hashCode()));
publishMessageToTopic executes the 'Sending the message' code above.
Other JMS implementations are an option if the consensus is that ActiveMQ may not be the best option.
Thank you,
James
We do not use ActiveMQ, but we ran into similar issues, and we discovered that the issues were with the back-end processing and not with the Java side. There could be multiple issues here:
The program processing the messages from the queue could be slow (e.g. CICS on a mainframe) and might not be able to keep up with the messages that are sent to the queue. One possible solution for this is to increase the processing power (or optimize the back-end code which processes the messages).
Check the messages on the queue; sometimes there are lots of uncommitted poison messages on the queue. We use a separate queue for such messages.
It would be nice to know the answers to the questions asked by Karianna.
It's not 100% clear where you are experiencing the slow performance, but it sounds like what you are describing is slowness in publishing the messages. Are you creating a new publisher every time you publish a message? If so, this is terribly inefficient and you should consider creating one publisher and use it over and over to send messages. Furthermore, if you are sending persistent messages, then you are probably using synchronous sends to the broker. You might want to consider using asynchronous sends to speed things up. For more info, see the doc about Async Sends
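A short sketch of the async-send switch in ActiveMQ (the broker URL is illustrative); it can equally be enabled via the URL option jms.useAsyncSend=true:

ActiveMQConnectionFactory connectionFactory = new ActiveMQConnectionFactory("tcp://localhost:61616");
// Sends return without waiting for the broker's acknowledgement (trading some delivery guarantees for throughput).
connectionFactory.setUseAsyncSend(true);
TopicConnection topicConnection = connectionFactory.createTopicConnection();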
Also, how is the performance of the consumers? How many consumers are being used? Are they able to keep pace with the rate at which messages are being published?
Additionally, what is the broker configuration that you are using? Has it been tuned at all?
Bruce
Although this is an old question, there is one very important piece of advice missing:
Investigate the amount of topics and queues that you have.
ActiveMQ keeps subscription topics in separate threads. Particularly, when you have large amounts of different topics, this will drag down any server. Think about using JMS selectors instead.
I ran into a similar situation where I had thousands of market data messages per second. When I naively dumped each message into an instrument-specific channel, the server lasted about an hour before it started spitting out error messages to the message producers. I changed the design to have ONE channel, "MARKET_DATA", set header properties on all produced messages, and set a selector on the consumer side to select just the messages that I want. Note that my selector is in SQL-like syntax and runs on the server, though... (yeah, let's skip the CEP marketing hype bashing)...
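A rough sketch of that single-channel design with a selector, assuming a JMS session and publisher are already set up; the property name and value are illustrative:

// Producer side: one shared topic, each message tagged with the instrument it carries.
Topic marketData = session.createTopic("MARKET_DATA");
ObjectMessage msg = session.createObjectMessage(payload);
msg.setStringProperty("instrument", "EURUSD");
publisher.publish(msg);

// Consumer side: the SQL-like selector is evaluated on the broker,
// so only matching messages are delivered to this consumer.
MessageConsumer consumer = session.createConsumer(marketData, "instrument = 'EURUSD'");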
