I have been doing some performance tests with a Kafka cluster for my project, and I have a question regarding the send call and the 'acks' property of the producer. I observed the numbers below with the following invocation of the send call, which is a simple fire-and-forget call.
producer.send(record); // fire and forget call
The topic has 5 partitions, and I see the results below with different acks values and replication factors. The Kafka cluster has 5 nodes running with default values and using local disk.
acks      Replication factor = 1    Replication factor = 3
0         1330k msgs/sec            1260k msgs/sec
1         1220k msgs/sec            1200k msgs/sec
-1 (all)  1220k msgs/sec            325k msgs/sec
As you can see, as the acks value changes from 0 to all, the producer throughput decreases. What I am not able to understand is this: if the producer send call is fire and forget in nature (see above) and the producer is not waiting for any acknowledgements, then why does the producer throughput drop as we move to stronger acks guarantees?
Any insights into how acks and the producer send call work internally in Kafka would be greatly appreciated.
P.S. I had asked this on kafka users mailing list but didn't get a reply so asking this on SO.
The fact that you don't pass a callback to the send method doesn't mean that it's fire and forget at the underlying level.
You have configured the producer with 3 different ack levels, and it's the ack level that determines whether the send really is "fire and forget" or not.
With acks = 0, the producer sends the message but doesn't wait for any ack from the broker; it's the real "fire and forget". As you can see, it provides the highest throughput.
With acks = 1, the producer waits for the ack. This ack is sent by the broker (to which the producer is connected and that hosts the leader replica). It's not "fire and forget" of course.
With acks = -1 (all), the producer waits for the ack. This ack is sent by the broker as above, but only after the messages have been replicated to the follower replicas on the other brokers (at least min.insync.replicas of them). Of course, in this case the throughput decreases as you increase the replication factor, because the message needs to be copied by more brokers before the "leader" broker returns the ack to the producer.
Notice that with replication factor = 1, acks = 1 and acks = -1 have the same throughput, because there is just one replica (the leader), so there is nothing to copy to followers.
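To make this concrete, here is a minimal sketch of where the acks level is set; the broker address, topic name and class name are placeholders, not values from the question:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AcksDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // "0" = real fire and forget, "1" = wait for the leader, "all"/"-1" = wait for in-sync followers too
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // No callback and no get() on the Future: the call returns immediately,
            // but the client still has to receive the configured acks before the
            // batch is considered complete and its in-flight slot is freed.
            producer.send(new ProducerRecord<>("test-topic", "key", "value")); // placeholder topic
        }
    }
}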
This is about how Kafka handles the produce request.
First, KafkaProducer.send is asynchronous by default. The KafkaProducer does the heavy lifting of batching your records and sending them to the broker. The broker then acknowledges with a produce response, which, with acks = all, in turn has to wait for the in-sync follower replicas (governed by min.insync.replicas). That's the reason.
I think the accepted answer is wrong, because the question is about throughput and NOT latency. According to the Confluent book Kafka: The Definitive Guide:
If our client code waits for a reply from the server (by calling the get() method of the Future object returned when sending a message) it will obviously increase latency significantly (at least by a network roundtrip). If the client uses callbacks, latency will be hidden, but throughput will be limited by the number of in-flight messages (i.e., how many messages the producer will send before receiving replies from the server).
So for an asynchronous producer with acks=1 or acks=all, the throughput depends on max.in.flight.requests.per.connection: the maximum number of unacknowledged requests the client will send on a single connection before blocking.
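For illustration, a hypothetical producer configuration fragment that controls this bound (the value 5 is only an example):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("acks", "all");
// Throughput of an async producer is bounded by how many unacknowledged
// requests may be outstanding per broker connection:
props.put("max.in.flight.requests.per.connection", 5);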
Related
I have a requirement to implement a health check, and as part of that I have to find out whether the producer will be able to publish messages and the consumer will be able to consume them. For this I have to check that the connection to the cluster is working, which can be done using the "connection_count" metric, but that doesn't give the true picture, especially for the consumer, which is tied to the particular brokers that hold the partitions it reads from.
The situation with the producer is even trickier, as the producer might be publishing messages to any broker that holds a partition of the topic it is publishing to.
In a nutshell, how do I find the health of the relevant brokers on the producer/consumer side?
Ultimately, I divide the question into a few checks.
1. Can you reach the broker? AdminClient.describeCluster works for this.
2. Can you describe the topic(s) you are using? AdminClient.describeTopics can do that.
3. Is the ISR list for those topics at least min.insync.replicas? Extrapolate data from (2).
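A rough sketch of checks 1-3 with the AdminClient; the broker address, topic name and minimum ISR size are placeholders, not values from the question:

import java.util.*;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.TopicPartitionInfo;

public class KafkaHealthCheck {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // 1. Can we reach the cluster at all?
            int brokerCount = admin.describeCluster().nodes().get().size();
            System.out.println("Reachable brokers: " + brokerCount);

            // 2. Can we describe the topic(s) we depend on?
            Map<String, TopicDescription> topics =
                    admin.describeTopics(Collections.singletonList("my-topic")).all().get();

            // 3. Is every partition's ISR at least min.insync.replicas (assumed to be 2 here)?
            int minIsr = 2; // placeholder; read it from the broker/topic config in real code
            for (TopicDescription td : topics.values()) {
                for (TopicPartitionInfo p : td.partitions()) {
                    if (p.isr().size() < minIsr) {
                        System.out.println("Unhealthy partition: " + td.name() + "-" + p.partition());
                    }
                }
            }
        }
    }
}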
On the producer side, if you set at least acks=1 and the ack callback never fires, or (exposing JMX data around the buffer size) the producer's buffer isn't periodically flushed, then it is not healthy.
For the consumer, look at the conditions under which a rebalance will happen (such as long processing times between polls), then you can quickly identify what it means to be "unhealthy" for them. Attaching partition assignment + rebalance listeners can help here.
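For instance, a rebalance listener that just logs assignment changes could feed such an "unhealthy" signal; this is a fragment where consumer is an already-configured KafkaConsumer and the topic name is a placeholder:

consumer.subscribe(Collections.singletonList("my-topic"), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Frequent revocations often mean processing between polls is too slow
        System.out.println("Partitions revoked: " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        System.out.println("Partitions assigned: " + partitions);
    }
});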
I've implemented some of these concepts in:
dropwizard-kafka (also has Producer and Consumer checks)
remora
I would like to think Spring has something similar
In the Apache Kafka API documentation (the producer API docs) I found this description of the send() method:
“The send is asynchronous and this method will return immediately once the record has been stored in the buffer of records waiting to be sent. This allows sending many records in parallel without blocking to wait for the response after each one.”
I’m just wondering how the records are sent in parallel. If I have 3 brokers, each with 3 partitions of the same topic, will the Kafka producer send records to the 9 partitions in parallel? Or does the producer just send records to the 3 brokers in parallel? How does the producer work in a parallel way?
The Kafka client uses an org.apache.kafka.common.requests.ProduceRequest that can carry payloads for multiple partitions at once (see http://kafka.apache.org/protocol.html#The_Messages_Produce).
So it sends (using org.apache.kafka.clients.NetworkClient) three requests in parallel, one to each of the (three) brokers, i.e.:
- sends records for topic-partition0, topic-partition1, topic-partition2 to broker 1
- sends records for topic-partition3, topic-partition4, topic-partition5 to broker 2
- sends records for topic-partition6, topic-partition7, topic-partition8 to broker 3
You can control how much batching is done with producer configuration.
(Notice I answered assuming 9 unique partitions; if you meant replicated partitions, the producer sends only to the leader, and replication then handles the propagation.)
Yes, the producer will batch up the messages destined for each partition leader and send those batches in parallel. From the API docs:
The send() method is asynchronous. When called it adds the record to a buffer of pending record sends and immediately returns. This allows the producer to batch together individual records for efficiency.
and
The producer maintains buffers of unsent records for each partition. These buffers are of a size specified by the batch.size config. Making this larger can result in more batching, but requires more memory (since we will generally have one of these buffers for each active partition).
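As an illustration, the batching knobs referred to above are plain producer configs; the fragment below uses example values only, not recommendations:

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("key.serializer", StringSerializer.class.getName());
props.put("value.serializer", StringSerializer.class.getName());
props.put("batch.size", 32768); // bytes per per-partition buffer before it is considered full
props.put("linger.ms", 5);      // how long to wait for more records before sending a non-full batch

// Each send() only appends the record to the buffer of its target partition;
// the background sender thread groups ready buffers by leader broker and
// sends one produce request per broker in parallel.
KafkaProducer<String, String> producer = new KafkaProducer<>(props);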
I wanted to understand the relationship between the timeout in the poll() method and the configuration fetch.max.wait.ms. So, let's say I have the following configuration:
fetch.min.bytes= 50
fetch.max.wait.ms= 400 ms
timeout on poll method= 200 ms
So, consider I call the poll method with the above specified timeout. The consumer sends a fetch request to the Kafka broker that is the leader for that partition. The broker does not have enough bytes to send according to fetch.min.bytes, so it will wait up to 400 milliseconds for enough data to arrive before responding. But I have configured the timeout on the poll method to be 200 ms, so does that mean that, behind the scenes, when the fetch request is sent it only waits 200 ms for the server to respond and then terminates the connection?
Is that how it will turn out? In this scenario, would it be safe to say you would always configure your timeout to be:
timeout >= network latency + fetch.max.wait.ms
Also, does the Kafka consumer fetch records proactively? By that I mean: is the consumer busy fetching records under the hood while the user code is busy processing the records from the last poll() call, so as to reduce latency the next time poll() is called? If so, how does it maintain this internal buffer? Can we also configure the maximum size of this internal buffer?
Thank you in advance.
The timeout on poll allows you to do asynchronous processing. After subscribing to a set of topics, the consumer automatically joins the group when poll(long) is invoked. The poll API is designed to ensure consumer availability.
As long as the consumer continues to call poll, it will stay in the group and continue to receive messages from the partitions it was assigned.
Under the hood, the consumer sends periodic heartbeats to the server. If the consumer crashes or is unable to send heartbeats for a duration of session.timeout.ms, then the consumer will be considered dead and its partitions will be reassigned.
But we should be careful that the long value in poll(long) is not too long, because that makes the whole process synchronous. You can read the discussion at the link below:
https://github.com/confluentinc/confluent-kafka-dotnet/issues/142
fetch.max.wait.ms makes sure that whenever a fetch request is made, the server blocks the request for up to the specified time. This usually kicks in when there isn't sufficient data to immediately satisfy the requirement given by fetch.min.bytes.
Point 1: When there is a fetch request, the server blocks your fetch request for up to 400 ms if it does not have the 50 bytes:
fetch.min.bytes= 50
fetch.max.wait.ms= 400 ms
Point 2: Every 200 ms your consumer sends a heartbeat to avoid a rebalance by Kafka:
timeout on poll method= 200 ms
When Point 1 happens, your consumer is idle, but since you did Point 2 the heartbeat is sent every 200 ms, so a rebalance does not occur and you may perform some tasks asynchronously for the next 200 ms.
So setting the poll() timeout only makes sure that your consumer is not considered dead, while fetch.max.wait.ms tells the server how long it should wait when a fetch request comes in. What I mean to say is that there is no inherent dependency between the two parameters. poll() is more about the asynchronous way of doing things in your code.
Timeout is based purely on the poll().
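To make the relationship concrete, here is a sketch of the setup from the question; the broker address, group id and topic are placeholders, and poll(Duration) is simply the newer form of poll(long):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("group.id", "my-group");                // placeholder
props.put("key.deserializer", StringDeserializer.class.getName());
props.put("value.deserializer", StringDeserializer.class.getName());
props.put("fetch.min.bytes", 50);    // broker side: wait for at least 50 bytes...
props.put("fetch.max.wait.ms", 400); // ...but never longer than 400 ms

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("my-topic"));
while (true) {
    // Client side: poll() returns within ~200 ms with whatever the already
    // in-flight fetch requests have delivered so far (possibly nothing);
    // it does not cancel the outstanding fetch request on the broker.
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(200));
    records.forEach(r -> System.out.println(r.value()));
}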
Among the producer configs there are retries and max.in.flight.requests.per.connection. Suppose that retries > 0 and max.in.flight.requests.per.connection > 1.
Can messages arrive out of order within ONE partition of a topic (e.g. if the first message needs retries, but the second message is delivered to the broker on the first attempt)?
Or does out-of-order delivery only happen across several partitions of a topic, while within a partition the order is preserved?
If you set retries to more than 0 and max.in.flight.requests.per.connection to more than 1, then yes, messages can arrive out of order on the broker even if they are for the same partition.
You can also have duplicates if for example a message is correctly added to the Kafka logs and an error happens when sending the response back to the client.
Since Kafka 0.11, you can use the Idempotent producer to solve these 2 issues. See http://kafka.apache.org/documentation/#semantics
As per the latest documentation, with the idempotent producer you can have up to 5 max.in.flight.requests.per.connection and Kafka will still maintain ordering.
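For example, a configuration fragment enabling the idempotent producer (the values are illustrative):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("enable.idempotence", true); // de-duplicates retries and keeps per-partition order
props.put("acks", "all");              // required by the idempotent producer
props.put("retries", Integer.MAX_VALUE);
props.put("max.in.flight.requests.per.connection", 5); // must be <= 5 when idempotence is enabled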
I have a producer which sends persistent messages in batches to a queue leveraging JMS transaction.
I have tested and found that Producer Flow Control is applied when using a batch size of 1. I could see my producer being throttled as per the memory limit I have configured for the queue. Here's my Producer Flow Control configuration:
<policyEntry queue="foo" optimizedDispatch="true"
producerFlowControl="true" memoryLimit="1mb">
</policyEntry>
The number of pending messages in the queue stays under control, which I see as evidence of Producer Flow Control in action.
However, when the batch size is increased to 2, I found that this memory limit is not respected and the producer is NOT THROTTLED at all. The evidence is that the number of pending messages in the queue continues to increase until it hits the configured storeUsage limit.
I understand this might be because the messages are sent in asynchronous fashion when the batch size is more than 1 even though I haven't explicitly set useAsyncSend to true.
ActiveMQ's Producer Flow Control documentation mentions that to throttle asynchronous publishers, we need to configure Producer Window Size in the producer which shall force the Producer to wait for acknowledgement once the window limit is reached.
However, when I configured Producer Window Size in my producer and attempted to send messages in batches, an exception is thrown and no messages were sent.
This makes me think and ask this question, "Is it possible to configure Producer Window Size while sending persistent messages in batches?".
If not, then what is the correct way to throttle the producers who send persistent messages in batches?
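For context, here is a sketch of the kind of batched, transacted send described above, with the producer window size set on the connection factory; the broker URL, queue name, batch size and window size are placeholders, not my actual values:

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class BatchSender {
    public static void main(String[] args) throws JMSException {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616"); // placeholder broker URL
        // Bytes the producer may have in flight before waiting for broker acks (async sends)
        factory.setProducerWindowSize(1024 * 1024);

        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageProducer producer = session.createProducer(session.createQueue("foo"));
        producer.setDeliveryMode(DeliveryMode.PERSISTENT);

        int batchSize = 2; // a batch size > 1 makes the sends effectively asynchronous
        for (int i = 0; i < batchSize; i++) {
            producer.send(session.createTextMessage("payload " + i));
        }
        session.commit(); // the batch is handed to the broker when the transaction commits

        connection.close();
    }
}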
There is not really a way to throttle "max msgs per second" or similar. What you would do is to enable producer flow control and vm cursor, then set the memory limit on that queue (or possibly all queues if you wish) to some reasonable level.
You can decide in the configuration if the producer should hang or throw an exception if the queue memory limit has been reached.
<policyEntry queue="MY.BATCH.QUEUE" memoryLimit="100mb" producerFlowControl="true">
<pendingQueuePolicy>
<vmQueueCursor/>
</pendingQueuePolicy>
</policyEntry>
I found this problem in v5.8.0, but it is resolved in v5.9.0 and above.
From v5.9.0 onwards, I found that PFC is applied out of the box even for producers that send messages asynchronously.
Since batch send (where batch size > 1) is essentially an asynchronous operation, this applies there as well.
But the PFC wiki was confusing as it mentions that one should configure ProducerWindowSize for async producers if PFC were to be applied. However, I tested and verified that this was not needed.
I basically configured a per-destination limit of 1mb and sent messages in batches (with batch size of 100).
My producer was throttled out of the box without any additional configuration. The number of pending messages in the queue didn't increase and was under control.
With a simple Camel consumer consuming the messages (and appending them to a file), I found that with v5.8.0 (where I faced the problem), I could send 100k messages with the payload being 2k in 36 seconds. But most of them ended up as Pending messages.
But with v5.9.0, it took 176 seconds to send the same set of messages, testifying to the role played by PFC. And the number of pending messages never rose beyond 1000 in my case.
I also tested with v5.10.0 and v5.12.0 (the latest version at the time of writing) which worked as expected.
So if you are facing this problem, chances are that you are running ActiveMQ v5.8.0 or earlier. Simply upgrading to the latest version should solve this problem.
I thank the immensely helpful ActiveMQ mailing list folks for all their suggestions and help.
Thanks @Petter for your answer too. Sorry I didn't mention the version I was using in my question, otherwise I believe you could have spotted the problem straight away.