Is the Kafka producer flush method atomic? - java

I am using a JDBC source connector and recently encountered a data duplication issue in Kafka. After some debugging, I found out that the producer was throwing the following error:
WorkerSourceTask{id=connect-name-0} Failed to flush, timed out while waiting for producer to flush outstanding 6590 messages (org.apache.kafka.connect.runtime.WorkerSourceTask:509)
After following this question, I changed the flush timeout and the error has gone away for now.
But I still don't understand why the data was duplicated. Here is my understanding:
Let's assume there are 20k rows in the source database, the connector starts at offset 0, the batch size is 1000, the poll interval is 2 hours, the flush interval is 1 minute, and the flush timeout is 5 seconds.
The connector/task executes the query and starts fetching rows from the DB, 1000 at a time.
The buffer memory is 32MB, so the task buffers the data and tries to flush it every minute. Assume 15k rows are buffered.
Because the data is huge, the task fails to flush the data and commit the offsets within 5 seconds.
Now there are 3 scenarios:
If flush is atomic, no offsets will be committed, so there is no chance of duplication when the task restarts before the next commit cycle.
The connector pushes data to the topic but fails to commit the offset because of the flush timeout, so out of 15k rows, 10k were written to the topic but the committed offset is still 0.
If the flush times out, it rolls back the data written as well as the offset commit.
Which scenario is applicable in this case?
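For reference, here is roughly how I would express the settings assumed in the scenario above as Connect worker and JDBC source connector properties. The property names are my assumption of what the question is describing, and the values are illustrative only:

# Connect worker properties (what I take "flush interval" and "flush timeout" to mean)
offset.flush.interval.ms=60000    # try to commit source offsets every 1 minute
offset.flush.timeout.ms=5000      # wait at most 5 seconds for outstanding records to flush

# JDBC source connector properties
poll.interval.ms=7200000          # query the source table every 2 hours
batch.max.rows=1000               # fetch at most 1000 rows per query batch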

Related

Kafka consumer on Kubernetes stops with commit offset error with no retry

I have a Java consumer running in Kubernetes (consumer v2.7, protocol: sticky) that accesses Kafka (3.0.0) in the same kube cluster. Recently I put the consumer pods on ephemeral nodes, so the pods now restart quite often, several times a day; that by itself shouldn't be a problem. Except that I've noticed consumption quite often stops during restarts, with errors like this:
Unexpected error caught when committing consumer offsets before shutdown.
<#903c7708> o.a.k.c.c.CommitFailedException: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
[Consumer clientId=consumer-..., groupId=...] Offset commit failed on partition my-topic-event-3 at offset 102006227: The coordinator is not aware of this member.
I've seen that this kind of error can be linked to long batch processing times, except that messages are processed in 40 ms on average (it depends on the topic consumed) and we have session.timeout.ms = 10000 and heartbeat.interval.ms = 3000.
I thought of starting by changing the session timeout to 45 seconds, as this is the default value in recent versions of the client.
I also thought of trying to increase the polling time max.poll.interval.ms or decrease the number of records per poll(), because sometimes we have to wait for a third-party API's ack before committing in Kafka.
Another potential optimization would be to switch consumers from the sticky assignor I currently use to one more suited to Kubernetes and the volatility of consumer pods (CooperativeStickyAssignor, for instance?).
Does anyone have an opinion on this? How to further investigate these concerns?
Thanks
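For reference, a minimal sketch of the consumer settings being considered above; this is only an illustration, not a recommendation, and the bootstrap address, group id, topic and values are placeholders:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerSettingsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");   // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");              // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "45000");       // 45 s, the default in recent clients
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000");    // headroom for slow third-party acks
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");           // fewer records per poll()
        // Incremental (cooperative) rebalancing, often gentler when pods restart frequently.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, CooperativeStickyAssignor.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic-event")); // placeholder topic
            // poll/process/commit loop omitted; the point here is only the configuration
        }
    }
}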

For how long does a kafka producer stay alive between messages?

I am opening a kafka producer with config properties -
KafkaProducer<String, MyValue> producer = new KafkaProducer<String, MyValue>(kafkaProperties);
then sending records synchronously using - (so as to avoid batching and also maintain the original message order)
//create myValue instance (omitted for simplicity)
//create myRecord instance using the topic name and myValue
producer.send(myRecord).get();
producer.flush(); //send the message as soon as the record is available to the producer
Now my issue is: I have several records to send, and between sends I might have to wait for long periods - a few minutes to hours (for whatever reason, at least to explore and understand Kafka better).
I want to know how long the producer's connection with the cluster/bootstrap server will stay alive. Is there any way I can configure it using the producer configurations?
(In-depth explanations will be greatly appreciated - even if it has to go down to the TCP connection level, you are welcome.)
(Kafka consumers have a heartbeat concept. Do producers have a similar concept? A Google search for "kafka producer heartbeat.interval.ms" returned only results for the consumer.)
The KafkaProducer.send method is asynchronous; by default it adds records to buffer memory and sends them together, so according to the docs the producer establishes the connection while sending the batch to the cluster:
The send() method is asynchronous. When called it adds the record to a buffer of pending record sends and immediately returns. This allows the producer to batch together individual records for efficiency.
The producer maintains buffers of unsent records for each partition. These buffers are of a size specified by the batch.size config. Making this larger can result in more batching, but requires more memory (since we will generally have one of these buffers for each active partition).
By default a buffer is available to send immediately even if there is additional unused space in the buffer. However if you want to reduce the number of requests you can set linger.ms to something greater than 0.
This will instruct the producer to wait up to that number of milliseconds before sending a request in hope that more records will arrive to fill up the same batch. This is analogous to Nagle's algorithm in TCP.
For example, in the code snippet above, likely all 100 records would be sent in a single request since we set our linger time to 1 millisecond. However this setting would add 1 millisecond of latency to our request waiting for more records to arrive if we didn't fill up the buffer.
Note that records that arrive close together in time will generally batch together even with linger.ms=0 so under heavy load batching will occur regardless of the linger configuration; however setting this to something larger than 0 can lead to fewer, more efficient requests when not under maximal load at the cost of a small amount of latency.
From the KafkaProducer.flush docs: invoking flush doesn't mean the producer sends each record to the cluster individually; invoking flush makes all buffered records immediately available to send:
Invoking this method makes all buffered records immediately available to send (even if linger.ms is greater than 0) and blocks on the completion of the requests associated with these records. The post-condition of flush() is that any previously sent record will have completed (e.g. Future.isDone() == true). A request is considered completed when it is successfully acknowledged according to the acks configuration you have specified or else it results in an error.
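To make this concrete, below is a minimal sketch of the synchronous pattern from the question. To my knowledge, the producer setting that governs how long an idle connection to a broker stays open is connections.max.idle.ms (9 minutes by default); after that the connection is closed and re-established transparently on the next send. The bootstrap address, topic and String key/value are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SyncProducerSketch {
    public static void main(String[] args) throws Exception {
        Properties kafkaProperties = new Properties();
        kafkaProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        kafkaProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        kafkaProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Close idle broker connections after this long (default 540000 ms = 9 minutes).
        kafkaProperties.put(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, "540000");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(kafkaProperties)) {
            // Blocking on get() waits for the broker ack of each record before sending the next,
            // which avoids batching several records together and preserves the original order.
            producer.send(new ProducerRecord<>("my-topic", "key", "value")).get();
            // flush() forces any buffered records out; after a blocking get() there is nothing left to flush here.
            producer.flush();
        }
    }
}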

Apache kafka - manual acknowledge(AbstractMessageListenerContainer.AckMode.MANUAL) not working and events replayed on library upgrade

Kafka events are getting replayed to the consumer repeatedly. I can see the following exception -
5-Nov-2019 10:43:25 ERROR [org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run : 685] :: org.springframework.kafka.KafkaListenerEndpointContainer#2-0-C-1 :: :: Container exception
org.apache.kafka.clients.consumer.CommitFailedException:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member.
This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing.
You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
But in my case it's just 1 message which takes more than 30 minutes to process, so we acknowledge it on receipt. So I don't think the number of records is the issue. I know it can be solved by increasing max.poll.interval.ms, but it used to work before the upgrade; I am trying to figure out the optimal workaround.
I tried AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE, which seems to commit the offset immediately and works, but I need to figure out why AbstractMessageListenerContainer.AckMode.MANUAL fails now.
Previous working jar versions:
spring-kafka-1.0.5.RELEASE.jar
kafka-clients-0.9.0.1.jar
Current versions (getting above exception):
spring-kafka-1.3.9.RELEASE.jar
kafka-clients-2.2.1.jar
Yes, you must increase max.poll.interval.ms; you can use MANUAL_IMMEDIATE instead to commit the offset immediately (with MANUAL, the commit is enqueued and is not actually performed until the thread exits the listener).
However, this will still not prevent a rebalance because Kafka requires the consumer to call poll() within max.poll.interval.ms.
So I suggest you switch to MANUAL_IMMEDIATE and increase the interval beyond 30 minutes.
With the old version (before 1.3), there were two threads - one for the consumer and one for the listener - so the queued ack was processed earlier. That was a very complicated threading model, which was much simplified in 1.3 thanks to KIP-62, but this side effect was the result.
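As an illustration of that suggestion, here is a minimal sketch of a container factory configured with MANUAL_IMMEDIATE and a max.poll.interval.ms above the 30-minute processing time. The bean, bootstrap address, group id, deserializers and the specific values are assumptions for the sketch, not taken from the question:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.AbstractMessageListenerContainer;

@Configuration
public class KafkaConsumerConfigSketch {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // required for MANUAL ack modes
        // Allow more than 30 minutes between poll() calls so the long-running listener
        // does not trigger a rebalance.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 2400000); // 40 minutes

        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(props));
        // Commit the offset as soon as Acknowledgment.acknowledge() is called,
        // instead of queuing it until the listener thread returns.
        factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
        return factory;
    }
}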

"Commit failed for offsets" while committing offset asynchronously

I have a Kafka consumer consuming data from a particular topic, and I am seeing the exception below. I am using Kafka version 0.10.0.0.
LoggingCommitCallback.onComplete: Commit failed for offsets= {....}, eventType= some_type, time taken= 19ms, error= org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
I added these two extra consumer properties, but it still didn't help:
session.timeout.ms=20000
max.poll.records=500
I am committing offsets in a different background thread as shown below:
kafkaConsumer.commitAsync(new LoggingCommitCallback(consumerType.name()));
What does that error mean and how can I resolve it? Do I need to add some other consumer properties?
Yes, lower max.poll.records. You'll get smaller batches of data, but the more frequent calls to poll() that result will help keep the session alive.
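As an illustration, here is a minimal sketch of a consume/commit loop with a smaller max.poll.records; the address, group id, topic and the value of 50 are placeholders. Note that, as far as I know, KafkaConsumer is not safe for use from multiple threads, so the commitAsync below is called from the polling thread rather than a separate background thread:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SmallBatchConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "20000");
        // Fewer records per poll() means each batch is processed faster, so poll() is called more often.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "50");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("some-topic")); // placeholder topic
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                // process the record ...
            }
            // Commit asynchronously from the polling thread; the callback can log failures.
            consumer.commitAsync((offsets, exception) -> {
                if (exception != null) {
                    System.err.println("Commit failed for offsets " + offsets + ": " + exception);
                }
            });
        }
    }
}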

Should I COMMIT after every execute batch?

I have a file with 1 trillion records. The batch size is 1000, after which the batch is executed.
Should I commit after each batch? Or commit just once after all 1 trillion records have been executed in batches of 1000?
{
    // Loop for 1 trillion records
    statement.addBatch();
    if (++count % 1000 == 0)
    {
        statement.executeBatch();
        // SHOULD I COMMIT HERE AFTER EACH BATCH ???
    }
} // End Loop
// SHOULD I COMMIT HERE ONCE ONLY ????
A commit marks the end of a successful transaction. So the commit should theoretically happen after all rows have been executed successfully.
If the executed statements are completely independent, then every one should have its own commit (in theory).
But there may be limitations in the database system that require splitting the rows into several batches, each with its own commit. Since a database has to reserve space to be able to roll back uncommitted changes, the "cost" of a huge transaction may be very high.
So the answer is: It depends on your requirements, your database and environment.
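As an illustration of the commit-per-batch option, here is a minimal JDBC sketch; the connection URL, table, column and the sourceRecords() helper are placeholders, and the batch size and commit frequency should be tuned for your database:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchCommitSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:your-db-url", "user", "password")) { // placeholder URL
            conn.setAutoCommit(false); // take control of transaction boundaries
            try (PreparedStatement statement = conn.prepareStatement("INSERT INTO my_table (col) VALUES (?)")) {
                long count = 0;
                for (String value : sourceRecords()) { // placeholder for reading the records file
                    statement.setString(1, value);
                    statement.addBatch();
                    if (++count % 1000 == 0) {
                        statement.executeBatch();
                        conn.commit(); // commit per batch keeps each transaction (and its rollback space) small
                    }
                }
                statement.executeBatch(); // flush the final partial batch
                conn.commit();
            }
        }
    }

    // placeholder helper standing in for the record source
    static Iterable<String> sourceRecords() {
        return java.util.Collections.emptyList();
    }
}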
Mostly it depends on what you want to achieve; usually you have to compromise on something to gain something else. For example, using a stored procedure I am deleting 3 million records that are no longer being accessed by my users.
If I execute the delete query all at once, the table lock gets escalated and my other users start getting timeout issues in our applications, because SQL Server has locked the table to give the deletion process better performance (I know the question is not specific to SQL Server, but this may help debug the problem). If you have such a case, you should never go for a batch bigger than 5000 rows. (See Lock Escalation Threshold.)
With my current plan, I am deleting 3000 rows per batch, and only key locks occur, which is good; I commit after every half a million records processed.
So, if you do not want simultaneous users hitting the table, you can delete a huge number of records in one go, provided your database server has enough log space and processing power, but 1 trillion records is a mess. You are better off proceeding with batch-wise deletion; or, if the 1 trillion records are all the records in the table and you want to delete all of them, then I'd suggest going for TRUNCATE TABLE.
