I am trying to run two consumers subscribed to two different topics. Both consumer programs run properly when run one at a time, but when I run them at the same time, one consumer always throws the exception:
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
I followed the suggestions and set max.poll.records to 2, session.timeout.ms to 30000, and heartbeat.interval.ms to 1000.
Below is my consumer function. The function is the same in both files; only the topic name changes to Test2. I am running these two functions in two different classes, both at the same time.
public void consume() {
    // Kafka consumer configuration settings
    List<String> topicNames = new ArrayList<String>();
    topicNames.add("Test1");

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "false");
    props.put("session.timeout.ms", "30000");
    props.put("heartbeat.interval.ms", "1000");
    props.put("max.poll.records", "2");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    consumer.subscribe(topicNames);
    try {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("Record: " + record.value());
                String responseString = "successfull";
                if (responseString.equals("successfull")) {
                    consumer.commitSync();
                }
            }
        }
    }
    catch (Exception e) {
        LOG.error("Exception: ", e);
    }
    finally {
        consumer.close();
    }
}
Due to this error, the consumed offsets are not getting committed back to Kafka.
How do I overcome this error?
In your case you need to assign a different group ID to each consumer. Creating two consumers with the same group ID is fine in itself, but having members of the same group subscribe to different topics is not: every time one of them joins or changes its subscription, the whole group rebalances, which is what triggers the CommitFailedException.
You are able to run one consumer at a time because then only one subscription is active within the group.
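For example, a minimal sketch of the fix (the group names group-test1 and group-test2 are made up, and buildProps() stands in for the property setup you already have in your consume() method):
// Consumer for Test1: its own group, so a rebalance in one group never affects the other.
Properties props1 = buildProps();
props1.put("group.id", "group-test1");
KafkaConsumer<String, String> consumer1 = new KafkaConsumer<>(props1);
consumer1.subscribe(Collections.singletonList("Test1"));

// Consumer for Test2: identical except for group.id and topic.
Properties props2 = buildProps();
props2.put("group.id", "group-test2");
KafkaConsumer<String, String> consumer2 = new KafkaConsumer<>(props2);
consumer2.subscribe(Collections.singletonList("Test2"));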
If you need any further help, let me know. Happy to help.
I have a topic with a single partition and two consumer processes that form a consumer group.
This way a message is always delivered to a single consumer. The StickyAssignor is used to prefer consumers that are already assigned to a partition on rebalance.
I've been playing with this setup a bit and discovered that under certain circumstances messages are delivered to both consumers, which defeats the purpose of the consumer group.
The scenario is as follows:
Start consumer C1
C1 begins receiving messages
Start consumer C2
C2 doesn't receive any messages - thanks to StickyAssignor strategy that prefers C1
Freeze the C1 process (using a Java debugger, stopping all threads)
C2 takes over - begins receiving messages
Unfreeze C1 process
Now both C1 and C2 receive messages despite being in the same group
When using RangeAssignor/RoundRobinAssignor, this does not happen.
Am I missing something or is this a bug in Kafka?
Here's my consumer code:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("client.id", consumerId);
props.put("enable.auto.commit", "false");
props.put("auto.commit.interval.ms", "1000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("partition.assignment.strategy", StickyAssignor.class.getName());

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singleton("my-events"));

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records)
        System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
}
I need to pull data from a Kafka consumer to pass on to my application. Below is the code I have written to access the consumer:
public class ConsumerGroup {
    public static void main(String[] args) throws Exception {
        String topic = "kafka_topic";
        String group = "0";

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", group);
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("session.timeout.ms", "30000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        consumer.subscribe(Arrays.asList(topic));
        System.out.println("Subscribed to topic: " + topic);

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
        }
    }
}
When I run this code, sometimes data comes through and sometimes nothing is generated. Why is this behavior inconsistent? Is there an issue with my code?
Your code is OK. You have the autocommit option enabled, so after you read records they are automatically committed to Kafka. Every time you run the code you start from the last processed offset, which is stored in the __consumer_offsets topic, so you only ever read the new records that arrived after the previous run. To see output continuously in the consumer app, you need to keep putting new records into the topic.
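If instead you want to re-read the whole topic on every run, one option is to rewind after the first poll has assigned partitions. A minimal sketch, assuming a client version in which seekToBeginning accepts a collection of partitions (props configured as in your question):
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("kafka_topic"));

consumer.poll(100);                              // first poll triggers the partition assignment
consumer.seekToBeginning(consumer.assignment()); // rewind every assigned partition

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records)
        System.out.printf("offset = %d, value = %s%n", record.offset(), record.value());
}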
I am new to Kafka and I have a question that I'm not able to resolve.
I have installed Kafka and Zookeeper on my own computer on Windows (not Linux) and have created a broker with a topic with several partitions (playing with between 6 and 12 partitions).
When I create consumers, they work perfectly and read at a good speed, but as for the producer, I created the simple producer one sees on many websites. The producer is inside a loop and sends many short messages (about 2000 very short messages).
I can see that the consumers read the 2000 messages very quickly, but the producer sends messages to the broker at roughly 140 to 150 messages per second. As I said before, I'm working on my own laptop (only one disk), but when I read about millions of messages per second, I think there is something I have missed, because I'm light-years away from that.
If I use more producers, the result is worse.
Is it a matter of more brokers on the same node or something like that? This problem has been handed to me at my job and I have no possibility of a better computer.
The code for creating the producer is:
public class Producer {
    public void publica(String topic, String strKey, String strValue) {
        Properties configProperties = new Properties();
        configProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        configProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
        configProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(configProperties);
        ProducerRecord<String, String> rec = new ProducerRecord<String, String>(topic, strValue);
        producer.send(rec);
    }
}
and the code for sending messages is (partial):
Producer prod = new Producer();
for (int i = 0; i < 2000; i++) {
    key = String.valueOf(i);
    prod.publica("TopicName", key, texto + " - " + key);
    // System.out.println(i + " - " + System.currentTimeMillis());
}
You may create your Kafka producer once and use it every time you need to send a message:
public class Producer {
    private final KafkaProducer<String, String> producer;

    public Producer(Properties configProperties) {
        // Create the producer once; reuse it for every send.
        this.producer = new KafkaProducer<String, String>(configProperties);
    }

    public void publica(String topic, String strKey, String strValue) {
        ProducerRecord<String, String> rec = new ProducerRecord<String, String>(topic, strValue);
        producer.send(rec);
    }
}
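Your sending loop can then reuse a single instance, for example (a sketch; texto is the message text from your original loop):
Properties configProperties = new Properties();
configProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
configProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
configProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

Producer prod = new Producer(configProperties); // one producer instance for all 2000 messages
for (int i = 0; i < 2000; i++) {
    String key = String.valueOf(i);
    prod.publica("TopicName", key, texto + " - " + key);
}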
Also take a look at the producer and broker configurations available here. There are several options you can tune to your application's needs.
I am using Kafka 0.10.2 and am now facing a CommitFailedException like:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
I have set max.poll.interval.ms to Integer.MAX_VALUE, so can anyone tell me why this still happens even though I have set that value?
Another question is:
I followed the description and set session.timeout.ms to 60000, and it still happens. I tried to reproduce it with a simple piece of code:
public static void main(String[] args) throws InterruptedException {
    Logger logger = Logger.getLogger(KafkaConsumer10.class);
    logger.info("XX");

    Properties props = new Properties();
    props.put("bootstrap.servers", "kafka-broker:9098");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "true");
    props.put("auto.commit.interval.ms", "1000");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("max.poll.interval.ms", "300000");
    props.put("session.timeout.ms", "10000");
    props.put("max.poll.records", "2");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("t1"));
    while (true) {
        Thread.sleep(11000);
        ConsumerRecords<String, String> records = consumer.poll(100);
        //Thread.sleep(11000);
        Thread.sleep(11000);
        for (ConsumerRecord<String, String> record : records)
            System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
    }
}
When I set session.timeout.ms to 10000, I try to sleep for more than 10000 ms in my poll loop, but it seems to work and no exception comes out, so I'm confused about this. If the heartbeat is triggered by consumer.poll and consumer.commit, the heartbeat should fall outside the session timeout in my code. Why is no CommitFailedException thrown?
session.timeout.ms set on the consumer should be less than the group.max.session.timeout.ms set on the Kafka broker.
This resolved the issue for me.
Credit to github link Commit Failures
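As an illustration (the concrete values here are made up, and the broker default for group.max.session.timeout.ms has varied across Kafka versions, so check your broker's server.properties):
// Consumer side: keep session.timeout.ms at or below the broker's limit.
Properties props = new Properties();
props.put("session.timeout.ms", "60000");    // must not exceed group.max.session.timeout.ms
props.put("heartbeat.interval.ms", "20000"); // typically about a third of the session timeout

// Broker side (server.properties):
// group.max.session.timeout.ms=300000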
For this you need to handle rebalancing in your code: process the in-flight messages and commit them before the rebalance hands their partitions to another member.
Like:
private class HandleRebalance implements ConsumerRebalanceListener {
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // Implement what you want to do once rebalancing is done.
    }

    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Commit the current offsets here, before the partitions are reassigned.
    }
}
and use this syntax to subscribe to the topic:
kafkaConsumer.subscribe(topicNameList, new HandleRebalance());
The advantages of doing this:
Messages will not be repeated when rebalancing takes place.
No commit-failed exceptions.
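A slightly fuller sketch of the pattern (the currentOffsets map, the consumer variable, and process() are illustrative, not part of any fixed API; the idea is to track the next offset to consume per partition and commit it synchronously when partitions are revoked):
Map<TopicPartition, OffsetAndMetadata> currentOffsets = new HashMap<>();

class HandleRebalance implements ConsumerRebalanceListener {
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // Nothing special needed here for this pattern.
    }

    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Commit what has been processed so far, so the new owner
        // of these partitions starts in the right place.
        consumer.commitSync(currentOffsets);
    }
}

// In the poll loop, record the next offset to consume for each partition:
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        process(record); // your processing logic
        currentOffsets.put(new TopicPartition(record.topic(), record.partition()),
                           new OffsetAndMetadata(record.offset() + 1)); // +1: next record to read
    }
}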
I am a student researching and playing around with Kafka. After following the examples in the Apache documentation, I'm experimenting with the examples portion in the trunk of their current GitHub repo.
As of right now, the example implements an 'older' version of their consumer and does not employ the new KafkaConsumer. Following the documentation, I have written my own version using the new KafkaConsumer, thinking that it would be faster.
This is a vague question, but on a run-through I produce 5000 simple messages such as "Message_CurrentMessageNumber" to a topic "test" and then use my consumer to fetch these messages and print them to stdout. When I run the example code with the provided consumer replaced by the newer KafkaConsumer (v0.8.2 and up), it works quickly and comparably to the example on its first run-through, but slows down considerably any time after that.
I notice that my Kafka server often outputs
Rebalancing group group1 generation 3 (kafka.coordinator.ConsumerCoordinator)
or similar messages, which leads me to believe that Kafka has to do some sort of load balancing that slows things down. I was wondering if anyone else had insight into what I am doing wrong.
public class AlternateConsumer extends Thread {
    private final KafkaConsumer<Integer, String> consumer;
    private final String topic;
    private final Boolean isAsync = false;

    public AlternateConsumer(String topic) {
        Properties properties = new Properties();
        properties.put("bootstrap.servers", "localhost:9092");
        properties.put("group.id", "newestGroup");
        properties.put("partition.assignment.strategy", "roundrobin");
        properties.put("enable.auto.commit", "true");
        properties.put("auto.commit.interval.ms", "1000");
        properties.put("session.timeout.ms", "30000");
        properties.put("key.deserializer", "org.apache.kafka.common.serialization.IntegerDeserializer");
        properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumer = new KafkaConsumer<Integer, String>(properties);
        consumer.subscribe(topic);
        this.topic = topic;
    }

    public void run() {
        while (true) {
            ConsumerRecords<Integer, String> records = consumer.poll(100);
            for (ConsumerRecord<Integer, String> record : records) {
                System.out.println("We received message: " + record.value() + " from topic: " + record.topic());
            }
        }

        // ConsumerRecords<Integer, String> records = consumer.poll(0);
        // for (ConsumerRecord<Integer, String> record : records) {
        //     System.out.println("We received message: " + record.value() + " from topic: " + record.topic());
        // }
        // consumer.close();
    }
}
To start:
package kafka.examples;

public class KafkaConsumerProducerDemo implements KafkaProperties {
    public static void main(String[] args) {
        final boolean isAsync = args.length > 0 ? !args[0].trim().toLowerCase().equals("sync") : true;
        Producer producerThread = new Producer("test", isAsync);
        producerThread.start();

        AlternateConsumer consumerThread = new AlternateConsumer("test");
        consumerThread.start();
    }
}
The producer is the default producer located here: https://github.com/apache/kafka/blob/trunk/examples/src/main/java/kafka/examples/Producer.java
This should not be the case. If the setup is similar between your two consumers, you should expect better results with the new consumer, unless there is an issue in the client/consumer implementation, which seems to be the case here.
Can you share your benchmark results and the frequency of the reported rebalancing, and/or any pattern you are observing (e.g. sluggish only at startup, after a fixed amount of message consumption, after the queue is drained, etc.)? Also, could you share some details about your consumer implementation?
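For reference, a minimal way to measure the consumption rate inside the poll loop (a sketch; it simply counts records per one-second window and assumes the consumer variable from your AlternateConsumer):
long count = 0;
long windowStart = System.currentTimeMillis();

while (true) {
    ConsumerRecords<Integer, String> records = consumer.poll(100);
    count += records.count(); // number of records fetched in this poll

    long now = System.currentTimeMillis();
    if (now - windowStart >= 1000) { // report once per second
        System.out.printf("throughput: %d msg/s%n", count * 1000 / (now - windowStart));
        count = 0;
        windowStart = now;
    }
}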