I need to pull data from Kafka consumer to pass it on to my application. Below is the code that I have written to access the consumer:
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerGroup {
    public static void main(String[] args) throws Exception {
        String topic = "kafka_topic";
        String group = "0";
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", group);
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("session.timeout.ms", "30000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        consumer.subscribe(Arrays.asList(topic));
        System.out.println("Subscribed to topic: " + topic);

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
        }
    }
}
When I run this code, sometimes data comes through and sometimes nothing is returned. Why is this behavior inconsistent? Is there any issue with my code?
Your code is OK. You have auto-commit enabled, so after you read the records, their offsets are automatically committed to Kafka and stored in the internal __consumer_offsets topic. Every time you run the code, you start from the last committed offset, so you only read the new records that have arrived since the previous run. To see output constantly in the consumer app, you have to constantly put new records into your topic.
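If, for testing, you instead want to re-read the whole topic on every run, one option is a rebalance listener that seeks back to the beginning once partitions are assigned. This is only a minimal sketch, assuming the consumer and topic variables from the code above and a 0.10+ client:

import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

// Re-read the topic from the start on every run, ignoring committed offsets.
consumer.subscribe(Arrays.asList(topic), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // nothing to clean up in this sketch
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // Jump to the first available offset of each newly assigned partition.
        consumer.seekToBeginning(partitions);
    }
});

Alternatively, use a fresh, unique group.id per run so that no committed offsets exist yet.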
I have the following code to connect to Kafka
Properties props = new Properties();
props.put("bootstrap.servers", "myconfluentkafkabroker:9092");
props.put("group.id", "test");                       // overridden two lines below
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "my_CG");  // last write wins: the group is "my_CG"
props.put("group.instance.id", "my_instance_CG_id");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// Plain class-name strings avoid the checked exception from Class.forName
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("MyTopic"));

try {
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            // The original format string did not compile ("customer = %s" fell
            // outside the literal); String.format repairs it.
            log.debug(String.format("topic = %s, partition = %d, offset = %d, customer = %s, country = %s",
                    record.topic(), record.partition(), record.offset(),
                    record.key(), record.value()));
            int updatedCount = 1;
            if (custCountryMap.containsKey(record.value())) {   // was "countainsKey"
                updatedCount = custCountryMap.get(record.value()) + 1;
            }
            custCountryMap.put(record.value(), updatedCount);   // missing semicolon added
            JSONObject json = new JSONObject(custCountryMap);
            System.out.println(json.toString(4));
        }
    }
} finally {
    consumer.close();
}
The code didn't throw any errors, but I still don't see the consumer group listed.
Would this line be an issue?
props.put("group.instance.id", "my_instance_CG_id");
You should verify what you are seeing with the built-in tools that Kafka provides, like kafka-consumer-groups.sh.
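For example (assuming a broker at localhost:9092; adjust to your environment):

# List all consumer groups known to the cluster
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list

# Show members, partition assignments, and lag for one group
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my_CG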
You'll also need to actually poll messages and commit offsets, not just subscribe, before you will see anything.
Otherwise, that specific Control Center dashboard may require you to add the Monitoring Interceptors to your client.
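A sketch of that setting, assuming Confluent's monitoring-interceptors jar is on the classpath (the class name comes from the Confluent distribution, not from Apache Kafka itself):

// Hypothetical addition for Control Center monitoring
props.put("interceptor.classes",
        "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor");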
I am trying to run 2 consumers subscribed to 2 different topics. Both consumer programs run properly when run one at a time, but when I run them at the same time, one consumer always displays the exception:
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
I followed the suggestions and set max.poll.records to 2, session.timeout.ms to 30000, and heartbeat.interval.ms to 1000.
Below is my consumer function. The function is the same in both files; only the topic name changes to Test2 in the second one. I run these two functions in 2 different classes, both at the same time.
public void consume() {
    // Kafka consumer configuration settings
    List<String> topicNames = new ArrayList<String>();
    topicNames.add("Test1");
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "false");
    props.put("session.timeout.ms", "30000");
    props.put("heartbeat.interval.ms", "1000");
    props.put("max.poll.records", "2");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    consumer.subscribe(topicNames);
    try {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("Record: " + record.value());
                String responseString = "successfull";
                if (responseString.equals("successfull")) {
                    consumer.commitSync();
                }
            }
        }
    } catch (Exception e) {
        LOG.error("Exception: ", e);
    } finally {
        consumer.close();
    }
}
Due to this error, the offsets are not getting committed to Kafka.
How do I overcome this error?
In your case you need to assign a different group ID to each consumer, as sketched below. Creating two consumers with the same group ID is okay in itself, but having the two members of one group subscribe to two different topics is not: it keeps triggering rebalances between them.
You are able to run one consumer at a time because then only a single subscription is active in the group.
If you need any further help, let me know. Happy to help.
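A minimal sketch of that fix, reusing the consume() function from the question (the topic-derived group name is just an example):

// Hypothetical variant: derive a distinct group ID from the topic so each
// consumer forms its own group and the two never rebalance against each other.
public void consume(String topicName) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test-" + topicName);   // e.g. "test-Test1", "test-Test2"
    // ...remaining configuration and poll loop exactly as in the question...
}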
I have a topic with a single partition and two consumer processes that form a consumer group.
This way a message is always delivered to a single consumer. The StickyAssignor is used to prefer consumers that are already assigned to the partition on rebalance.
I've been playing with this setup a bit and discovered that under certain circumstances the messages are delivered to both consumers, which defeats the purpose of the consumer group.
The scenario is as follows:
Start consumer C1
C1 begins receiving messages
Start consumer C2
C2 doesn't receive any messages, thanks to the StickyAssignor strategy preferring C1
Freeze the C1 process (using a Java debugger, suspending all threads)
C2 takes over - begins receiving messages
Unfreeze C1 process
Now both C1 and C2 receive messages despite being in the same group
When using RangeAssignor/RoundRobinAssignor, this does not happen.
Am I missing something or is this a bug in Kafka?
Here's my consumer code:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("client.id", consumerId);
props.put("enable.auto.commit", "false");
props.put("auto.commit.interval.ms", "1000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("partition.assignment.strategy", StickyAssignor.class.getName());
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singleton("my-events"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records)
        System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
}
I am trying to run a Kafka consumer program in order to get messages from a topic named "test2".
I am using the Kafka 0.9 API.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("test2"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records)
        System.out.printf("offset = %d, key = %s, value = %s", record.offset(), record.key(), record.value());
}
This code is from the official documentation of the Kafka Consumer API 0.9.
The figure below clarifies the situation:
[screenshot: Eclipse console output]
Any suggestions for resolving this issue?
Thank you in advance.
Just add the slf4j jar to your build path and try again.
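For example, with Maven you could add an SLF4J binding such as slf4j-simple, which also pulls in slf4j-api transitively (the version shown is only an example from that era):

<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.21</version>
</dependency>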
I'm trying to do a simple demo with Kafka 0.10.0.0.
My producer is OK, but my consumer may not be correct; the code is below.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "group1");
props.put("enable.auto.commit", "false");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("topictest2"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (TopicPartition partition : records.partitions()) {
        List<ConsumerRecord<String, String>> partitionRecords = records.records(partition);
        for (ConsumerRecord<String, String> record : partitionRecords) {
            System.out.println("Thread = " + Thread.currentThread().getName() + " ");
            System.out.printf("partition = %d, offset = %d, key = %s, value = %s",
                    record.partition(), record.offset(), record.key(), record.value());
            System.out.println("\n");
        }
        // consumer.commitSync();
        long lastOffset = partitionRecords.get(partitionRecords.size() - 1).offset();
        consumer.commitSync(Collections.singletonMap(partition, new OffsetAndMetadata(lastOffset + 1)));
    }
}
But when I run this demo, there is no output! What is the problem in my code?
It looks valid. I think the program is just waiting for new messages, because the default for auto.offset.reset is latest.
If you have some messages in that topic and want to read them, try adding
props.put("auto.offset.reset", "earliest");
to start reading the topic from the beginning, and reset your group.id to something unique to make sure the consumer won't continue from a saved offset (or do not commit offsets at all). Once a committed offset exists for a group ID, auto.offset.reset is skipped:
props.put("group.id", "group."+UUID.randomUUID().toString());