I am using consumer.poll(Duration d) to fetch records. I have only 10 records, for testing purposes, in a Kafka topic spread across 6 partitions. I have disabled auto commit and I am not committing manually either (again, for testing purposes only). When poll is executed, it does not fetch data from all partitions; I need to run poll in a loop to get all the data. I have not overridden parameters such as max.poll.records or fetch.max.bytes from their default values. What could be the reason? Please note that this is the only consumer for the given topic and group id, so I expect all the partitions to be assigned to it.
private Consumer<String, Object> createConsumer() {
    ConsumerFactory<String, Object> consumerFactory = deadLetterConsumerFactory();
    Consumer<String, Object> consumer = consumerFactory.createConsumer();
    consumer.subscribe(Collections.singletonList(kafkaConfigProperties.getDeadLetterTopic()));
    return consumer;
}
try {
    consumer = createConsumer();
    ConsumerRecords<String, Object> records = consumer.poll(Duration.ofMillis(5000));
    processMessages(records, ...);
} catch (Exception e) {
    ....
} finally {
    if (consumer != null) {
        consumer.unsubscribe();
        consumer.close();
    }
}
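For reference, a single poll only returns whatever records the already-completed fetches happen to contain, so it may well cover only a subset of the partitions. One way to drain everything during testing is to keep polling until an empty batch comes back. A minimal sketch, assuming the createConsumer() helper above and the existing processMessages(...) handler:

Consumer<String, Object> consumer = createConsumer();
try {
    while (true) {
        // poll returns whatever is currently available (possibly from only some partitions)
        ConsumerRecords<String, Object> records = consumer.poll(Duration.ofSeconds(5));
        if (records.isEmpty()) {
            break; // nothing more arrived within the poll timeout
        }
        processMessages(records, ...); // existing handler from the snippet above
    }
} finally {
    consumer.unsubscribe();
    consumer.close();
}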
EDIT
Here are the details:
ConsumerFactory<String, Object> deadLetterConsumerFactory() {
    Properties properties = new Properties();
    properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, server);
    properties.put(SCHEMA_REGISTRY_URL, url);
    properties.put(ConsumerConfig.CLIENT_ID_CONFIG,
            "myid" + "-" + CONSUMER_CLIENT_ID_SEQUENCE.getAndIncrement());
    properties.put(SSL_ENDPOINT_IDFN_ALGM, alg);
    properties.put(SaslConfigs.SASL_MECHANISM, saslmech);
    properties.put(REQUEST_TIMEOUT, timeout);
    properties.put(SaslConfigs.SASL_JAAS_CONFIG, config);
    properties.put(SECURITY_PROTOCOL, protocol);
    properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    properties.put(ConsumerConfig.GROUP_ID_CONFIG, "groupid");
    Map<String, Object> map = new HashMap<>();
    properties.forEach((key, value) -> map.put((String) key, value));
    return new DefaultKafkaConsumerFactory<>(map);
}
Related
Our Spring Kafka consumer group runs on multiple instances, each instance with about 10 concurrent processes (i.e. multiple consumers).
The problem is that sometimes, perhaps due to stuck or long-running processing, some of the Kafka consumers get kicked out of the group, so the consumer group gets smaller and smaller until it becomes practically inoperative.
The primary symptom is, of course, frequent rebalancing and a shrinking consumer group. (Notice the very long max.poll.interval.ms; we lost hope at some point...)
To battle rebalancing, we use the CooperativeStickyAssignor and make sure our consumers have a static group.instance.id; we run in a StatefulSet on K8s (GKE Autopilot).
The question is: how do we get the consumers to re-join the group, or otherwise compensate for being kicked out of it?
Here's our configuration:
public Map<String, Object> getConsumerProps(String groupId, Class keyDeserializerClass, Class valueDeserializerClass) {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, keyDeserializerClass);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, valueDeserializerClass);
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaAddress);
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 30 * 60 * 1000);
    props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 5000);
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 29 * 60 * 1000);
    props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30 * 60 * 1000);
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 5);
    props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 3 * 60 * 1000);
    props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
            org.apache.kafka.clients.consumer.CooperativeStickyAssignor.class.getName());
    props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, serverManagementService.getPodId());
    LOGGER.info(String.format("Pod id is %s", serverManagementService.getPodId()));
    return props;
}
Consumer Factory
@Bean
public ConsumerFactory<String, String> externalResourceIndexerKafkaConsumerFactory() {
    Map<String, Object> props = kafkaTopicConfig.getConsumerProps("externalResourceIndexer", StringDeserializer.class, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(props);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> externalResourceIndexerKafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(externalResourceIndexerKafkaConsumerFactory());
    factory.setConcurrency(10);
    return factory;
}
The KafkaListener
@Override
@Transactional
@KafkaListener(
        id = "externalResourceIndexer",
        autoStartup = "false",
        topics = "externalResourceIndexer",
        containerFactory = "externalResourceIndexerKafkaListenerContainerFactory")
public void run(String payload) throws JsonProcessingException {
    // ...
}
Here's how we start the KafkaListener (yes, it's a bit of a hack)
// schedule to start kafka
@Override
@Scheduled(initialDelay = 1000 * 10, fixedDelay = Long.MAX_VALUE)
public synchronized void loadKafka() {
    kafkaListenerEndpointRegistry.getListenerContainer("externalResourceIndexer").start();
}
}
Currently the consumer is working as expected; we are not exactly sure why, as no configuration was changed. We will update the question once we have more information.
We are deploying a new Kafka pipeline to process data. We have two consumers listening to the same topic. One consumer is working as expected and the other is not.
The expectation is that only new messages should be processed by both listeners. Listener 1 reprocesses all messages again and again (not expected), while Listener 2 processes only the new messages (expected). We are not sure why this happens, as the configuration is similar for both listeners and they listen to the same topic.
Average processing time for Listener 1: 4-5 seconds.
Average processing time for Listener 2: 3-4 seconds.
Kafka consumer factory.
@Configuration
@EnableKafka
public class KafkaConsumerConfig {

    @Bean
    public ConsumerFactory<String, Object> consumerFactory() {
        Map<String, Object> configs = new HashMap<>();
        String bootstrap = System.getenv().getOrDefault("BOOTSTRAP_SERVERS_CONFIG", "10.99.2.135:9093,10.99.2.136:9093,10.99.2.134:9093");
        configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
        configs.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "60000");
        configs.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "60000");
        configs.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "20000");
        configs.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
        configs.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        //configs.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
        configs.put(ConsumerConfig.GROUP_ID_CONFIG, "update");
        return new DefaultKafkaConsumerFactory<>(configs, new StringDeserializer(), new JsonDeserializer<>(Object.class));
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<String, Object>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(5);
        factory.getContainerProperties().setPollTimeout(10000);
        return factory;
    }

    @Bean
    public ProducerFactory<String, String> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        String bootstrap = System.getenv().getOrDefault("BOOTSTRAP_SERVERS_CONFIG", "10.99.2.135:9093,10.99.2.136:9093,10.99.2.134:9093");
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        return new DefaultKafkaProducerFactory<>(props);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerConfigs());
    }
}
Listener 1 reprocesses all messages after some time.
@KafkaListener(groupId = "updateCache", topics = "omp-orderBuffApi", containerFactory = "kafkaListenerContainerFactory")
public void consume(Object object) {
    long time = System.currentTimeMillis();
    logger.info("Start Time of consumer for Details Cache: {}", time);
    ConsumerRecord<String, Object> consumerRecord = (ConsumerRecord<String, Object>) object;
    LinkedHashMap<String, Object> map = (LinkedHashMap<String, Object>) consumerRecord.value();
    logger.info("Payload for Details Cache : {}", map.get("payload"));
    Map<String, Object> payload = (Map<String, Object>) map.get("payload");
    String order = (String) payload.get("ORDERITEMNUMBER");
    logger.info("OrderItemNumber : {}", order);
    long atime = System.currentTimeMillis();
    logger.info("Start Time of api - Details Cache: {}", atime);
    solrController.updateOrderDetailsCache1(payload);
    float atimex = (System.currentTimeMillis() - atime) / 1000F;
    logger.info("End Time of api - Details Cache: {}", atimex);
    float timex = (System.currentTimeMillis() - time) / 1000F;
    logger.info("End Time of consumer for Details Cache: {}", timex);
    logger.info("***************************************************");
    countDownLatch0.countDown();
}
Listener 2 works as expected and does not reprocess any messages.
@KafkaListener(groupId = "updateSolr", topics = "omp-orderBuffApi", containerFactory = "kafkaListenerContainerFactory")
public void consumerForSolr(Object object) {
    long time = System.currentTimeMillis();
    logger.info("Start Time of consumer for Solr Details Cache: {}", time);
    ConsumerRecord<String, Object> consumerRecord = (ConsumerRecord<String, Object>) object;
    LinkedHashMap<String, Object> map = (LinkedHashMap<String, Object>) consumerRecord.value();
    Map<String, Object> payload = (Map<String, Object>) map.get("payload");
    logger.info("payload for Solr Details : {} ", payload);
    String order = (String) payload.get("ORDERITEMNUMBER");
    logger.info("OrderItemNumber Solr Details Cache: {} ", order);
    long atime = System.currentTimeMillis();
    logger.info("Start Time of api - Solr Details Cache: {}", atime);
    try {
        solrController.updateSolrDate(payload);
    } catch (SolrServerException | IOException | ParseException e) {
        e.printStackTrace();
    }
    float atimex = (System.currentTimeMillis() - atime) / 1000F;
    logger.info("End Time of api - Solr Details Cache: {}", atimex);
    float timex = (System.currentTimeMillis() - time) / 1000F;
    logger.info("End Time of consumer - Solr Details Cache: {}", timex);
    logger.info("***************************************************");
    countDownLatch3.countDown();
}
I use the following code to create a producer which produces around 2000 messages.
public class ProducerDemoWithCallback {

    public static void main(String[] args) {
        final Logger logger = LoggerFactory.getLogger(ProducerDemoWithCallback.class);

        String bootstrapServers = "localhost:9092";
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // create the producer
        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(properties);

        for (int i = 0; i < 2000; i++) {
            // create a producer record
            ProducerRecord<String, String> record =
                    new ProducerRecord<String, String>("TwitterProducer", "Hello World " + Integer.toString(i));
            // send data - asynchronous
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    // executes every time a record is successfully sent or an exception is thrown
                    if (e == null) {
                        // the record was successfully sent
                        logger.info("Received new metadata. \n" +
                                "Topic:" + recordMetadata.topic() + "\n" +
                                "Partition: " + recordMetadata.partition() + "\n" +
                                "Offset: " + recordMetadata.offset() + "\n" +
                                "Timestamp: " + recordMetadata.timestamp());
                    } else {
                        logger.error("Error while producing", e);
                    }
                }
            });
        }

        // flush data
        producer.flush();
        // flush and close producer
        producer.close();
    }
}
I want to count those messages and get an int value.
The following command works, but I am trying to get this count using code:
"bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic TwitterProducer --time -1"
and the result is
- TwitterProducer:0:2000
My code to do the same programmatically looks something like this, but I'm not sure if this is the correct way to get the count:
int valueCount = (int) recordMetadata.offset();
System.out.println("Offset value " + valueCount);
Can someone help me get the count of Kafka messages from the offset value using code?
You can have a look at the implementation details of GetOffsetShell.
Here is simplified code rewritten in Java:
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.util.*;
import java.util.stream.Collectors;

public class GetOffsetCommand {

    private static final Set<String> TopicNames = new HashSet<>();

    static {
        TopicNames.add("my-topic");
        TopicNames.add("not-my-topic");
    }

    public static void main(String[] args) {
        TopicNames.forEach(topicName -> {
            final Map<TopicPartition, Long> offsets = getOffsets(topicName);
            new ArrayList<>(offsets.entrySet()).forEach(System.out::println);
            System.out.println(topicName + ":" + offsets.values().stream().reduce(0L, Long::sum));
        });
    }

    private static Map<TopicPartition, Long> getOffsets(String topicName) {
        final KafkaConsumer<String, String> consumer = makeKafkaConsumer();
        final List<TopicPartition> partitions = listTopicPartitions(consumer, topicName);
        return consumer.endOffsets(partitions);
    }

    private static KafkaConsumer<String, String> makeKafkaConsumer() {
        final Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "get-offset-command");
        return new KafkaConsumer<>(props);
    }

    private static List<TopicPartition> listTopicPartitions(KafkaConsumer<String, String> consumer, String topicName) {
        return consumer.listTopics().entrySet().stream()
                .filter(t -> topicName.equals(t.getKey()))
                .flatMap(t -> t.getValue().stream())
                .map(p -> new TopicPartition(p.topic(), p.partition()))
                .collect(Collectors.toList());
    }
}
which prints the offset for each of the topic's partitions and their sum (the total number of messages), like:
my-topic-0=184
my-topic-2=187
my-topic-4=189
my-topic-1=196
my-topic-3=243
my-topic:999
Why do you want to get that value? If you share more detail about the purpose, I can give you a better tip.
Regarding your last question: using the offset value is not the correct way to get the count of messages. If your topic has one partition and a single producer, you can use it, but you need to consider that a topic usually has several partitions.
If you want to get the number of messages sent by each producer, you can count them in the onCompletion() callback, for example as sketched below.
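A minimal sketch of that approach, counting acknowledged sends with an AtomicInteger and reusing the producer and logger from the question's code (the counter name is illustrative):

// import java.util.concurrent.atomic.AtomicInteger;
AtomicInteger successCount = new AtomicInteger();
for (int i = 0; i < 2000; i++) {
    ProducerRecord<String, String> record =
            new ProducerRecord<>("TwitterProducer", "Hello World " + i);
    producer.send(record, (recordMetadata, e) -> {
        if (e == null) {
            successCount.incrementAndGet(); // record acknowledged by the broker
        } else {
            logger.error("Error while producing", e);
        }
    });
}
producer.flush();
logger.info("Successfully produced {} messages", successCount.get());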
Alternatively, you can get the last offset using the Consumer client, like this:
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "your-brokers");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

Consumer<String, String> consumer = new KafkaConsumer<>(props);
// assignment() stays empty until partitions are actually assigned, so assign them explicitly
List<TopicPartition> partitions = consumer.partitionsFor("topic_name").stream()
        .map(p -> new TopicPartition(p.topic(), p.partition()))
        .collect(Collectors.toList());
consumer.assign(partitions);
consumer.seekToEnd(partitions);
for (TopicPartition tp : partitions) {
    long offsetPosition = consumer.position(tp);
}
I have a consume-transform-produce workflow in a microservice using Spring (Boot) Kafka. I need to achieve the exactly-once semantics provided by Kafka transactions.
Here are the relevant code snippets:
Config
@Bean
public ProducerFactory<String, String> producerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
    props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 1024 * 1024);
    DefaultKafkaProducerFactory<String, String> defaultKafkaProducerFactory = new DefaultKafkaProducerFactory<>(props);
    defaultKafkaProducerFactory.setTransactionIdPrefix("kafka-trx-");
    return defaultKafkaProducerFactory;
}

@Bean
public ConsumerFactory<String, String> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 5000);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
    return new DefaultKafkaConsumerFactory<>(props);
}

@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
    return new KafkaTemplate<>(producerFactory());
}

@Bean
public KafkaTransactionManager<String, String> kafkaTransactionManager() {
    return new KafkaTransactionManager<>(producerFactory());
}

@Bean
@Qualifier("chainedKafkaTransactionManager")
public ChainedKafkaTransactionManager<String, Object> chainedKafkaTransactionManager(KafkaTransactionManager<String, String> kafkaTransactionManager) {
    return new ChainedKafkaTransactionManager<>(kafkaTransactionManager);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<?, ?> concurrentKafkaListenerContainerFactory(ChainedKafkaTransactionManager<String, Object> chainedKafkaTransactionManager) {
    ConcurrentKafkaListenerContainerFactory<String, String> concurrentKafkaListenerContainerFactory = new ConcurrentKafkaListenerContainerFactory<>();
    concurrentKafkaListenerContainerFactory.setConsumerFactory(consumerFactory());
    concurrentKafkaListenerContainerFactory.setBatchListener(true);
    concurrentKafkaListenerContainerFactory.setConcurrency(nexusConsumerConcurrency);
    //concurrentKafkaListenerContainerFactory.setReplyTemplate(kafkaTemplate());
    concurrentKafkaListenerContainerFactory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.BATCH);
    concurrentKafkaListenerContainerFactory.getContainerProperties().setTransactionManager(chainedKafkaTransactionManager);
    return concurrentKafkaListenerContainerFactory;
}
Listener
@KafkaListener(topics = "${kafka.xxx.consumerTopic}", groupId = "${kafka.xxx.consumerGroup}", containerFactory = "concurrentKafkaListenerContainerFactory")
public void listen(@Payload List<String> msgs, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions, @Header(KafkaHeaders.OFFSET) List<Integer> offsets) {
    int i = -1;
    for (String msg : msgs) {
        ++i;
        LOGGER.debug("partition={}; offset={}; msg={}", partitions.get(i), offsets.get(i), msg);
        String json = transform(msg);
        kafkaTemplate.executeInTransaction(kt -> kt.send(producerTopic, json));
    }
}
However, in the production environment I encounter a weird problem: the offset is increased by two per message sent by the producer, and the consumer doesn't commit the consumed offset.
Consumer Offsets from topic1
Topic1 consumer detail
Produce to topic2
However, the count of messages sent by the producer is the same as the count consumed. The downstream consumer of the producer's topic receives the messages from topic2 continuously. No error or exception is found in the log.
I wonder why the consume-transform-produce workflow seems OK (exactly-once semantics appear to be guaranteed), yet the consumed offset isn't committed and the produced message offset increases by two instead of one per single message.
How can I fix it? Thanks!
That's the way it's designed. Kafka logs are immutable so an extra "slot" is used at the end of the transaction to indicate whether the transaction was committed or rolled back. This allows consumers with read_committed isolation level to skip over rolled-back transactions.
If you publish 10 records in a transaction, you will see the offset increase by 11. If you only publish one, it will increase by two.
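As a quick illustration (a hedged sketch, not taken from the question: broker address, topic name, and transactional id are assumptions), committing a transaction containing a single record should advance the partition's end offset by 2 — one data record plus one transaction marker:

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Collections;
import java.util.Properties;

public class TxOffsetDemo {

    public static void main(String[] args) {
        TopicPartition tp = new TopicPartition("topic2", 0); // assumed single-partition topic

        Properties consumerProps = new Properties();
        consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producerProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "offset-demo-tx");

        try (KafkaConsumer<String, String> probe = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            long before = probe.endOffsets(Collections.singleton(tp)).get(tp);

            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>(tp.topic(), "one message"));
            producer.commitTransaction();

            long after = probe.endOffsets(Collections.singleton(tp)).get(tp);
            // Expect after - before == 2: one data record plus one transaction (commit) marker
            System.out.println("End offset advanced by " + (after - before));
        }
    }
}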
If you want the publish to participate in the consumer-started transaction (for exactly-once), you should not be using executeInTransaction; that will start a new transaction.
/**
 * Execute some arbitrary operation(s) on the operations and return the result.
 * The operations are invoked within a local transaction and do not participate
 * in a global transaction (if present).
 * @param callback the callback.
 * @param <T> the result type.
 * @return the result.
 * @since 1.1
 */
<T> T executeInTransaction(OperationsCallback<K, V, T> callback);
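For illustration, a minimal sketch of the listener sending with the template directly, so that the publish participates in the container-started transaction (it reuses kafkaTemplate, transform() and producerTopic from the question's code):

@KafkaListener(topics = "${kafka.xxx.consumerTopic}", groupId = "${kafka.xxx.consumerGroup}", containerFactory = "concurrentKafkaListenerContainerFactory")
public void listen(@Payload List<String> msgs) {
    for (String msg : msgs) {
        String json = transform(msg);
        // send() joins the transaction started by the listener container; the container
        // sends the consumed offsets to that same transaction when the listener exits
        kafkaTemplate.send(producerTopic, json);
    }
}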
I don't see why the consumer offset would not still be sent to the consumer-started transaction, though. You should turn on DEBUG logging to see what's happening (if it still happens after you fix the template code).
EDIT
The consumed offset (+1) is sent to the transaction by the listener container when the listener exits; turn on commit logging and you will see it...
@SpringBootApplication
public class So59152915Application {

    public static void main(String[] args) {
        SpringApplication.run(So59152915Application.class, args);
    }

    @Autowired
    private KafkaTemplate<String, String> template;

    @KafkaListener(id = "foo", topics = "so59152915-1", clientIdPrefix = "so59152915")
    public void listen1(String in, @Header(KafkaHeaders.OFFSET) long offset) throws InterruptedException {
        System.out.println(in + "#" + offset);
        this.template.send("so59152915-2", in.toUpperCase());
        Thread.sleep(2000);
    }

    @KafkaListener(id = "bar", topics = "so59152915-2")
    public void listen2(String in) {
        System.out.println(in);
    }

    @Bean
    public NewTopic topic1() {
        return new NewTopic("so59152915-1", 1, (short) 1);
    }

    @Bean
    public NewTopic topic2() {
        return new NewTopic("so59152915-2", 1, (short) 1);
    }

    @Bean
    public ApplicationRunner runner(KafkaListenerEndpointRegistry registry) {
        return args -> {
            this.template.executeInTransaction(t -> {
                IntStream.range(0, 11).forEach(i -> t.send("so59152915-1", "foo" + i));
                try {
                    System.out.println("Hit enter to commit sends");
                    System.in.read();
                }
                catch (IOException e) {
                    e.printStackTrace();
                }
                return null;
            });
        };
    }
}

@Component
class Configurer {

    Configurer(ConcurrentKafkaListenerContainerFactory<?, ?> factory) {
        factory.getContainerProperties().setCommitLogLevel(Level.INFO);
    }
}
and
spring.kafka.producer.transaction-id-prefix=tx-
spring.kafka.consumer.properties.isolation.level=read_committed
spring.kafka.consumer.auto-offset-reset=earliest
and
foo0#56
2019-12-04 10:07:18.551 INFO 55430 --- [ foo-0-C-1] essageListenerContainer$ListenerConsumer : Sending offsets to transaction: {so59152915-1-0=OffsetAndMetadata{offset=57, leaderEpoch=null, metadata=''}}
foo1#57
FOO0
2019-12-04 10:07:18.558 INFO 55430 --- [ bar-0-C-1] essageListenerContainer$ListenerConsumer : Sending offsets to transaction: {so59152915-2-0=OffsetAndMetadata{offset=63, leaderEpoch=null, metadata=''}}
2019-12-04 10:07:20.562 INFO 55430 --- [ foo-0-C-1] essageListenerContainer$ListenerConsumer : Sending offsets to transaction: {so59152915-1-0=OffsetAndMetadata{offset=58, leaderEpoch=null, metadata=''}}
foo2#58
Please pay attention to your auto-commit setup. As I can see, you set it to false:
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
so in this situation you need to commit the offsets "manually" or set auto commit to true (for example, as sketched below).
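A minimal sketch of the manual route with Spring Kafka (names are illustrative; it assumes the container factory's ack mode is set to MANUAL, an enum that lives on ContainerProperties or AbstractMessageListenerContainer depending on the Spring Kafka version):

// Container factory side (illustrative): enable manual acknowledgment
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);

// Listener side: acknowledge once the record has been processed successfully
@KafkaListener(topics = "some-topic", groupId = "some-group")
public void listen(String payload, Acknowledgment ack) {
    process(payload);  // hypothetical processing step
    ack.acknowledge(); // commits the offset for this record/batch
}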
I'm a bit confused about the poll() behaviour of (Spring) Kafka after/when stopping the ConcurrentMessageListenerContainer.
What I want to achieve:
Stop the consumer after an exception is raised (for example, a message could not be saved to the database), do not commit the offset, restart the consumer after a given time, and start processing again from the previously failed message.
I read this issue (https://github.com/spring-projects/spring-kafka/issues/451), which says that the container will call the listener with the remaining records from the poll. That means that after the failed message, a later message may still be processed successfully and commit its offset, skipping the failed one. This could end up in lost/skipped messages.
Is this really the case, and if so, is there a solution without upgrading to newer versions? (A DLQ is not a solution for my case.)
What I already did:
Set the error handler via setErrorHandler() and setAckOnError(false).
private Map<String, Object> getConsumerProps(CustomKafkaProps kafkaProps, Class keyDeserializer) {
    Map<String, Object> props = new HashMap<>();
    // Set common props
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaProps.getBootstrapServers());
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, kafkaProps.getConsumerGroupId());
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // start with the first message when a new consumer group (app) arrives at the topic
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // we will use "RECORD" AckMode in the Spring listener container
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, keyDeserializer);

    if (kafkaProps.isSslEnabled()) {
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        props.put("ssl.keystore.location", kafkaProps.getKafkaKeystoreLocation());
        props.put("ssl.keystore.password", kafkaProps.getKafkaKeystorePassword());
        props.put("ssl.key.password", kafkaProps.getKafkaKeyPassword());
    }
    return props;
}
Consumer
public ConcurrentMessageListenerContainer<String, byte[]> kafkaReceiverContainer(CustomKafkaProps kafkaProps) throws Exception {
    StoppingErrorHandler stoppingErrorHandler = new StoppingErrorHandler();

    ContainerProperties containerProperties = new ContainerProperties(...);
    containerProperties.setAckMode(AbstractMessageListenerContainer.AckMode.RECORD);
    containerProperties.setAckOnError(false);
    containerProperties.setErrorHandler(stoppingErrorHandler);

    ConcurrentMessageListenerContainer<String, byte[]> container = ...
    container.setConcurrency(1); // use only one container
    stoppingErrorHandler.setConcurrentMessageListenerContainer(container);
    return container;
}
Error Handler
public class StoppingErrorHandler implements ErrorHandler {

    @Setter
    private ConcurrentMessageListenerContainer concurrentMessageListenerContainer;

    @Value("${backends.kafka.consumer.halt.timeout}")
    int consumerHaltTimeout;

    @Override
    public void handle(Exception thrownException, ConsumerRecord<?, ?> record) {
        if (concurrentMessageListenerContainer != null) {
            concurrentMessageListenerContainer.stop();
        }
        new Timer().schedule(new TimerTask() {
            @Override
            public void run() {
                if (concurrentMessageListenerContainer != null && !concurrentMessageListenerContainer.isRunning()) {
                    concurrentMessageListenerContainer.start();
                }
            }
        }, consumerHaltTimeout);
    }
}
What I'm using:
<groupId>org.springframework.integration</groupId>
<artifactId>spring-integration-kafka</artifactId>
<version>2.1.2.RELEASE</version>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
<version>1.1.7.RELEASE</version>
without upgrading the newer versions?
2.1 introduced the ContainerStoppingErrorHandler, which is a ContainerAwareErrorHandler; the remaining unconsumed messages are discarded (and will be re-fetched when the container is restarted).
With earlier versions, your listener will need to reject (fail) the remaining messages in the batch (or set max.poll.records=1).
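For reference, a minimal sketch of wiring the 2.1+ approach (the topic name is illustrative; the delayed restart you already schedule elsewhere is omitted here):

// With spring-kafka 2.1+, let the framework stop the container on a listener error;
// records remaining from the current poll are discarded and re-fetched after a restart.
ContainerProperties containerProperties = new ContainerProperties("some-topic");
containerProperties.setAckMode(AbstractMessageListenerContainer.AckMode.RECORD);
containerProperties.setErrorHandler(new ContainerStoppingErrorHandler());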