Kafka consumer in test only works with "auto.offset.reset"="earliest" - java

I'm struggling to understand my Kafka consumer behaviours in some integration tests.
I have a Spring Boot service which uses a default, autowired KafkaTemplate<String, String> to produce messages to a topic. In my integration tests, I create a KafkaConsumer in each test:
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
        Map.of(
                ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, KAFKA_CONTAINER.getBootstrapServers(),
                ConsumerConfig.GROUP_ID_CONFIG, "test-consumer-group-" + UUID.randomUUID(),
                ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, UUID.randomUUID().toString()),
        new StringDeserializer(),
        new StringDeserializer());
consumer.subscribe(topics);
return consumer;
with the intent of having a test flow that looks something like:
Create a new consumer for the topics we're testing
Perform action under test which sends messages to some topics
Poll the topics of interest and verify the messages are there
Close consumer
My expectation was that, since the default behaviour of a new consumer is to have auto.offset.reset set to latest, I would only get messages sent after the consumer was created, which is fine for this case. However, my consumer never receives any messages! I have to set the consumer to earliest, but this is problematic since I don't want messages created by other tests interfering.
The messages don't have any kind of unique identifier on them, which makes consuming the entire topic each time a tricky proposition in terms of test verifications.
I've tried various permutations of auto committing, polling before running the test but after subscribing, and manual syncs, but nothing seems to work. How can I manage my test lifecycle as described above (or is it not possible)?
The Kafka instance is managed using Testcontainers, in case that's relevant.
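For reference, this is roughly what the "poll after subscribing, before the test action" attempt looks like; a sketch only, with the helper name and timeout as placeholders:
// Sketch: poll() drives the group join, so loop until partitions are actually
// assigned before triggering the action under test (requires java.time.Duration).
private void waitForAssignment(KafkaConsumer<String, String> consumer) {
    long deadline = System.currentTimeMillis() + 10_000; // arbitrary 10s timeout
    while (consumer.assignment().isEmpty() && System.currentTimeMillis() < deadline) {
        consumer.poll(Duration.ofMillis(100));
    }
    if (consumer.assignment().isEmpty()) {
        throw new IllegalStateException("consumer never got a partition assignment");
    }
}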

Related

RabbitMQ's receiveAndConvert for clustered environment

I have this method implemented in a Spring Boot application:
@Scheduled(fixedDelay = 5000)
public void pullMessage() {
    MessageDTO message = null;
    try {
        message = rabbitTemplate.receiveAndConvert(
                properties.getQueueName(), new ParameterizedTypeReference<MessageDTO>() {});
        // more code here...
    }
Every 5 seconds I'm pulling a message from RabbitMQ and processing it. The application is running on Kubernetes, and right now I have to duplicate the pod. In this scenario, could the two pods pull the same message?
If the queue is the same for all the instances, then no: only one consumer takes a given message from a queue. That is the fundamental purpose of the queue pattern.
See AMQP docs for publish-subscribe patterns: https://www.rabbitmq.com/tutorials/tutorial-three-java.html
No, only a single instance will process the message at a time; the whole purpose of having multiple consumers is to avoid downtime for the application.
Refer to the official RabbitMQ documentation for more clarification:
https://www.rabbitmq.com/tutorials/tutorial-one-java.html

Should I create NewTopics in each service (Spring Kafka)?

I'm using Kafka for sending messages between services. I use a NewTopic bean for configuring the number of partitions, for example:
@Bean
fun kafkaTopic(kafkaProperties: KafkaProperties): NewTopic = NewTopic(
    kafkaProperties.topics.schedulerCalculationTopic.name,
    kafkaProperties.topics.schedulerCalculationTopic.partitions,
    1.toShort() // replication factor
)
My question is simple: should I add this bean to both the consumer service and the producer service, or only to one of them?
I would put it in the producer service and then consider the producer as 'owner' of those topics.
But it gets a bit more complicated if you have a scenario where several producers write to the same topic(s).
If you are not creating the topic on the fly, the best practice is to create the topic before reading from or writing to it.
The rationale is to prevent brokers from auto-creating the topic whenever they receive a metadata fetch request or a consume request with that topic name. Otherwise, if the consumer starts before the producer, you might end up with the wrong number of partitions (the broker will create your topic with its default number-of-partitions setting).
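As a sketch of that practice in Java (using Spring Kafka's TopicBuilder, which the original Kotlin snippet doesn't use; the topic name and counts below are placeholders), the "owning" service can declare the topic so it is created with the intended partition count at startup:
import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class TopicConfig {

    // Declared by the owning (producer) service so the topic exists with the
    // intended partition count before anything reads from or writes to it.
    @Bean
    public NewTopic schedulerCalculationTopic() {
        return TopicBuilder.name("scheduler-calculation") // placeholder topic name
                .partitions(3)                            // explicit, not the broker default
                .replicas(1)
                .build();
    }
}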

Difference between KafkaTemplate and KafkaProducer send method?

My question is: in a Spring Boot microservice using Kafka, which is appropriate to use, KafkaTemplate.send() or KafkaProducer.send()?
I have used KafkaConsumer and not KafkaListener to poll the records, because KafkaListener fetches records as soon as they arrive on the topics, whereas I wanted the records to be polled periodically based on business needs.
Have gone through the documentation of KafkaProducer https://kafka.apache.org/10/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html
and Spring KafkaTemplate
https://docs.spring.io/spring-kafka/reference/html/#kafka-template
I am unable to decide which one is ideal to use; at the very least, the reason for using one over the other is unclear to me.
What I need is for the operation to be synchronous, i.e. I want to know whether the publish succeeded or not, because if the record is not delivered I need to retry publishing.
Any help will be appreciated.
For your first question, which one should you use, KafkaTemplate or KafkaProducer?
KafkaProducer is defined in Apache Kafka. KafkaTemplate is Spring's implementation on top of it (although it does not implement Producer directly), and so it provides more convenience methods for you to use.
Read this link:
What is the difference between Kafka Template and kafka producer?
For the retry mechanism in case of a failure in publishing, I have answered this in another question.
The acks parameter controls how many partition replicas must receive the record before the producer can consider the write successful. There are 3 values for the acks parameter:
acks=0: the producer will not wait for a reply from the broker before assuming the message was sent successfully.
acks=1: the producer will receive a successful response from the broker the moment the leader replica has received the message. If the message can't be written to the leader, the producer will receive an error response and can retry.
acks=all: the producer will receive a successful response from the broker once all in-sync replicas have received the message.
Best way to configure retries in Kafka Producer
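A minimal sketch of the synchronous publish described above, assuming a Spring Boot auto-configured KafkaTemplate (topic name, payload type, and timeout are placeholders); blocking on the returned future is what tells you whether delivery succeeded, and a thrown exception is the cue to retry:
import java.util.concurrent.TimeUnit;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;

public class SyncPublisher {

    private final KafkaTemplate<String, String> template;

    public SyncPublisher(KafkaTemplate<String, String> template) {
        this.template = template;
    }

    // Blocks until the broker acknowledges the record (per the acks setting)
    // or the timeout expires; a thrown exception means the publish failed.
    public long publishSync(String topic, String payload) {
        try {
            SendResult<String, String> result =
                    template.send(topic, payload).get(10, TimeUnit.SECONDS);
            return result.getRecordMetadata().offset();
        } catch (Exception e) {
            throw new IllegalStateException("publish failed, retry needed", e);
        }
    }
}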

Eagerly connect to topic as a Kafka Producer

I'm implementing a service which sends messages to a downstream service via a Kafka topic. This will only happen when my service's API is called, which is likely to be infrequently, at least at first.
I've found that the reactive Kafka producer API connects to Kafka lazily, which I'm sure is great for most use-cases, but I'd like to know that my Kafka connection configuration is correct at the point I deploy my service. I don't want to have to wait for the first API call only to find out that something somewhere is wrong.
The solution I've got for this at the moment is simply to send an initialisation message to the topic at start-up, but this feels clunky. Is there anything better I can be doing to force an initial connection to the topic, or at least to validate the connection configuration?
val response = kafkaSender.send(Mono.just(initialise))
        .next()
        .block();
if (response == null) {
    throw new RuntimeException("empty Mono from Kafka initialisation");
}
if (response.exception() != null) {
    throw propagate(response.exception());
}
You can use the AdminClient API to describe the cluster or the topic you're going to use, if you prefer not to send heartbeat messages via a producer.
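A sketch of that suggestion with the plain Kafka AdminClient (bootstrap servers, topic name, and timeout are placeholders): describing the topic at start-up fails fast if the connection configuration is wrong or the topic doesn't exist.
import java.util.List;
import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

void verifyKafkaAtStartup() throws Exception {
    Properties props = new Properties();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
    try (AdminClient admin = AdminClient.create(props)) {
        // Throws if the broker is unreachable or the topic does not exist
        admin.describeTopics(List.of("downstream-topic")).all().get(10, TimeUnit.SECONDS);
    }
}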

Spring Kafka - Subscribe new topics during runtime

I'm using the @KafkaListener annotation to consume topics in my application. My issue is that if I create a new topic in Kafka while my consumer is already running, the consumer will not pick up the new topic, even if it matches the topicPattern I'm using. Is there a way to "refresh" the subscribed topics periodically, so that new topics are picked up and assigned to my running consumers?
I'm using Spring Kafka 1.2.2 with Kafka 0.10.2.0.
Regards
You can't dynamically add topics at runtime; you have to stop/start the container to start listening to new topics.
You can @Autowire the KafkaListenerEndpointRegistry and stop/start listeners by id.
You can also stop/start all listeners by calling stop()/start() on the registry itself.
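A sketch of that registry approach (the listener id "myListener" is a placeholder and must match the id set on the corresponding @KafkaListener):
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.config.KafkaListenerEndpointRegistry;
import org.springframework.kafka.listener.MessageListenerContainer;

@Autowired
private KafkaListenerEndpointRegistry registry;

public void restartListener() {
    MessageListenerContainer container = registry.getListenerContainer("myListener");
    container.stop();
    // Restarting re-subscribes, so topics that now match the pattern are picked up
    container.start();
}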
Actually it is possible.
It worked for me with Kafka 1.1.1.
Under the hood, Spring uses consumer.subscribe(topicPattern), so it then entirely depends on the Kafka client library whether messages from new topics will be seen by the consumer.
There is a consumer config property called metadata.max.age.ms, which is 5 minutes by default. It controls how often the client goes to the broker for metadata updates, which means new topics will not be seen by the consumer for up to 5 minutes. You can decrease this value (e.g. to 20 seconds) and you should see the KafkaListener start to pick up messages from new topics more quickly.
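A sketch of lowering that property on a Spring Kafka consumer factory (bootstrap servers and group id are placeholders; 20 seconds is just an example value):
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

Map<String, Object> config = new HashMap<>();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
config.put(ConsumerConfig.GROUP_ID_CONFIG, "pattern-group");           // placeholder
// Refresh topic metadata every 20s so new topics matching the pattern are seen sooner
config.put(ConsumerConfig.METADATA_MAX_AGE_CONFIG, "20000");
DefaultKafkaConsumerFactory<String, String> consumerFactory =
        new DefaultKafkaConsumerFactory<>(config, new StringDeserializer(), new StringDeserializer());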
The following way works well for me.
ContainerProperties containerProps = new ContainerProperties("topic1", "topic2");
containerProps.setMessageListener(new MessageListener<Integer, String>() {
    @Override
    public void onMessage(ConsumerRecord<Integer, String> message) {
        logger.info("received: " + message);
    }
});
KafkaMessageListenerContainer<Integer, String> container = createContainer(containerProps);
container.setBeanName("testAuto");
container.start();
ref: http://docs.spring.io/spring-kafka/docs/1.0.0.RC1/reference/htmlsingle/
In a practical application, I use a ConcurrentMessageListenerContainer instead of the single-threaded KafkaMessageListenerContainer.
