I have deployed two instances of an application. Both instances run the same code and consume from the same topic.
@KafkaListener(offsetReset = OffsetReset.EARLIEST, offsetStrategy = OffsetStrategy.DISABLED)
public class AppConsumer implements ConsumerRebalanceListener, KafkaConsumerAware {
    @Topic("topic")
    public void consumeAppInfo(@KafkaKey String name, @Body @Nullable String someString) {
        ...
    }
}
My problem is that only one of the instances consumes each message. The topic has only one partition, partition 0, which I believe is the default.
I have tried adding a group ID and threads to the KafkaListener. This sometimes works and sometimes does not.
@KafkaListener(groupId="myGroup", threads=10)
What is the simplest solution to getting both applications to consume the same message?
You could skip the shared group and give each application its own consumer ID; each consumer then consumes all messages (unless the consumers are also assigned to the same group).
Groups are used for parallel processing of messages: each consumer in a group is assigned partitions and processes only the messages from those partitions.
More info => difference between groupid and consumerid in Kafka consumer
In Kafka, each partition is assigned to only one consumer within a consumer group. If you want different applications to consume the data from the same partition, you need to specify a different consumer group for each application or instance.
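A minimal sketch of the "different consumer group per instance" idea, assuming the consumer settings are built as plain java.util.Properties. The keys bootstrap.servers, group.id and auto.offset.reset are standard Kafka consumer settings; the class name, the broker address localhost:9092 and the instance names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer settings with a per-instance group ID.
final class PerInstanceGroupConfig {

    static Properties consumerProps(String baseGroup, String instanceName) {
        Properties props = new Properties();
        // Assumed broker address; replace with the real bootstrap servers.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // A distinct group per instance, so each instance receives every message.
        props.setProperty("group.id", baseGroup + "-" + instanceName);
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        // Two instances of the same application end up in two different groups.
        System.out.println(consumerProps("myGroup", "instance-a").getProperty("group.id"));
        System.out.println(consumerProps("myGroup", "instance-b").getProperty("group.id"));
    }
}
```

The instance name could come from an environment variable or host name that stays the same across restarts, so that a restarted instance rejoins its old group instead of creating a new one.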
Related
I have this method implemented in a SpringBoot application
@Scheduled(fixedDelay = 5000)
public void pullMessage() {
    MessageDTO message = null;
    try {
        message = rabbitTemplate.receiveAndConvert(properties.getQueueName(), new ParameterizedTypeReference<MessageDTO>() {});
        // more code here...
    }
Every 5 seconds I pull a message from RabbitMQ and do some processing with it. The application runs on Kubernetes, and I now need to run a second copy of the pod. In this scenario, could the two pods pull the same message?
If the queue is the same for all the instances, then no: only one consumer takes a given message from a queue. That is the fundamental purpose of the queue pattern.
See AMQP docs for publish-subscribe patterns: https://www.rabbitmq.com/tutorials/tutorial-three-java.html
No, only a single instance will process a given message at a time. The purpose of having multiple consumers is to avoid downtime for the application.
Refer to the official RabbitMQ documentation for more details:
https://www.rabbitmq.com/tutorials/tutorial-one-java.html
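The behaviour both answers describe can be illustrated with a toy in-memory model. This is plain Java, not RabbitMQ client code, and the class and variable names are made up for illustration: two consumers take turns polling one shared queue, and because polling removes the message, no message is ever seen by both.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy in-memory model of competing consumers; this is not RabbitMQ client code.
final class CompetingConsumersModel {

    // Two consumers take turns polling the same queue; returns what each one received.
    static List<List<String>> drain(List<String> messages) {
        Queue<String> queue = new ArrayDeque<>(messages);
        List<String> podA = new ArrayList<>();
        List<String> podB = new ArrayList<>();
        boolean turnA = true;
        while (!queue.isEmpty()) {
            // poll() removes the head of the queue, so the other consumer can never see it.
            String message = queue.poll();
            (turnA ? podA : podB).add(message);
            turnA = !turnA;
        }
        return List.of(podA, podB);
    }

    public static void main(String[] args) {
        System.out.println(drain(List.of("m1", "m2", "m3")));
    }
}
```

Every message ends up in exactly one of the two lists, which is the queue pattern; delivering a copy to every consumer is what the publish-subscribe tutorial linked above covers.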
I'm using Kafka to send messages between services. I use a NewTopic bean to configure the number of partitions, for example:
@Bean
fun kafkaTopic(kafkaProperties: KafkaProperties): NewTopic = NewTopic(
    kafkaProperties.topics.schedulerCalculationTopic.name,
    kafkaProperties.topics.schedulerCalculationTopic.partitions,
    1
)
My question is simple: should I add this bean to both the consumer service and the producer service, or only to one of them?
I would put it in the producer service and then consider the producer the 'owner' of those topics.
It gets a bit more complicated if you have several producers writing to the same topic(s).
If you are not creating the topic on the fly, the best practice is to create the topic before reading from or writing to it.
The rationale is to prevent brokers from auto-creating the topic when they receive a metadata fetch request or a consume request for that topic name. Otherwise, if the consumer starts before the producer, you might end up with the wrong number of partitions. (The broker will create your topic with its default number of partitions.)
I have multiple instances of my Spring Boot app consuming from a Kafka topic. Since I want all instances to get data from all partitions of this topic, I assign a different consumer group to each instance, created dynamically when the application starts.
@Configuration
@EnableKafka
public class KafkaStreamConfig {
    @Bean("provisioningStreamsBuilderFactoryBean")
    public StreamsBuilderFactoryBean myStreamsBuilderFactoryBean() {
        String myCGName = "MY-CG-" + UUID.randomUUID().toString();
        Properties streamsConfiguration = new Properties();
        streamsConfiguration.put(APPLICATION_ID_CONFIG, myCGName); // setting consumer group name
        // setting other props
        StreamsBuilderFactoryBean streamsBuilderFactoryBean = new StreamsBuilderFactoryBean();
        streamsBuilderFactoryBean.setStreamsConfiguration(streamsConfiguration);
        return streamsBuilderFactoryBean;
    }
}
So every time an instance restarts or a new instance is created, a new consumer group is created. This is the consumer that reads from my topic:
@Component
public class MyConsumer {
    @Autowired
    private StreamsBuilder streamsBuilder;

    @PostConstruct
    public void consume() {
        final KStream<String, GenericRecord> events = streamsBuilder.stream("my-topic");
        events
            .selectKey((key, record) -> record.get("id").toString())
            .foreach((id, record) -> {
                // some computations with the events consumed
            });
    }
}
Because these dynamically created consumer groups persist, and are no longer used by my application once an instance restarts, they stop consuming messages, show a lot of lag, and trigger false alerts.
So I'd like to delete these consumer groups when the application shuts down, using Kafka's AdminClient API. I was thinking of deleting the group in a shutdown hook, for example in a method annotated with @PreDestroy inside the MyConsumer class, like this:
@PreDestroy
public void destroyMYCG() {
    try (AdminClient admin = KafkaAdminClient.create(properties)) {
        DeleteConsumerGroupsResult deleteConsumerGroupsResult = admin.deleteConsumerGroups(Collections.singletonList(provGroupName));
        KafkaFuture<Void> future = deleteConsumerGroupsResult.all();
        future.whenComplete((aVoid, throwable) -> {
            System.out.println("EXCEPTION :: " + ExceptionUtils.getStackTrace(throwable));
        });
    }
    System.out.println(getClass().getCanonicalName() + " :: DESTROYING :: " + provGroupName);
}
But when I try that, I get this exception, and the consumer group still shows up in the list of consumer groups:
org.apache.kafka.common.errors.TimeoutException: The AdminClient thread is not accepting new calls.
Can someone please help me with this?
Using a UUID as the consumer group name is a bad idea. You can define a constant string as the consumer group name for each Spring Boot app.
IMHO it is a logical mistake to create a consumer group from a UUID. Logically, if the same process restarts, it is the same app and therefore the same consumer. You will solve your problem by giving consumer groups meaningful names related to what the app logically does.
I would delete consumer groups on the server side, with a "GC" process that triggers at a certain level of lag.
Again, a consumer group is not an application ID. It is not intended to be created randomly.
And honestly, I am not sure what problem you are solving by doing this.
By saying that the consumer group is random, you are in effect saying "my code is doing random things and I have no clue what happens in message processing".
We have very complex Kafka message processing, and the name for a process may be better or worse, but there is always at least one name that is not random.
I'm creating an application that sends messages for time-expensive processing to a consumer using RabbitMQ. However, I need to prioritize messages. When a message with high priority arrives, it must be processed even if all consumer instances are processing other messages.
AFAIK there is no possibility to preempt processing low-priority messages and switch to processing high-priority messages in Spring Boot and RabbitMQ.
Is it possible to create consumers that accept only high-priority messages, or to start an additional set of consumers on the fly when all the others are busy and high-priority messages arrive?
I tried to add queues with x-max-priority=10 flag and to increase number of consumers but it doesn't solve my problem.
Imagine that we run 50 consumers and send 50 messages with low priority. While time-expensive processing is being performed, a new message arrives with high priority but it cannot be processed at once because all 50 consumers are busy.
Here is the part of the configuration that sets the number of consumers:
@Bean
public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(
        SimpleRabbitListenerContainerFactoryConfigurer configurer,
        @Qualifier("rabbitConnectionFactory") ConnectionFactory connectionFactory) {
    SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
    configurer.configure(factory, connectionFactory);
    factory.setConcurrentConsumers(50);
    factory.setMaxConcurrentConsumers(100);
    return factory;
}
Is there a way to create a set of consumers that accept only high-priority messages (e.g. priority higher than 0), or to create consumers on the fly for high-priority messages?
I don't know about a way to implement the preemptive strategy you describe, but there's a number of alternative things that you could consider.
Priority Setting
The first thing to take into account is the priority support in RabbitMQ itself.
Consider this excerpt from RabbitMQ in Depth by Gavin M. Roy:
“As of RabbitMQ 3.5.0, the priority field has been implemented as per the AMQP specification. It’s defined as an integer with possible values of 0 through 9 to be used for message prioritization in queues. As specified, if a message with a priority of 9 is published, and subsequently a message with a priority of 0 is published, a newly connected consumer would receive the message with the priority of 0 before the message with a priority of 9”.
e.g.
rabbitTemplate.convertAndSend("Hello World!", message -> {
    MessageProperties properties = MessagePropertiesBuilder.newInstance()
        .setPriority(0)
        .build();
    return MessageBuilder.fromMessage(message)
        .andProperties(properties)
        .build();
});
Priority-based Exchange
A second alternative is to define a topic exchange and define a routing key that considers your priority.
For example, consider an exchange of events using a routing key of pattern EventName.Priority e.g. OrderPlaced.High, OrderPlaced.Normal or OrderPlaced.Low.
Based on that you could have a queue bound to just orders of high priority i.e. OrderPlaced.High and a number of dedicated consumers just for that queue.
e.g.
String routingKey = String.format("%s.%s", event.name(), event.priority());
rabbit.convertAndSend(routingKey, event);
You would then use a listener like the one below, where the queue high-priority-orders is bound to the events exchange for the event OrderPlaced and priority High using the routing key OrderPlaced.High.
@Component
@RabbitListener(queues = "high-priority-orders", containerFactory="orders")
public class HighPriorityOrdersListener {
    @RabbitHandler
    public void onOrderPlaced(OrderPlacedEvent orderPlaced) {
        //...
    }
}
Obviously, you will need a dedicated thread pool (in the orders container factory above) to serve the high-priority requests.
There is no mechanism in the AMQP protocol to "select" messages from a queue.
You might want to consider using discrete queues with dedicated consumers instead.
BTW, this is not spring related; general questions about RabbitMQ should be directed to the rabbitmq-users Google group.
I'm using the @KafkaListener annotation to consume topics in my application. My issue is that if I create a new topic in Kafka while my consumer is already running, the consumer does not seem to pick up the new topic, even if it matches the topicPattern I'm using. Is there a way to "refresh" the subscribed topics periodically, so that new topics are picked up and rebalanced across my running consumers?
I'm using Spring Kafka 1.2.2 with Kafka 0.10.2.0.
Regards
You can't dynamically add topics at runtime; you have to stop/start the container to start listening to new topics.
You can @Autowire the KafkaListenerEndpointRegistry and stop/start listeners by id.
You can also stop/start all listeners by calling stop()/start() on the registry itself.
Actually it is possible.
It worked for me with Kafka 1.1.1.
Under the hood, Spring uses consumer.subscribe(topicPattern), so it depends entirely on the Kafka client library whether messages from a new topic will be seen by the consumer.
There is a consumer config property called metadata.max.age.ms, which is 5 minutes by default. It controls how often the client asks the broker for metadata updates, meaning new topics may not be seen by the consumer for up to 5 minutes. You can decrease this value (e.g. to 20 seconds) and should then see the KafkaListener start picking up messages from new topics sooner.
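A minimal sketch of that setting, assuming the consumer properties are assembled as a plain Map<String, Object> (the shape that Spring Kafka's DefaultKafkaConsumerFactory constructor accepts). The key metadata.max.age.ms is the real Kafka consumer property; the class name, the broker address and the group ID are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: consumer settings that refresh topic metadata more often.
final class FastMetadataRefreshConfig {

    static Map<String, Object> consumerProps(long refreshMillis) {
        Map<String, Object> props = new HashMap<>();
        // Assumed broker address and group ID; replace with real values.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "pattern-listener");
        // Kafka's default is 300000 ms (5 minutes); a lower value means topics
        // matching the subscribed pattern are discovered sooner.
        props.put("metadata.max.age.ms", refreshMillis);
        return props;
    }

    public static void main(String[] args) {
        // 20 seconds, the example value mentioned above.
        System.out.println(consumerProps(20_000L).get("metadata.max.age.ms"));
    }
}
```

A shorter refresh interval means more metadata requests to the brokers, so the value is a trade-off rather than something to set as low as possible.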
The following way works well for me.
ContainerProperties containerProps = new ContainerProperties("topic1", "topic2");
KafkaMessageListenerContainer<Integer, String> container = createContainer(containerProps);
containerProps.setMessageListener(new MessageListener<Integer, String>() {
    @Override
    public void onMessage(ConsumerRecord<Integer, String> message) {
        logger.info("received: " + message);
    }
});
container.setBeanName("testAuto");
container.start();
ref: http://docs.spring.io/spring-kafka/docs/1.0.0.RC1/reference/htmlsingle/
In practice, I use a ConcurrentMessageListenerContainer instead of the single-threaded KafkaMessageListenerContainer.