(Cleanly?) Consuming Messages in Spring Apache Kafka - java

I've put together a sample application to test out Spring for Apache Kafka. I can successfully send messages through the KafkaTemplate and even output the messages being sent from the template using ConsumerRecord, so I know the consumer is receiving the data, but I'm trying to figure out one thing.
How can I consume the messages without having to throw some extra value in there when creating my consumerRecord bean?
I'm wondering if I'm missing something and shouldn't have to create this bean at all, but in the guide I used I don't understand where ConsumerRecord comes from in the listen method, or where that method even gets called.
In my config class I have created a consumerRecord bean:
@Bean
public ConsumerRecord<String, String> consumerRecord() {
    return new ConsumerRecord<>("my-topic", 1, 1L, "key", "value");
}
with essentially dummy values aside from the topic, which I specify as my-topic.
I am sending messages through the template to be consumed like so:
void sendMsg() {
    ProducerRecord<String, String> producerRecord = new ProducerRecord<>("my-topic", "Sample Data");
    try {
        latch.await(1000L, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    template.send(producerRecord);
    listen(consumerRecord);
}
And I'm printing out all the values being sent to the topic to the logger with this method:
private static final String topicName = "my-topic";

@KafkaListener(topics = topicName)
private void listen(ConsumerRecord consumerRecord) {
    logger.info("Consumer Record Value:::: " + consumerRecord.value());
    latch.countDown();
}
My concern is that I don't know what's happening with that "value" I put at the end of the consumer record bean. In a real-world application I wouldn't want some random dummy value being consumed, so how can I avoid this and just focus on the data being sent through the KafkaTemplate?
I used the Spring Docs as my point of reference when putting together the application (linked above earlier when referencing the guide).
*Edit, if anyone knows: can you pass the value straight from the producer? But surely it wouldn't even go into Kafka then?
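For context on the question above: in Spring for Apache Kafka, the listener container invokes a @KafkaListener method for every record it polls from the topic and supplies the ConsumerRecord argument itself, so the method is never called from your own code and no ConsumerRecord bean is involved. A minimal sketch of that shape (the topic name and log message come from the question; the class name, group id, and logger setup are assumed boilerplate):
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class MyTopicListener {

    private static final Logger logger = LoggerFactory.getLogger(MyTopicListener.class);

    // the listener container calls this for each record read from "my-topic";
    // the ConsumerRecord parameter is populated by Spring Kafka, not by a bean you define
    @KafkaListener(topics = "my-topic", groupId = "sample-group") // groupId is an assumption
    public void listen(ConsumerRecord<String, String> record) {
        logger.info("Consumer Record Value:::: " + record.value());
    }
}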

Related

How does Kafka Schema registration happen in Spring Cloud Stream?

I am trying to understand how to use Spring Cloud Streams with the Kafka Binder.
Currently, I am trying to register an AVRO schema with my Confluent Schema Registry and send messages to a topic.
I am unable to understand how the schema registration is being done by Spring Cloud Streams behind the scenes.
Let's take this example from the Spring Cloud Stream samples.
The AVRO schema is located in src/resources/avro.
When the mvn compile goal is run, the POJO for the AVRO schema is generated and the producer can post data.
But what I am not able to understand is how Spring Cloud Stream is doing the schema registration for AVRO.
@Autowired
StreamBridge streamBridge;

@Bean
public Supplier<Sensor> supplier() {
    return () -> {
        Sensor sensor = new Sensor();
        sensor.setId(UUID.randomUUID().toString() + "-v1");
        sensor.setAcceleration(random.nextFloat() * 10);
        sensor.setVelocity(random.nextFloat() * 100);
        sensor.setTemperature(random.nextFloat() * 50);
        return sensor;
    };
}

@Bean
public Consumer<Sensor> receiveAndForward() {
    return s -> streamBridge.send("sensor-out-0", s);
}

@Bean
Consumer<Sensor> receive() {
    return s -> System.out.println("Received Sensor: " + s);
}
Is it done when the beans are created?
Or is it done when the first message is sent? If so, how does Spring Cloud Stream know where to find the .avsc file?
Basically, what is happening under the hood?
There seems to be no mention of this in the docs.
Thanks.
Your serialization strategy (in this case, AVRO) is always handled in the serializers (for producers) and deserializers (for consumers).
You can have Avro (de)serialized keys and/or Avro (de)serialized values, which means one should pass KafkaAvroSerializer.class/KafkaAvroDeserializer.class to the producer/consumer configs, respectively. On top of this, one must pass the schema.registry.url to the client config as well.
So behind the scenes, Spring Cloud Stream makes your application Avro compatible when it creates your producers/consumers (using the configs found in application.properties or elsewhere). Your clients will connect to the schema registry on start-up (the logs will tell you if the connection failed), but they do not do any schema registration out of the box.
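For reference, a rough sketch of the equivalent plain-client producer configuration that Spring Cloud Stream effectively sets up for you (the broker and registry URLs are assumptions; the serializer class names come from Confluent's kafka-avro-serializer artifact):
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");            // assumed local broker
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081");   // assumed local Schema Registry
KafkaProducer<String, Sensor> producer = new KafkaProducer<>(props);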
Schema registration is done on the first message that gets sent. If you look at the generated POJOs, you'll see that they already contain the schema, so Spring Cloud Stream doesn't need the .avsc files at all. For example, my last generated Avro POJO contained (line 4):
@org.apache.avro.specific.AvroGenerated
public class AvroBalanceMessage extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
    private static final long serialVersionUID = -539731109258473824L;
    public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"AvroBalanceMessage\",\"namespace\":\"tech.nermindedovic\",\"fields\":[{\"name\":\"accountNumber\",\"type\":\"long\",\"default\":0},{\"name\":\"routingNumber\",\"type\":\"long\",\"default\":0},{\"name\":\"balance\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"default\":\"0.00\"},{\"name\":\"errors\",\"type\":\"boolean\",\"default\":false}]}");
    public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; }
    .......
When a producer sends this POJO, it communicates with the registry about the current version of the schema. If the schema is not in the registry, the registry will store it and identify it by ID. The producer then sends the message with its schema ID to the Kafka broker. On the other side, the consumer will get this message and check whether it has already seen that ID (IDs are cached so you don't always have to retrieve the schema from the registry); if it hasn't, it will ask the registry for the information about the message.
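To make the "message with its schema ID" part concrete, here is a hedged sketch of Confluent's wire format for Avro-encoded values: one magic byte, a 4-byte schema ID, then the Avro payload (recordValueBytes is a hypothetical byte[] holding one consumed record value):
ByteBuffer buffer = ByteBuffer.wrap(recordValueBytes);
byte magicByte = buffer.get();           // 0 for the Confluent wire format
int schemaId = buffer.getInt();          // ID assigned by the Schema Registry
byte[] avroPayload = new byte[buffer.remaining()];
buffer.get(avroPayload);                 // the Avro-encoded data itself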
A bit outside the scope of Spring Cloud Stream, but one can also use the Schema Registry REST API to manually register schemas.
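As a hedged example of that manual route (the subject name "sensor-value" and the registry URL are assumptions), the registry exposes POST /subjects/<subject>/versions, which can be called from plain Java:
void registerSchemaManually() throws java.io.IOException, InterruptedException {
    HttpClient client = HttpClient.newHttpClient();
    // the body wraps the schema JSON as an escaped string, e.g. {"schema": "{\"type\":\"string\"}"}
    String body = "{\"schema\": \"{\\\"type\\\":\\\"string\\\"}\"}";
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8081/subjects/sensor-value/versions"))
            .header("Content-Type", "application/vnd.schemaregistry.v1+json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body()); // e.g. {"id":1}
}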

How to delete dynamically created consumer groups in spring boot app

I have multiple instances of my Spring Boot app consuming from a Kafka topic. Since I want all instances to get data from all partitions of this topic, I assigned a different consumer group to each instance, created dynamically when the application starts.
@Configuration
@EnableKafka
public class KafkaStreamConfig {

    @Bean("provisioningStreamsBuilderFactoryBean")
    public StreamsBuilderFactoryBean myStreamsBuilderFactoryBean() {
        String myCGName = "MY-CG-" + UUID.randomUUID().toString();
        Properties streamsConfiguration = new Properties();
        streamsConfiguration.put(APPLICATION_ID_CONFIG, myCGName); // setting consumer group name
        // setting other props
        StreamsBuilderFactoryBean streamsBuilderFactoryBean = new StreamsBuilderFactoryBean();
        streamsBuilderFactoryBean.setStreamsConfiguration(streamsConfiguration);
        return streamsBuilderFactoryBean;
    }
}
So every time an instance restarts or a new instance is created, a new consumer group is created. And this is the consumer that reads from my topic:
@Component
public class MyConsumer {

    @Autowired
    private StreamsBuilder streamsBuilder;

    @PostConstruct
    public void consume() {
        final KStream<String, GenericRecord> events = streamsBuilder.stream("my-topic");
        events
            .selectKey((key, record) -> record.get("id").toString())
            .foreach((id, record) -> {
                // some computations with the events consumed
            });
    }
}
Now, these dynamically created consumer groups stick around, and since they're no longer used by my application once an instance restarts, they don't consume messages anymore, show a lot of lag, and hence give rise to false alerts.
So I'd like to delete these consumer groups on application shutdown using Kafka's AdminClient API. I was thinking of deleting them in a shutdown hook, such as a method annotated with @PreDestroy inside the MyConsumer class, like this:
@PreDestroy
public void destroyMYCG() {
    try (AdminClient admin = KafkaAdminClient.create(properties)) {
        DeleteConsumerGroupsResult deleteConsumerGroupsResult = admin.deleteConsumerGroups(Collections.singletonList(provGroupName));
        KafkaFuture<Void> future = deleteConsumerGroupsResult.all();
        future.whenComplete((aVoid, throwable) -> {
            System.out.println("EXCEPTION :: " + ExceptionUtils.getStackTrace(throwable));
        });
    }
    System.out.println(getClass().getCanonicalName() + " :: DESTROYING :: " + provGroupName);
}
but I get this exception when I try that, and the consumer group still shows up in the list of consumer groups:
org.apache.kafka.common.errors.TimeoutException: The AdminClient thread is not accepting new calls.
Can someone please help me with this?
Using a UUID as the consumer group name is a bad idea. You can define a fixed string as the consumer group name for each Spring Boot app.
IMHO it is a logical mistake to create a consumer group with a UUID. Logically, if the same process restarts, it is the same app and the same consumer. You will solve your problem by giving the consumer groups good names related to what the app logically does.
I would delete consumer groups on the server side, with a "GC" policy triggered at a certain level of lag.
Again, a consumer group is not an application ID. It is not intended to be randomly created.
And honestly, I am not sure what kind of problem you are solving by doing this.
Because in effect, by saying that the consumer group is random, you are saying "my code is doing random things and I have no clue what happens in message processing".
We have very complex Kafka message processing, and there is always a better or worse name for the process, but at least one exists that is not random.
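A minimal sketch of the fixed-name alternative suggested above (the name "my-provisioning-app" is just an assumed example). For Kafka Streams, application.id also serves as the consumer group name, so restarted instances rejoin the same group instead of leaving orphaned ones behind:
@Bean("provisioningStreamsBuilderFactoryBean")
public StreamsBuilderFactoryBean myStreamsBuilderFactoryBean() {
    Properties streamsConfiguration = new Properties();
    // stable application.id = stable consumer group; note that instances then share
    // partitions rather than each receiving all of them
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-provisioning-app");
    StreamsBuilderFactoryBean factoryBean = new StreamsBuilderFactoryBean();
    factoryBean.setStreamsConfiguration(streamsConfiguration);
    return factoryBean;
}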

Is it possible to return custom data to a Kafka producer?

I am learning Kafka and I want to split my app into 2 microservices.
The first saves all incoming messages from a KafkaConsumer to the database and selects an entity by a given id.
The second provides a REST API to save and get entities.
Interaction between them is provided via Kafka.
How can I receive the stored ID from the db in the REST API via Kafka?
Here is sample code of the producer, which is called on a POST request:
public void sendToKafka(MyObject myobject) throws ExecutionException, InterruptedException {
    LOGGER.info("sending payload='{}' to topic='{}'", myobject, myTopic);
    byte[] bytes = parseObjectToByte(myobject);
    ListenableFuture<SendResult<String, byte[]>> resultFuture = kafkaTemplate.send(topicSave, bytes);
    SendResult<String, byte[]> result = resultFuture.get();
    LOGGER.info(result.toString());
}
and the consumer, which saves myObject to the database:
@KafkaListener(topics = "${kafka.topic.mytopic}")
public void saveMyObject(byte[] value) {
    MyObject myobject = parseToMyObject(value);
    LOGGER.info("received myobject='{}'", myobject);
    MyObject myobjectSaved = myObjectRepository.insert(myobject);
}
I'm using spring-kafka with spring-boot.
The REST API has 2 methods:
POST - save myObject
GET - return the saved object by id.
Is it possible to do this with Kafka, or must I connect these microservices directly? Thank you.
Not sure I fully understand your question, but if you want to send a message to Kafka and wait for this message to be consumed and handled by some microservice which would then return some information (a primary key) to the sender of the message, you cannot do that without adding more to your architecture.
A message sent to Kafka is "fire and forget": from the sender's point of view you know nothing about what will happen with this message (if, when, how often, and by how many consumers it will be consumed).
In your scenario, the consumer microservice could also send messages with the primary key to another Kafka topic, which you would consume if you need that information.
Remember, Kafka serves to decouple your architecture and to introduce asynchronous message handling; if you need a synchronous response from a consumer, you are probably using the wrong tool.
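A minimal sketch of the reply-topic pattern described in this answer, reusing the question's listener; the topic name "mytopic-ids", the getId() accessor, and the injected KafkaTemplate<String, byte[]> are hypothetical:
@KafkaListener(topics = "${kafka.topic.mytopic}")
public void saveMyObject(byte[] value) {
    MyObject myobject = parseToMyObject(value);
    MyObject saved = myObjectRepository.insert(myobject);
    // publish the generated id to a second topic; the REST microservice consumes it asynchronously
    kafkaTemplate.send("mytopic-ids", saved.getId().getBytes(StandardCharsets.UTF_8));
}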

Using Spring Cloud Stream Source to send method results to stream

I'm trying to create a Spring Cloud Stream Source Bean inside a Spring Boot Application that simply sends the results of a method to a stream (underlying Kafka topic is bound to the stream).
Most of the Stream samples I've seen use the @InboundChannelAdapter annotation to send data to the stream using a poller. But I don't want to use a poller. I've tried setting the poller to an empty array, but the other problem is that when using @InboundChannelAdapter you are unable to have any method parameters.
The overall concept of what I am trying to do is: read from an inbound stream, do some async processing, then post the result to an outbound stream. So using a Processor doesn't seem to be an option either. I am using @StreamListener with a Sink channel to read the inbound stream, and that works.
Here is some code I've been trying, but it doesn't work at all. I was hoping it would be this simple because my Sink was, but maybe it isn't. I'm looking for someone to point me to an example of a source that isn't a Processor (i.e. doesn't require listening on an inbound channel) and doesn't use @InboundChannelAdapter, or to give me some design tips to accomplish what I need in a different way. Thanks!
@EnableBinding(Source.class)
public class JobForwarder {

    @ServiceActivator(outputChannel = Source.OUTPUT)
    @SendTo(Source.OUTPUT)
    public String forwardJob(String message) {
        log.info(String.format("Forwarding a job message [%s] to queue [%s]", message, Source.OUTPUT));
        return message;
    }
}
Your original requirement can be achieved through the steps below.
Create your custom bound interface (you can use the default @EnableBinding(Source.class) as well):
public interface CustomSource {
    String OUTPUT = "customoutput";

    @Output(CustomSource.OUTPUT)
    MessageChannel output();
}
Inject your bound channel
@Component
@EnableBinding(CustomSource.class)
public class CustomOutputEventSource {

    @Autowired
    private CustomSource customSource;

    public void sendMessage(String message) {
        customSource.output().send(MessageBuilder.withPayload(message).build());
    }
}
Test it
@RunWith(SpringRunner.class)
@SpringBootTest
public class CustomOutputEventSourceTest {

    @Autowired
    CustomOutputEventSource output;

    @Test
    public void sendMessage() {
        output.sendMessage("Test message from JUnit test");
    }
}
So if you don't want to use a Poller, what causes the forwardJob() method to be called?
You can't just call the method and expect the result to go to the output channel.
With your current configuration, you need an inputChannel on the service containing your inbound message (and something to send a message to that channel). It doesn't have to be bound to a transport; it can be a simple MessageChannel @Bean.
Or, you could use a @Publisher to publish the result of the method invocation (as well as having it returned to the caller) - docs here.
@Publisher(channel = Source.OUTPUT)
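For illustration, a hedged sketch of what that could look like, assuming Spring Integration's publisher support is enabled (for example via @EnablePublisher on a configuration class):
@EnableBinding(Source.class)
public class JobForwarder {

    // the return value is published to Source.OUTPUT in addition to being returned to the caller
    @Publisher(channel = Source.OUTPUT)
    public String forwardJob(String message) {
        log.info(String.format("Forwarding a job message [%s] to queue [%s]", message, Source.OUTPUT));
        return message;
    }
}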
Thanks for the input. It took me a while to get back to the problem. I did try reading the documentation for @Publisher. It looked to be exactly what I needed, but I just couldn't get the proper beans initialized to get it wired properly.
To answer your question, the forwardJob() method is called after some async processing of the input.
Eventually I just implemented it using the spring-kafka library directly, which was much more explicit and felt easier to get going. I think we are going to stick with Kafka as the only channel binding, so we'll stick with that library.
However, we did eventually get the spring-cloud-stream library working quite simply. Here is the code for a single source without a poller:
@Component
@EnableBinding(Source.class)
public class JobForwarder {

    private Source source;

    @Autowired
    public JobForwarder(Source source) {
        this.source = source;
    }

    public void forwardScheduledJob(String message) {
        log.info(String.format("Forwarding a job message [%s] to queue [%s]", message, Source.OUTPUT));
        source.output().send(MessageBuilder.withPayload(message).build());
    }
}

How to work with rabbitTemplate receiveAndReply

I have just started experimenting with Spring and rabbitMQ.
I would like to create a microservice infrastructure with Rabbit and Spring;
I have been following the Spring Boot tutorial.
But it is very simplistic. I am also looking at the documentation (Spring's, Rabbit's) for how to create an RPC; I understand Rabbit's approach, but I would like to leverage the Spring template to save me the boilerplate.
I just can't seem to understand where to register the receiveAndReply callback.
I tried doing this:
sending
System.out.println("Sending message...");
Object convertSendAndReceive = rabbitTemplate.convertSendAndReceive("spring-boot", "send and recive: sent");
System.out.println("GOT " + convertSendAndReceive); //is null
receiving
@Component
public class Receiver {

    @Autowired
    RabbitTemplate rabbitTemplate;

    public void receiveMessage(String message) {
        this.rabbitTemplate.receiveAndReply("spring-boot", msg -> {
            return "return this statement";
        });
    }
}
But it's not a big surprise this doesn't work: the message is received but nothing comes back. I assume this needs to be registered somewhere in the factory/template at the bean creation level, but I don't understand where, and sadly the documentation is unclear.
First, please use the Spring AMQP Documentation.
You would generally use a SimpleMessageListenerContainer wired with a POJO listener for RPC.
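For reference, a hedged sketch of that listener-container approach: with Spring AMQP, a @RabbitListener POJO method that returns a value has its result sent back to the request's reply-to address automatically, which pairs with convertSendAndReceive() on the sending side (the queue name is taken from the question):
@Component
public class RpcReceiver {

    // invoked by the listener container for each message on "spring-boot";
    // the returned String goes back to the caller's reply-to queue
    @RabbitListener(queues = "spring-boot")
    public String handle(String request) {
        return "return this statement for: " + request;
    }
}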
The template receiveAndReply method is intended for "scheduled" server-side RPC - i.e. only receive (and reply) when you want to, rather than whenever a message arrives in the queue. It does not block waiting for a message.
If you want to use receiveAndReply(), there's a test case that illustrates it.
EDIT:
This code...
this.template.convertAndSend(ROUTE, "test");
sends a message to the queue.
This code...
this.template.setQueue(ROUTE);
boolean received = this.template.receiveAndReply(new ReceiveAndReplyMessageCallback() {

    @Override
    public Message handle(Message message) {
        message.getMessageProperties().setHeader("foo", "bar");
        return message;
    }

});
receives a message from that queue, adds a header, and returns the same message to the reply queue. received will be false if there was no message to receive (and reply to).
This code:
Message receive = this.template.receive();
receives the reply.
This test is a bit contrived because the reply is sent to the same queue as the request. We can't use sendAndReceive() on the client side in this test because the thread would block waiting for the reply (and we need to execute the receiveAndReply()).
Another test in that class has a more realistic example where it does the sendAndReceive()s on different threads and the receiveAndReply()s on the main thread.
Note that that test uses a listener container on the client side for replies; that is generally no longer needed since the rabbit broker now supports direct reply-to.
receiveAndReply() was added for symmetry - in most cases, people use a listener container and listener adapter for server-side RPC.
