Spring Kafka, Spring Cloud Stream, and Avro compatibility: Unknown magic byte - java

I have a problem deserializing messages from Kafka topics. The messages have been serialized using spring-cloud-stream and Apache Avro. I am reading them using Spring Kafka and trying to deserialize them. If I use spring-cloud to both produce and consume the messages, then I can deserialize them fine. The problem is when I consume them with Spring Kafka and then try to deserialize.
I am using a Schema Registry (both the Spring Boot Schema Registry server for development, and also a Confluent Schema Registry in production), but the deserialization problems seem to occur before even calling the Schema Registry.
It's hard to post all the relevant code in this question, so I have posted it in a GitHub repo: https://github.com/robjwilkins/avro-example
The object I am sending over the topic is just a simple pojo:
@Data
public class Request {
private String message;
}
The code which produces messages on Kafka looks like this:
@EnableBinding(MessageChannels.class)
@Slf4j
@RequiredArgsConstructor
@RestController
public class ProducerController {
private final MessageChannels messageChannels;
@GetMapping("/produce")
public void produceMessage() {
Request request = new Request();
request.setMessage("hello world");
Message<Request> requestMessage = MessageBuilder.withPayload(request).build();
log.debug("sending message");
messageChannels.testRequest().send(requestMessage);
}
}
and application.yaml:
spring:
  application.name: avro-producer
  kafka:
    bootstrap-servers: localhost:9092
    consumer.group-id: avro-producer
  cloud:
    stream:
      schema-registry-client.endpoint: http://localhost:8071
      schema.avro.dynamic-schema-generation-enabled: true
      kafka:
        binder:
          brokers: ${spring.kafka.bootstrap-servers}
      bindings:
        test-request:
          destination: test-request
          contentType: application/*+avro
Then I have a consumer:
@Slf4j
@Component
public class TopicListener {
@KafkaListener(topics = {"test-request"})
public void listenForMessage(ConsumerRecord<String, Request> consumerRecord) {
log.info("listenForMessage. got a message: {}", consumerRecord);
consumerRecord.headers().forEach(header -> log.info("header. key: {}, value: {}", header.key(), asString(header.value())));
}
private String asString(byte[] byteArray) {
return new String(byteArray, Charset.defaultCharset());
}
}
And the project which consumes has application.yaml config:
spring:
  application.name: avro-consumer
  kafka:
    bootstrap-servers: localhost:9092
    consumer:
      group-id: avro-consumer
      value-deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
#      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      properties:
        schema.registry.url: http://localhost:8071
When the consumer gets a message it results in an exception:
2019-01-30 20:01:39.900 ERROR 30876 --- [ntainer#0-0-C-1] o.s.kafka.listener.LoggingErrorHandler : Error while processing: null
org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition test-request-0 at offset 43. If needed, please seek past the record to continue consumption.
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
I have stepped through the deserialization code to the point where this exception is thrown:
public abstract class AbstractKafkaAvroDeserializer extends AbstractKafkaAvroSerDe {
....
private ByteBuffer getByteBuffer(byte[] payload) {
ByteBuffer buffer = ByteBuffer.wrap(payload);
if (buffer.get() != 0) {
throw new SerializationException("Unknown magic byte!");
} else {
return buffer;
}
}
It happens because the deserializer checks the first byte of the serialized payload and expects it to be 0 (the Confluent "magic byte"), but it is not. Hence I question whether the spring-cloud-stream MessageConverter which serialized the object is compatible with the io.confluent deserializer which I am using to deserialize the object. And if they are not compatible, what do I do?
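For reference, the Confluent deserializer expects a small prefix before the Avro body: a 0 "magic" byte followed by a 4-byte schema id. A rough sketch (not from my project) for inspecting a raw payload to see whether it is in that wire format:
import java.nio.ByteBuffer;

// Confluent wire format: [magic byte 0][4-byte schema id][Avro binary payload]
static boolean isConfluentWireFormat(byte[] payload) {
    if (payload == null || payload.length < 5) {
        return false;
    }
    ByteBuffer buffer = ByteBuffer.wrap(payload);
    byte magic = buffer.get();      // KafkaAvroDeserializer expects this to be 0
    int schemaId = buffer.getInt(); // id of the writer schema in the registry
    System.out.println("magic=" + magic + ", schemaId=" + schemaId);
    return magic == 0;
}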
Thanks for any help.

The crux of this problem is that the producer is using spring-cloud-stream to post messages to Kafka, but the consumer uses spring-kafka. The reasons for this are:
The existing system is already well established and uses spring-cloud-stream
A new consumer is required to listen to multiple topics using the same method, binding only on a CSV list of topic names
There is a requirement to consume a collection of messages at once, rather than individually, so their contents can be written in bulk to a database.
Spring-cloud-stream doesn't currently allow the consumer to bind a listener to multiple topics, and there is no way to consume a collection of messages at once (unless I'm mistaken).
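(Side note on the multi-topic requirement: plain spring-kafka does let a single listener method subscribe to a comma-separated list of topics resolved from a property. A minimal sketch, assuming a property such as app.topics=topic-a,topic-b, which is just an illustrative name:)
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class MultiTopicListener {

    // SpEL splits the CSV property into individual topic names at startup
    @KafkaListener(topics = "#{'${app.topics}'.split(',')}")
    public void listen(ConsumerRecord<?, ?> consumerRecord) {
        System.out.println("received from " + consumerRecord.topic() + ": " + consumerRecord);
    }
}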
I have found a solution which doesn't require any changes to the producer code which uses spring-cloud-stream to publish messages to Kafka. Spring-cloud-stream uses a MessageConverter to manage serialization and deserialization. In the AbstractAvroMessageConverter there are methods convertFromInternal and convertToInternal which handle the transformation to/from a byte array. My solution was to extend this code (creating a class which extends AvroSchemaRegistryClientMessageConverter), so I could reuse much of the spring-cloud-stream functionality, but with an interface that can be accessed from my spring-kafka KafkaListener. I then amended my TopicListener to use this class to do the conversion:
The converter:
@Component
@Slf4j
public class AvroKafkaMessageConverter extends AvroSchemaRegistryClientMessageConverter {
public AvroKafkaMessageConverter(SchemaRegistryClient schemaRegistryClient) {
super(schemaRegistryClient, new NoOpCacheManager());
}
public <T> T convertFromInternal(ConsumerRecord<?, ?> consumerRecord, Class<T> targetClass,
Object conversionHint) {
T result;
try {
byte[] payload = (byte[]) consumerRecord.value();
Map<String, String> headers = new HashMap<>();
consumerRecord.headers().forEach(header -> headers.put(header.key(), asString(header.value())));
MimeType mimeType = messageMimeType(conversionHint, headers);
if (mimeType == null) {
return null;
}
Schema writerSchema = resolveWriterSchemaForDeserialization(mimeType);
Schema readerSchema = resolveReaderSchemaForDeserialization(targetClass);
@SuppressWarnings("unchecked")
DatumReader<Object> reader = getDatumReader((Class<Object>) targetClass, readerSchema, writerSchema);
Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
result = (T) reader.read(null, decoder);
}
catch (IOException e) {
throw new RuntimeException("Failed to read payload", e);
}
return result;
}
private MimeType messageMimeType(Object conversionHint, Map<String, String> headers) {
MimeType mimeType;
try {
String contentType = headers.get(MessageHeaders.CONTENT_TYPE);
log.debug("contentType: {}", contentType);
mimeType = MimeType.valueOf(contentType);
} catch (InvalidMimeTypeException e) {
log.error("Exception getting object MimeType from contentType header", e);
if (conversionHint instanceof MimeType) {
mimeType = (MimeType) conversionHint;
}
else {
return null;
}
}
return mimeType;
}
private String asString(byte[] byteArray) {
String theString = new String(byteArray, Charset.defaultCharset());
return theString.replace("\"", "");
}
}
The amended TopicListener:
@Slf4j
@Component
@RequiredArgsConstructor
public class TopicListener {
private final AvroKafkaMessageConverter messageConverter;
@KafkaListener(topics = {"test-request"})
public void listenForMessage(ConsumerRecord<?, ?> consumerRecord) {
log.info("listenForMessage. got a message: {}", consumerRecord);
Request request = messageConverter.convertFromInternal(
consumerRecord, Request.class, MimeType.valueOf("application/vnd.*+avro"));
log.info("request message: {}", request.getMessage());
}
}
This solution only consumes one message at a time but can be easily modified to consume batches of messages.
The full solution is here: https://github.com/robjwilkins/avro-example/tree/develop
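A rough sketch of the batch variant of the listener above, assuming a batch-enabled container factory (here called kafkaBatchListenerContainerFactory, an illustrative name) configured with factory.setBatchListener(true):
import java.util.List;
import java.util.stream.Collectors;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.util.MimeType;

@KafkaListener(topics = {"test-request"}, containerFactory = "kafkaBatchListenerContainerFactory")
public void listenForMessages(List<ConsumerRecord<?, ?>> consumerRecords) {
    // Convert each record with the same converter, then persist the whole batch in one write
    List<Request> requests = consumerRecords.stream()
            .map(record -> messageConverter.convertFromInternal(
                    record, Request.class, MimeType.valueOf("application/vnd.*+avro")))
            .collect(Collectors.toList());
    log.info("received a batch of {} requests", requests.size());
}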

You should define the deserializer explicitly, by creating a DefaultKafkaConsumerFactory and your TopicListener bean in a config, something like this:
@Configuration
@EnableKafka
public class TopicListenerConfig {
#Value("${spring.kafka.bootstrap-servers}")
private String bootstrapServers;
#Value(("${spring.kafka.consumer.group-id}"))
private String groupId;
@Bean
public Map<String, Object> consumerConfigs() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
props.put(JsonDeserializer.TRUSTED_PACKAGES, "com.wilkins.avro.consumer");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
return props;
}
@Bean
public ConsumerFactory<String, String> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(consumerConfigs());
}
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
return factory;
}
@Bean
public TopicListener topicListener() {
return new TopicListener();
}
}

You can configure the binding to use a Kafka Serializer natively instead.
Set the producer property useNativeEncoding to true and configure the serializer using the ...producer.configuration Kafka properties.
EDIT
Example:
spring:
  cloud:
    stream:
      # Generic binding properties
      bindings:
        input:
          consumer:
            use-native-decoding: true
          destination: so54448732
          group: so54448732
        output:
          destination: so54448732
          producer:
            use-native-encoding: true
      # Kafka-specific binding properties
      kafka:
        bindings:
          input:
            consumer:
              configuration:
                value.deserializer: com.example.FooDeserializer
          output:
            producer:
              configuration:
                value.serializer: com.example.FooSerializer
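If the native serializer/deserializer is the Confluent Avro one (as in the original setup), its registry URL can be passed through the same configuration block; a sketch, assuming the registry endpoint from the question:
spring:
  cloud:
    stream:
      kafka:
        bindings:
          output:
            producer:
              configuration:
                value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
                schema.registry.url: http://localhost:8071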

Thanks, this has saved my day; using native encoding with the configuration shown above solved it.


How to test EmbeddedKafka with SpringBoot

I met a problem testing a Kafka producer after changing a custom producer to KafkaTemplate.
For testing purposes I wrote the following class:
public class KafkaTestingTools {
static private Map<String, Consumer<Long, GenericData.Record>> consumers = new HashMap<>();
static public void sendMessage (String topic, String key, Object message, Schema schema) throws InterruptedException{
Properties properties = new Properties();
properties.put("schema.registry.url", "http://localhost:8081");
properties.put("bootstrap.servers", "http://localhost:9092");
properties.put("acks", "all");
properties.put("retries", 0);
properties.put("linger.ms", 1);
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "com.logistics.mock.CustomKafkaAvroSerializer");
KafkaProducer<String, Object> producer = new KafkaProducer<>(properties);
CustomKafkaAvroDeserializer.setTopicScheme(topic, schema);
ProducerRecord<String, Object> record = new ProducerRecord<>(topic, key, message);
producer.send(record);
producer.close();
}
static public void registerConsumerContainer(EmbeddedKafkaBroker embeddedKafka, String topic, Schema schema) throws InterruptedException{
Map<String, Object> consumerProps = KafkaTestUtils.consumerProps("testGroup" + UUID.randomUUID().toString(), "true", embeddedKafka);
consumerProps.put("schema.registry.url", "http://localhost:8081");
consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
consumerProps.put("value.deserializer", "com.logistics.mock.CustomKafkaAvroDeserializer");
consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
ConsumerFactory<Long, GenericData.Record> cf = new DefaultKafkaConsumerFactory<>(consumerProps);
Consumer<Long, GenericData.Record> consumer = cf.createConsumer();
consumers.put(topic, consumer);
embeddedKafka.consumeFromAnEmbeddedTopic(consumer, topic);
CustomKafkaAvroDeserializer.setTopicScheme(topic, schema);
}
static public Object getSingleRecordFromRegisteredContainer(EmbeddedKafkaBroker embeddedKafka, String topic){
return SpecificData.get().deepCopy(
CustomKafkaAvroDeserializer.getTopicScheme(topic),
KafkaTestUtils.getSingleRecord(consumers.get(topic), topic).value()
);
}
}
Producer example:
@Service
@CommonsLog
public class PointProducer {
private final KafkaTemplate<String, ExportMessage> kafkaTemplate;
private final String topic;
@Autowired
public PointProducer(@Value("${kafka.producer.points}") String topic,
KafkaTemplate<String, ExportMessage> kafkaTemplate) {
this.topic = topic;
this.kafkaTemplate = kafkaTemplate;
}
public void produce(Point point) {
var message = new ExportMessage();
message.setId(point.getId());
log.warn("produce point: " + message.toString());
kafkaTemplate.send(topic, point.getId().toString(), message);
kafkaTemplate.flush();
}
}
Kafka config:
spring:
  kafka:
    bootstrap-servers: ${spring.embedded.kafka.brokers}
    consumer:
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      point-deserializer: com.logistics.mock.CustomKafkaAvroDeserializer
      auto-offset-reset: latest
      group-id: credit_file_test
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: com.logistics.mock.CustomKafkaAvroSerializer
    schema-registry-url: http://localhost:8081
kafka.consumer.points: points_export
kafka.producer.files: common.file
kafka.producer.orders: common.order
kafka.producer.points: common.point
And the tests look like:
@SpringBootTest
@TestMethodOrder(OrderAnnotation.class)
@EmbeddedKafka(partitions = 1, topics = { "topic1", "topic2" }, brokerProperties = { "listeners=PLAINTEXT://localhost:9092", "port=9092" })
class ApplicationLogisticOrderTest {
@Test
@Order(1)
@WithMockUser(roles = "ADMIN")
void checkSendToKafka() throws Exception {
KafkaTestingTools.registerConsumerContainer(this.embeddedKafka, TOPIC1, Message.SCHEMA$);
Thread.sleep(3000);
prepareCustomizedLogisticOrder(t -> {
});
var mockMvc = MockMvcBuilders.webAppContextSetup(this.wac).build();
mockMvc.perform(MockMvcRequestBuilders.put("/orders/7000000/sendToKafka"));
}
And on the line with perform I caught:
Caused by: org.apache.kafka.common.config.ConfigException: Missing required configuration "schema.registry.url" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:478)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:468)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:108)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:129)
at io.confluent.kafka.serializers.AbstractKafkaSchemaSerDeConfig.<init>(AbstractKafkaSchemaSerDeConfig.java:177)
at io.confluent.kafka.serializers.KafkaAvroSerializerConfig.<init>(KafkaAvroSerializerConfig.java:32)
at io.confluent.kafka.serializers.KafkaAvroSerializer.configure(KafkaAvroSerializer.java:50)
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:376)
I tried to put it in application.yml and in the KafkaTestingTools properties, but nothing changed; it looks like Spring looks for this property somewhere else.
Has anyone met this situation and knows the solution?
Thanks in advance.
The problem is here:
spring:
  kafka:
    schema-registry-url: http://localhost:8081
There is no such property managed by Spring Boot.
Moreover, schema-registry-url does not map to schema.registry.url.
You should change it to this:
spring:
  kafka:
    producer:
      properties:
        "schema.registry.url": http://localhost:8081
See docs for more info: https://docs.spring.io/spring-boot/docs/current/reference/html/features.html#features.messaging.kafka.additional-properties
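The same pattern applies on the consumer side if its deserializer also needs the registry; assuming the Confluent deserializer is configured there, the equivalent would be something like:
spring:
  kafka:
    consumer:
      properties:
        "schema.registry.url": http://localhost:8081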

Problem creating tests with Spring cloud streams kafka streams using embeddedKafka with MockSchemaRegistryClient

I'm trying to figure out how I can test my Spring Cloud Stream Kafka Streams application.
The application looks like this:
Stream 1: Topic1 > Topic2
Stream 2: Topic2 + Topic3 joined > Topic4
Stream 3: Topic4 > Topic5
I tried different approaches like the TestChannelBinder, but this approach only works with simple functions, not Kafka Streams and Avro.
I decided to use EmbeddedKafka with MockSchemaRegistryClient. I can produce to a topic and also consume from the same topic again (topic1), but I'm not able to consume from topic2.
In my test application.yaml I put the following configuration (I'm only testing the first stream for now; I want to extend it once this works):
spring.application.name: processingapp
spring.cloud:
  function.definition: stream1 # not now ;stream2;stream3
  stream:
    bindings:
      stream1-in-0:
        destination: topic1
      stream1-out-0:
        destination: topic2
    kafka:
      binder:
        min-partition-count: 1
        replication-factor: 1
        auto-create-topics: true
        auto-add-partitions: true
      bindings:
        default:
          consumer:
            autoRebalanceEnabled: true
            resetOffsets: true
            startOffset: earliest
        stream1-in-0:
          consumer:
            keySerde: io.confluent.kafka.streams.serdes.avro.PrimitiveAvroSerde
            valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
        stream1-out-0:
          producer:
            keySerde: io.confluent.kafka.streams.serdes.avro.PrimitiveAvroSerde
            valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
      streams:
        binder:
          configuration:
            schema.registry.url: mock://localtest
            specivic.avro.reader: true
My test looks like the following:
@RunWith(SpringRunner.class)
@SpringBootTest
public class Test {
private static final String INPUT_TOPIC = "topic1";
private static final String OUTPUT_TOPIC = "topic2";
@ClassRule
public static EmbeddedKafkaRule embeddedKafka = new EmbeddedKafkaRule(1, true, 1, INPUT_TOPIC, OUTPUT_TOPIC);
@BeforeClass
public static void setup() {
System.setProperty("spring.cloud.stream.kafka.binder.brokers", embeddedKafka.getEmbeddedKafka().getBrokersAsString());
}
@org.junit.Test
public void testSendReceive() throws IOException {
Map<String, Object> senderProps = KafkaTestUtils.producerProps(embeddedKafka.getEmbeddedKafka());
senderProps.put("key.serializer", LongSerializer.class);
senderProps.put("value.serializer", SpecificAvroSerializer.class);
senderProps.put("schema.registry.url", "mock://localtest");
AvroFileParser fileParser = new AvroFileParser();
DefaultKafkaProducerFactory<Long, Test1> pf = new DefaultKafkaProducerFactory<>(senderProps);
KafkaTemplate<Long, Test1> template = new KafkaTemplate<>(pf, true);
Test1 test1 = fileParser.parseTest1("src/test/resources/mocks/test1.json");
template.send(INPUT_TOPIC, 123456L, test1);
System.out.println("produced");
Map<String, Object> consumer1Props = KafkaTestUtils.consumerProps("testConsumer1", "false", embeddedKafka.getEmbeddedKafka());
consumer1Props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
consumer1Props.put("key.deserializer", LongDeserializer.class);
consumer1Props.put("value.deserializer", SpecificAvroDeserializer.class);
consumer1Props.put("schema.registry.url", "mock://localtest");
DefaultKafkaConsumerFactory<Long, Test1> cf = new DefaultKafkaConsumerFactory<>(consumer1Props);
Consumer<Long, Test1> consumer1 = cf.createConsumer();
consumer1.subscribe(Collections.singleton(INPUT_TOPIC));
ConsumerRecords<Long, Test1> records = consumer1.poll(Duration.ofSeconds(10));
consumer1.commitSync();
System.out.println("records count?");
System.out.println("" + records.count());
Test1 fetchedTest1;
fetchedTest1 = records.iterator().next().value();
assertThat(records.count()).isEqualTo(1);
System.out.println("found record");
System.out.println(fetchedTest1.toString());
Map<String, Object> consumer2Props = KafkaTestUtils.consumerProps("testConsumer2", "false", embeddedKafka.getEmbeddedKafka());
consumer2Props.put("key.deserializer", StringDeserializer.class);
consumer2Props.put("value.deserializer", TestAvroDeserializer.class);
consumer2Props.put("schema.registry.url", "mock://localtest");
DefaultKafkaConsumerFactory<String, Test2> consumer2Factory = new DefaultKafkaConsumerFactory<>(consumer2Props);
Consumer<String, Test2> consumer2 = consumer2Factory.createConsumer();
consumer2.subscribe(Collections.singleton(OUTPUT_TOPIC));
ConsumerRecords<String, Test2> records2 = consumer2.poll(Duration.ofSeconds(30));
consumer2.commitSync();
if (records2.iterator().hasNext()) {
System.out.println("has next");
} else {
System.out.println("has no next");
}
}
}
I receive the following exception when trying to consume and deserialize from topic2:
Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro unknown schema for id 0
Caused by: java.io.IOException: Cannot get schema from schema registry!
at io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient.getSchemaBySubjectAndIdFromRegistry(MockSchemaRegistryClient.java:193) ~[kafka-schema-registry-client-6.2.0.jar:na]
at io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient.getSchemaBySubjectAndId(MockSchemaRegistryClient.java:249) ~[kafka-schema-registry-client-6.2.0.jar:na]
at io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient.getSchemaById(MockSchemaRegistryClient.java:232) ~[kafka-schema-registry-client-6.2.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer$DeserializationContext.schemaFromRegistry(AbstractKafkaAvroDeserializer.java:307) ~[kafka-avro-serializer-6.2.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:107) ~[kafka-avro-serializer-6.2.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:86) ~[kafka-avro-serializer-6.2.0.jar:na]
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:55) ~[kafka-avro-serializer-6.2.0.jar:na]
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:60) ~[kafka-clients-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeKey(SourceNode.java:54) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:65) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordQueue.updateHead(RecordQueue.java:176) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:112) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:185) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:895) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.TaskManager.addRecordsToTasks(TaskManager.java:1008) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.pollPhase(StreamThread.java:812) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:625) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:564) ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:523) ~[kafka-streams-2.7.1.jar:na]
No message is consumed.
So I tried to override the SpecificAvroSerde, register the schemas directly, and use this deserializer.
public class TestAvroDeserializer<T extends org.apache.avro.specific.SpecificRecord>
extends SpecificAvroDeserializer<T> implements Deserializer<T> {
private final KafkaAvroDeserializer inner;
public TestAvroDeserializer() throws IOException, RestClientException {
MockSchemaRegistryClient mockedClient = new MockSchemaRegistryClient();
Schema.Parser parser = new Schema.Parser();
Schema test2Schema = parser.parse(new File("./src/main/resources/avro/test2.avsc"));
mockedClient.register("test2-value", test2Schema , 1, 0);
inner = new KafkaAvroDeserializer(mockedClient);
}
/**
* For testing purposes only.
*/
TestAvroDeserializer(final SchemaRegistryClient client) throws IOException, RestClientException {
MockSchemaRegistryClient mockedClient = new MockSchemaRegistryClient();
Schema.Parser parser = new Schema.Parser();
Schema test2Schema = parser.parse(new File("./src/main/resources/avro/test2.avsc"));
mockedClient.register("test2-value", test2Schema , 1, 0);
inner = new KafkaAvroDeserializer(mockedClient);
}
}
With this deserializer it doesn't work either. Does anyone have experience with how to do these tests with EmbeddedKafka and MockSchemaRegistry? Or is there another approach I should use?
I'd be very glad if someone can help. Thank you in advance.
I found an appropriate way of integration testing my topology.
I use the TopologyTestDriver from the kafka-streams-test-utils package.
Include this dependency in Maven:
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams-test-utils</artifactId>
<scope>test</scope>
</dependency>
For the application described in the question, setting up the TopologyTestDriver would look like the following. The code is written sequentially just to show how it works.
@Test
void test() {
keySerde.configure(Map.of(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas"), true);
valueSerdeTopic1.configure(Map.of(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas"), false);
valueSerdeTopic2.configure(Map.of(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas"), false);
final StreamsBuilder builder = new StreamsBuilder();
Configuration config = new Configuration(); // class where you declare your spring cloud stream functions
KStream<String, Topic1> input = builder.stream("topic1", Consumed.with(keySerde, valueSerdeTopic1));
KStream<String, Topic2> output = config.stream1().apply(input);
output.to("topic2");
Topology topology = builder.build();
Properties streamsConfig = new Properties();
streamsConfig.putAll(Map.of(
org.apache.kafka.streams.StreamsConfig.APPLICATION_ID_CONFIG, "toplogy-test-driver",
org.apache.kafka.streams.StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "ignored",
KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas",
org.apache.kafka.streams.StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, PrimitiveAvroSerde.class.getName(),
org.apache.kafka.streams.StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class.getName()
));
TopologyTestDriver testDriver = new TopologyTestDriver(topology, streamsConfig);
TestInputTopic<String, Topic1> inputTopic = testDriver.createInputTopic("topic1", keySerde.serializer(), valueSerdeTopic1.serializer());
TestOutputTopic<String, Topic2> outputTopic = testDriver.createOutputTopic("topic2", keySerde.deserializer(), valueSerdeTopic2.deserializer());
inputTopic.pipeInput("key", topic1AvroModel); // Write to the input topic which applies the topology processor of your spring-cloud-stream app
KeyValue<String, Topic2> outputRecord = outputTopic.readKeyValue(); // Read from the output topic
}
If you write more tests, I recommend abstracting the setup code so you don't repeat yourself for each test.
I highly recommend this example from the spring-cloud-streams-samples repository; it led me to the solution of using TopologyTestDriver.
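As a usage note, continuing the test above, the record read from the output topic can then be asserted and the driver closed (a sketch; the expected values are placeholders):
// Continuing the test above: assert on the transformed record and release the driver's resources
assertThat(outputRecord.key).isEqualTo("key");
assertThat(outputRecord.value).isNotNull();
testDriver.close(); // cleans up state stores and temp directories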

Kafka consumer unit test with Avro Schema registry failing

I'm writing a consumer which listens to a Kafka topic and consumes messages whenever they are available. I've tested the logic/code by running Kafka locally and it's working fine.
While writing the unit/component test cases, it fails with an Avro schema registry URL error. I've tried different options available on the internet but could not find anything working. I am not sure if my approach is even correct. Please help.
Listener Class
#KafkaListener(topics = "positionmgmt.v1", containerFactory = "genericKafkaListenerFactory")
public void receive(ConsumerRecord<String, GenericRecord> consumerRecord) {
try {
GenericRecord generic = consumerRecord.value();
Object obj = generic.get("metadata");
ObjectMapper mapper = new ObjectMapper();
Header headerMetaData = mapper.readValue(obj.toString(), Header.class);
System.out.println("Received payload : " + consumerRecord.value());
//Call backend with details in GenericRecord
}catch (Exception e){
System.out.println("Exception while reading message from Kafka " + e );
}
Kafka config
@Bean
public ConcurrentKafkaListenerContainerFactory<String, GenericRecord> genericKafkaListenerFactory() {
ConcurrentKafkaListenerContainerFactory<String, GenericRecord> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(genericConsumerFactory());
return factory;
}
public ConsumerFactory<String, GenericRecord> genericConsumerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
config.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
config.put(KafkaAvroDeserializerConfig.SCHEMA_REGISTRY_URL_CONFIG,"http://localhost:8081");
return new DefaultKafkaConsumerFactory<>(config);
}
Avro Schema
{
"type":"record",
"name":"KafkaEvent",
"namespace":"com.ms.model.avro",
"fields":[
{
"name":"metadata",
"type":{
"name":"metadata",
"type":"record",
"fields":[
{
"name":"correlationid",
"type":"string",
"doc":"this is corrleation id for transaction"
},
{
"name":"subject",
"type":"string",
"doc":"this is subject for transaction"
},
{
"name":"version",
"type":"string",
"doc":"this is version for transaction"
}
]
}
},
{
"name":"name",
"type":"string"
},
{
"name":"dept",
"type":"string"
},
{
"name":"empnumber",
"type":"string"
}
]
}
This is my test code which I tried...
@ComponentTest
@RunWith(SpringRunner.class)
@EmbeddedKafka(partitions = 1, topics = { "positionmgmt.v1" })
@SpringBootTest(classes={Application.class})
@DirtiesContext
public class ConsumeKafkaMessageTest {
private static final String TEST_TOPIC = "positionmgmt.v1";
@Autowired(required=true)
EmbeddedKafkaBroker embeddedKafkaBroker;
private Schema schema;
private SchemaRegistryClient schemaRegistry;
private KafkaAvroSerializer avroSerializer;
private KafkaAvroDeserializer avroDeserializer;
private MockSchemaRegistryClient mockSchemaRegistryClient = new MockSchemaRegistryClient();
private String registryUrl = "unused";
private String avroSchema = "..."; // string representation of the Avro schema
@BeforeEach
public void setUp() throws Exception {
Schema.Parser parser = new Schema.Parser();
schema = parser.parse(avroSchema);
mockSchemaRegistryClient.register("Vendors-value", schema);
}
@Test
public void consumeKafkaMessage_receive_sucess() {
Schema metadataSchema = schema.getField("metadata").schema();
GenericRecord metadata = new GenericData.Record(metadataSchema);
metadata.put("version", "1.0");
metadata.put("correlationid", "correlationid");
metadata.put("subject", "metadata");
GenericRecord record = new GenericData.Record(schema);
record.put("metadata", metadata);
record.put("name", "ABC");
record.put("dept", "XYZ");
Consumer<String, GenericRecord> consumer = configureConsumer();
Producer<String, GenericRecord> producer = configureProducer();
ProducerRecord<String, GenericRecord> prodRecord = new ProducerRecord<String, GenericRecord>(TEST_TOPIC, record);
producer.send(prodRecord);
ConsumerRecord<String, GenericRecord> singleRecord = KafkaTestUtils.getSingleRecord(consumer, TEST_TOPIC);
assertNotNull(singleRecord.value());
consumer.close();
producer.close();
}
private Consumer<String, GenericRecord> configureConsumer() {
Map<String, Object> consumerProps = KafkaTestUtils.consumerProps("groupid", "true", embeddedKafkaBroker);
consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
Consumer<String, GenericRecord> consumer = new DefaultKafkaConsumerFactory<String, GenericRecord>(consumerProps).createConsumer();
consumer.subscribe(Collections.singleton(TEST_TOPIC));
return consumer;
}
private Producer<String, GenericRecord> configureProducer() {
Map<String, Object> producerProps = new HashMap<>(KafkaTestUtils.producerProps(embeddedKafkaBroker));
producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class.getName());
producerProps.put(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, mockSchemaRegistryClient);
producerProps.put(KafkaAvroSerializerConfig.AUTO_REGISTER_SCHEMAS, "false");
return new DefaultKafkaProducerFactory<String, GenericRecord>(producerProps).createProducer();
}
}
Error
component.com.ms.listener.ConsumeKafkaMessageTest > consumeKafkaMessage_receive_sucess() FAILED
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:457)
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:289)
at org.springframework.kafka.core.DefaultKafkaProducerFactory.createKafkaProducer(DefaultKafkaProducerFactory.java:318)
at org.springframework.kafka.core.DefaultKafkaProducerFactory.createProducer(DefaultKafkaProducerFactory.java:305)
at component.com.ms.listener.ConsumeKafkaMessageTest.configureProducer(ConsumeKafkaMessageTest.java:125)
at component.com.ms.listener.ConsumeKafkaMessageTest.consumeKafkaMessage_receive_sucess(ConsumeKafkaMessageTest.java:97)
Caused by:
io.confluent.common.config.ConfigException: Invalid value io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient#20751870 for configuration schema.registry.url: Expected a comma separated list.
at io.confluent.common.config.ConfigDef.parseType(ConfigDef.java:345)
at io.confluent.common.config.ConfigDef.parse(ConfigDef.java:249)
at io.confluent.common.config.AbstractConfig.<init>(AbstractConfig.java:78)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig.<init>(AbstractKafkaAvroSerDeConfig.java:105)
at io.confluent.kafka.serializers.KafkaAvroSerializerConfig.<init>(KafkaAvroSerializerConfig.java:32)
at io.confluent.kafka.serializers.KafkaAvroSerializer.configure(KafkaAvroSerializer.java:48)
at org.apache.kafka.common.serialization.ExtendedSerializer$Wrapper.configure(ExtendedSerializer.java:60)
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:372)
... 5 more
I investigated it a bit and I found out that the problem is in the CachedSchemaRegistryClient that is used by the KafkaAvroSerializer/Deserializer. It is used to fetch the schema definitions from the Confluent Schema Registry.
You already have your schema definition locally, so you don't need to go to the Schema Registry for it (at least in your tests).
I had a similar problem and I solved it by creating a custom KafkaAvroSerializer/KafkaAvroDeserializer.
This is a sample of KafkaAvroSerializer. It is rather simple. You just need to extend the provided KafkaAvroSerializer and tell it to use MockSchemaRegistryClient.
public class CustomKafkaAvroSerializer extends KafkaAvroSerializer {
public CustomKafkaAvroSerializer() {
super();
super.schemaRegistry = new MockSchemaRegistryClient();
}
public CustomKafkaAvroSerializer(SchemaRegistryClient client) {
super(new MockSchemaRegistryClient());
}
public CustomKafkaAvroSerializer(SchemaRegistryClient client, Map<String, ?> props) {
super(new MockSchemaRegistryClient(), props);
}
}
This is a sample of KafkaAvroDeserializer. When the deserialize method is called you need to tell it which schema to use.
public class CustomKafkaAvroDeserializer extends KafkaAvroDeserializer {
@Override
public Object deserialize(String topic, byte[] bytes) {
this.schemaRegistry = getMockClient(KafkaEvent.SCHEMA$);
return super.deserialize(topic, bytes);
}
private static SchemaRegistryClient getMockClient(final Schema schema$) {
return new MockSchemaRegistryClient() {
@Override
public synchronized Schema getById(int id) {
return schema$;
}
};
}
}
The last step is to tell Spring to use the created Serializer/Deserializer:
spring.kafka.producer.properties.schema.registry.url= not-used
spring.kafka.producer.value-serializer = CustomKafkaAvroSerializer
spring.kafka.producer.key-serializer = org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.group-id = showcase-producer-id
spring.kafka.consumer.properties.schema.registry.url= not-used
spring.kafka.consumer.value-deserializer = CustomKafkaAvroDeserializer
spring.kafka.consumer.key-deserializer = org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.group-id = showcase-consumer-id
spring.kafka.auto.offset.reset = earliest
spring.kafka.producer.auto.register.schemas= true
spring.kafka.properties.specific.avro.reader= true
I wrote a short blog post about that:
https://medium.com/@igorvlahek1/no-need-for-schema-registry-in-your-spring-kafka-tests-a5b81468a0e1?source=friends_link&sk=e55f73b86504e9f577e259181c8d0e23
Link to the working sample project: https://github.com/ivlahek/kafka-avro-without-registry
The answer from @ivlahek is working, but if you look at this example 3 years later you might want to make a slight modification to CustomKafkaAvroDeserializer:
private static SchemaRegistryClient getMockClient(final Schema schema) {
return new MockSchemaRegistryClient() {
@Override
public ParsedSchema getSchemaBySubjectAndId(String subject, int id)
throws IOException, RestClientException {
return new AvroSchema(schema);
}
};
}
As the error says, you need to provide a string for the registry URL in the producer config, not an object.
Since you're using the Mock class, that string could be anything...
However, you'll need to construct the serializers given the registry instance:
Serializer serializer = new KafkaAvroSerializer(mockSchemaRegistry);
// make config map with ("schema.registry.url", "unused")
serializer.configure(config, false);
Otherwise, it will try to create a non-mocked client
And put that into the properties
producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, serializer);
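With spring-kafka's DefaultKafkaProducerFactory you can alternatively hand the pre-configured serializer instance to the factory constructor instead of the properties map (a sketch, reusing the producerProps and serializer variables from above):
// Key serializer passed as an instance too; instance serializers take precedence over class configs
DefaultKafkaProducerFactory<String, GenericRecord> pf =
        new DefaultKafkaProducerFactory<>(producerProps, new StringSerializer(), serializer);
Producer<String, GenericRecord> producer = pf.createProducer();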
If your @KafkaListener is in the test class then you can read it with the StringDeserializer and then convert it to the desired class manually:
@Autowired
private MyKafkaAvroDeserializer myKafkaAvroDeserializer;
@KafkaListener(topics = "test")
public void inputData(ConsumerRecord<?, ?> consumerRecord) {
log.info("received payload='{}'", consumerRecord.toString(),consumerRecord.value());
GenericRecord genericRecord = (GenericRecord)myKafkaAvroDeserializer.deserialize("test",consumerRecord.value().toString().getBytes(StandardCharsets.UTF_8));
Myclass myclass = (Myclass) SpecificData.get().deepCopy(Myclass.SCHEMA$, genericRecord);
}
@Component
public class MyKafkaAvroDeserializer extends KafkaAvroDeserializer {
@Override
public Object deserialize(String topic, byte[] bytes) {
this.schemaRegistry = getMockClient(Myclass.SCHEMA$);
return super.deserialize(topic, bytes);
}
private static SchemaRegistryClient getMockClient(final Schema schema$) {
return new MockSchemaRegistryClient() {
@Override
public synchronized org.apache.avro.Schema getById(int id) {
return schema$;
}
};
}
}
Remember to add the schema registry and key/value deserializers in application.yml even though they won't be used:
consumer:
  key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
  value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
  properties:
    schema.registry.url: http://localhost:8080

Spring Cloud Stream Multi Topic Transaction Management

I'm trying to create a PoC application in Java to figure out how to do transaction management in Spring Cloud Stream when using Kafka for message publishing. The use case I'm trying to simulate is a processor that receives a message. It then does some processing and generates two new messages destined for two separate topics. I want to be able to handle publishing both messages as a single transaction. So, if publishing the second message fails I want to roll back (not commit) the first message. Does Spring Cloud Stream support such a use case?
I've set the #Transactional annotation and I can see a global transaction starting before the message is delivered to the consumer. However, when I try to publish a message via the MessageChannel.send() method I can see that a new local transaction is started and completed in the KafkaProducerMessageHandler class' handleRequestMessage() method. Which means that the sending of the message does not participate in the global transaction. So, if there's an exception thrown after the publishing of the first message, the message will not be rolled back. The global transaction gets rolled back but that doesn't do anything really since the first message was already committed.
spring:
  cloud:
    stream:
      kafka:
        binder:
          brokers: localhost:9092
          transaction:
            transaction-id-prefix: txn.
            producer: # these apply to all producers that participate in the transaction
              partition-key-extractor-name: partitionKeyExtractorStrategy
              partition-selector-name: partitionSelectorStrategy
              partition-count: 3
              configuration:
                acks: all
                enable:
                  idempotence: true
                retries: 10
        bindings:
          input-customer-data-change-topic:
            consumer:
              configuration:
                isolation:
                  level: read_committed
              enable-dlq: true
      bindings:
        input-customer-data-change-topic:
          content-type: application/json
          destination: com.fis.customer
          group: com.fis.ec
          consumer:
            partitioned: true
            max-attempts: 1
        output-name-change-topic:
          content-type: application/json
          destination: com.fis.customer.name
        output-email-change-topic:
          content-type: application/json
          destination: com.fis.customer.email
@SpringBootApplication
@EnableBinding(CustomerDataChangeStreams.class)
public class KafkaCloudStreamCustomerDemoApplication
{
public static void main(final String[] args)
{
SpringApplication.run(KafkaCloudStreamCustomerDemoApplication.class, args);
}
}
public interface CustomerDataChangeStreams
{
@Input("input-customer-data-change-topic")
SubscribableChannel inputCustomerDataChange();
@Output("output-email-change-topic")
MessageChannel outputEmailDataChange();
@Output("output-name-change-topic")
MessageChannel outputNameDataChange();
}
@Component
public class CustomerDataChangeListener
{
@Autowired
private CustomerDataChangeProcessor mService;
@StreamListener("input-customer-data-change-topic")
public void handleCustomerDataChangeMessages(
@Payload final ImmutableCustomerDetails customerDetails)
{
mService.processMessage(customerDetails);
}
}
@Component
public class CustomerDataChangeProcessor
{
private final CustomerDataChangeStreams mStreams;
@Value("${spring.cloud.stream.bindings.output-email-change-topic.destination}")
private String mEmailChangeTopic;
@Value("${spring.cloud.stream.bindings.output-name-change-topic.destination}")
private String mNameChangeTopic;
public CustomerDataChangeProcessor(final CustomerDataChangeStreams streams)
{
mStreams = streams;
}
public void processMessage(final CustomerDetails customerDetails)
{
try
{
sendNameMessage(customerDetails);
sendEmailMessage(customerDetails);
}
catch (final JSONException ex)
{
LOGGER.error("Failed to send messages.", ex);
}
}
public void sendNameMessage(final CustomerDetails customerDetails)
throws JSONException
{
final JSONObject nameChangeDetails = new JSONObject();
nameChangeDetails.put(KafkaConst.BANK_ID_KEY, customerDetails.bankId());
nameChangeDetails.put(KafkaConst.CUSTOMER_ID_KEY, customerDetails.customerId());
nameChangeDetails.put(KafkaConst.FIRST_NAME_KEY, customerDetails.firstName());
nameChangeDetails.put(KafkaConst.LAST_NAME_KEY, customerDetails.lastName());
final String action = customerDetails.action();
nameChangeDetails.put(KafkaConst.ACTION_KEY, action);
final MessageChannel nameChangeMessageChannel = mStreams.outputNameDataChange();
nameChangeMessageChannel.send(MessageBuilder.withPayload(nameChangeDetails.toString())
.setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
.setHeader(KafkaHeaders.TOPIC, mNameChangeTopic).build());
if ("fail_name_illegal".equalsIgnoreCase(action))
{
throw new IllegalArgumentException("Customer name failure!");
}
}
public void sendEmailMessage(final CustomerDetails customerDetails) throws JSONException
{
final JSONObject emailChangeDetails = new JSONObject();
emailChangeDetails.put(KafkaConst.BANK_ID_KEY, customerDetails.bankId());
emailChangeDetails.put(KafkaConst.CUSTOMER_ID_KEY, customerDetails.customerId());
emailChangeDetails.put(KafkaConst.EMAIL_ADDRESS_KEY, customerDetails.email());
final String action = customerDetails.action();
emailChangeDetails.put(KafkaConst.ACTION_KEY, action);
final MessageChannel emailChangeMessageChannel = mStreams.outputEmailDataChange();
emailChangeMessageChannel.send(MessageBuilder.withPayload(emailChangeDetails.toString())
.setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
.setHeader(KafkaHeaders.TOPIC, mEmailChangeTopic).build());
if ("fail_email_illegal".equalsIgnoreCase(action))
{
throw new IllegalArgumentException("E-mail address failure!");
}
}
}
EDIT
We are getting closer. The local transaction does not get created anymore. However, the global transaction still gets committed even if there was an exception. From what I can tell, the exception does not propagate to the TransactionTemplate.execute() method. Therefore, the transaction gets committed. It seems that the MessageProducerSupport class "swallows" the exception in the catch clause of the sendMessage() method. If there is an error channel defined then a message is published to it and thus the exception is not rethrown. I tried turning the error channel off (spring.cloud.stream.kafka.binder.transaction.producer.error-channel-enabled = false) but that doesn't turn it off. So, just for a test, I simply set the error channel to null in the debugger to force the exception to be rethrown. That seems to do it. However, the original message keeps getting redelivered to the initial consumer even though I have max-attempts set to 1 for that consumer.
See the documentation.
spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix
Enables transactions in the binder. See transaction.id in the Kafka documentation and Transactions in the spring-kafka documentation. When transactions are enabled, individual producer properties are ignored and all producers use the spring.cloud.stream.kafka.binder.transaction.producer.* properties.
Default null (no transactions)
spring.cloud.stream.kafka.binder.transaction.producer.*
Global producer properties for producers in a transactional binder. See spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix and Kafka Producer Properties and the general producer properties supported by all binders.
Default: See individual producer properties.
You must configure the shared global producer.
Don't add @Transactional - the container will start the transaction and send the offset to the transaction before committing the transaction.
If the listener throws an exception, the transaction is rolled back and the DefaultAfterRollbackPostProcessor will re-seek the topics/partitions so that the record will be redelivered.
EDIT
There is a bug in the configuration of the binder's transaction manager that causes a new local transaction to be started by the output binding.
To work around it, reconfigure the TM with the following container customizer bean...
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer() {
return (container, dest, group) -> {
KafkaTransactionManager<?, ?> tm = (KafkaTransactionManager<?, ?>) container.getContainerProperties()
.getTransactionManager();
tm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
};
}
EDIT2
You can't use the binder's DLQ support because, from the container's perspective, the delivery was successful. We need to propagate the exception to the container to force a rollback. So, you need to move the dead-lettering to the AfterRollbackProcessor instead. Here is my complete test class:
@SpringBootApplication
@EnableBinding(Processor.class)
public class So57379575Application {
public static void main(String[] args) {
SpringApplication.run(So57379575Application.class, args);
}
@Autowired
private MessageChannel output;
@StreamListener(Processor.INPUT)
public void listen(String in) {
System.out.println("in:" + in);
this.output.send(new GenericMessage<>(in.toUpperCase()));
if (in.equals("two")) {
throw new RuntimeException("fail");
}
}
#KafkaListener(id = "so57379575", topics = "so57379575out")
public void listen2(String in) {
System.out.println("out:" + in);
}
#KafkaListener(id = "so57379575DLT", topics = "so57379575dlt")
public void listen3(String in) {
System.out.println("dlt:" + in);
}
@Bean
public ApplicationRunner runner(KafkaTemplate<byte[], byte[]> template) {
return args -> {
template.send("so57379575in", "one".getBytes());
template.send("so57379575in", "two".getBytes());
};
}
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer(
KafkaTemplate<Object, Object> template) {
return (container, dest, group) -> {
// enable transaction synchronization
KafkaTransactionManager<?, ?> tm = (KafkaTransactionManager<?, ?>) container.getContainerProperties()
.getTransactionManager();
tm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
// container dead-lettering
DefaultAfterRollbackProcessor<? super byte[], ? super byte[]> afterRollbackProcessor =
new DefaultAfterRollbackProcessor<>(new DeadLetterPublishingRecoverer(template,
(ex, tp) -> new TopicPartition("so57379575dlt", -1)), 0);
container.setAfterRollbackProcessor(afterRollbackProcessor);
};
}
}
and
spring:
  kafka:
    bootstrap-servers:
      - 10.0.0.8:9092
      - 10.0.0.8:9093
      - 10.0.0.8:9094
    consumer:
      auto-offset-reset: earliest
      enable-auto-commit: false
      properties:
        isolation.level: read_committed
  cloud:
    stream:
      bindings:
        input:
          destination: so57379575in
          group: so57379575in
          consumer:
            max-attempts: 1
        output:
          destination: so57379575out
      kafka:
        binder:
          transaction:
            transaction-id-prefix: so57379575tx.
            producer:
              configuration:
                acks: all
                retries: 10
#logging:
#  level:
#    org.springframework.kafka: trace
#    org.springframework.transaction: trace
and
in:two
2019-08-07 12:43:33.457 ERROR 36532 --- [container-0-C-1] o.s.integration.handler.LoggingHandler : org.springframework.messaging.MessagingException: Exception thrown while
...
Caused by: java.lang.RuntimeException: fail
...
in:one
dlt:two
out:ONE

How to skip corrupt (non-serializable) messages in Spring Kafka Consumer?

This question is for Spring Kafka, related to Apache Kafka with High Level Consumer: Skip corrupted messages
Is there a way to configure Spring Kafka consumer to skip a record that cannot be read/processed (is corrupt)?
I am seeing a situation where the consumer gets stuck on the same record if it cannot be deserialized. This is the error the consumer throws.
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not construct instance of java.time.LocalDate: no long/Long-argument constructor/factory method to deserialize from Number value
The consumer polls the topic and just keeps printing the same error in a loop till the program is killed.
In a @KafkaListener that has the following consumer factory configurations,
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
You need the ErrorHandlingDeserializer: https://docs.spring.io/spring-kafka/docs/2.2.0.RELEASE/reference/html/_reference.html#error-handling-deserializer
If you can't move to that 2.2 version, consider implementing your own and returning null for those records which can't be deserialized properly.
The source code is here: https://github.com/spring-projects/spring-kafka/blob/master/spring-kafka/src/main/java/org/springframework/kafka/support/serializer/ErrorHandlingDeserializer2.java
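A rough sketch of wiring ErrorHandlingDeserializer2 (spring-kafka 2.2+) around the JSON deserializer, using the delegate property it exposes; adjust the types and trusted package to your own model:
// ErrorHandlingDeserializer2 returns null (plus an exception header) instead of failing the poll loop
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
// The delegate that does the real deserialization work
props.put(ErrorHandlingDeserializer2.VALUE_DESERIALIZER_CLASS, JsonDeserializer.class);
props.put(JsonDeserializer.TRUSTED_PACKAGES, "com.example"); // illustrative package name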
In case you are using an older version of spring-kafka, in a @KafkaListener set the following consumer factory configurations.
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, CustomDeserializer.class);
Here is the code for CustomDeserializer:
import java.util.Map;
import org.apache.kafka.common.serialization.Deserializer;
import com.fasterxml.jackson.databind.ObjectMapper;
public class CustomDeserializer implements Deserializer<Object>
{
@Override
public void configure( Map<String, ?> configs, boolean isKey )
{
}
@Override
public Object deserialize( String topic, byte[] data )
{
ObjectMapper mapper = new ObjectMapper();
Object object = null;
try
{
object = mapper.readValue(data, Object.class);
}
catch ( Exception exception )
{
System.out.println("Error in deserializing bytes " + exception);
}
return object;
}
@Override
public void close()
{
}
}
Since I want my code to be generic enough to read any kind of JSON, I convert it to Object.class with object = mapper.readValue(data, Object.class). And as we are catching the exception here, the record won't be retried once read.
