How to map my stream value to my object class - java

I have a Song class (song_id, song_name, song_duration) and a Flux class:
package org.example.demo.model;
public class Flux {
    private int user_id;
    private int song_id;
    private float listening_duration;
    public Flux(int user_id, int song_id, float listening_duration) {
        this.user_id = user_id;
        this.song_id = song_id;
        this.listening_duration = listening_duration;
    }
}
I have a first program that sends events to a Kafka topic (serialized as Avro):
props.put("key.serializer", StringSerializer.class.getName());
props.put("value.serializer", KafkaAvroSerializer.class.getName());
KafkaProducer<String, Flux> kafkaProducerFlux = new KafkaProducer<String, Flux>(props);
Within a loop:
Flux flux = Flux.newBuilder()
        // ... set user_id, song_id and listening_duration ...
        .build();
kafkaProducerFlux.send(new ProducerRecord<String, Flux>(topic_events, flux));
Now I would like to read my object back in a stream as:
KStream<String, Flux> source = builder.stream("music_flux");
instead of:
KStream<String, GenericRecord> source = builder.stream("music_flux");
so that I can join on user_id with input from other streams.
Thanks for your help.

You have to configure the key and value serdes in the Kafka Streams properties. Since your Kafka producer registers the Flux schema in the Schema Registry, the Kafka Streams application will read the schema from there. You need to set SCHEMA_REGISTRY_URL_CONFIG and VALUE_SERDE_CLASS_CONFIG as shown below:
KStreamBuilder builder = new KStreamBuilder();
Properties props= new Properties();
// Set a few key parameters
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kafka-app-1");
props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "");
StreamsConfig config = new StreamsConfig(props);
KStream<String, Flux> source = builder.stream("music_flux");


Problem creating tests with Spring cloud streams kafka streams using embeddedKafka with MockSchemaRegistryClient

I'm trying to figure out how I can test my Spring Cloud Stream Kafka Streams application.
The application looks like this:
Stream 1: Topic1 > Topic2
Stream 2: Topic2 + Topic3 joined > Topic4
Stream 3: Topic4 > Topic5
I tried different approaches such as the TestChannelBinder, but that approach only works with simple functions, not with Kafka Streams and Avro.
I decided to use EmbeddedKafka with MockSchemaRegistryClient. I can produce to a topic and consume from the same topic again (topic1), but I'm not able to consume from topic2.
In my test application.yaml I put the following configuration (I'm only testing the first stream for now; I want to extend it once this works):
processingapp
function.definition: stream1 # not now ;stream2;stream3
destination: topic1
destination: topic2
min-partition-count: 1
replication-factor: 1
auto-create-topics: true
auto-add-partitions: true
autoRebalanceEnabled: true
resetOffsets: true
startOffset: earliest
keySerde: io.confluent.kafka.streams.serdes.avro.PrimitiveAvroSerde
valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
keySerde: io.confluent.kafka.streams.serdes.avro.PrimitiveAvroSerde
valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
schema.registry.url: mock://localtest
specivic.avro.reader: true
My test looks like the following:
public class Test {
    private static final String INPUT_TOPIC = "topic1";
    private static final String OUTPUT_TOPIC = "topic2";
    public static EmbeddedKafkaRule embeddedKafka = new EmbeddedKafkaRule(1, true, 1, INPUT_TOPIC, OUTPUT_TOPIC);
    public static void setup() {
        System.setProperty("", embeddedKafka.getEmbeddedKafka().getBrokersAsString());
    }
    public void testSendReceive() throws IOException {
        Map<String, Object> senderProps = KafkaTestUtils.producerProps(embeddedKafka.getEmbeddedKafka());
        senderProps.put("key.serializer", LongSerializer.class);
        senderProps.put("value.serializer", SpecificAvroSerializer.class);
        senderProps.put("schema.registry.url", "mock://localtest");
        AvroFileParser fileParser = new AvroFileParser();
        DefaultKafkaProducerFactory<Long, Test1> pf = new DefaultKafkaProducerFactory<>(senderProps);
        KafkaTemplate<Long, Test1> template = new KafkaTemplate<>(pf, true);
        Test1 test1 = fileParser.parseTest1("src/test/resources/mocks/test1.json");
        template.send(INPUT_TOPIC, 123456L, test1);
        Map<String, Object> consumer1Props = KafkaTestUtils.consumerProps("testConsumer1", "false", embeddedKafka.getEmbeddedKafka());
        consumer1Props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        consumer1Props.put("key.deserializer", LongDeserializer.class);
        consumer1Props.put("value.deserializer", SpecificAvroDeserializer.class);
        consumer1Props.put("schema.registry.url", "mock://localtest");
        DefaultKafkaConsumerFactory<Long, Test1> cf = new DefaultKafkaConsumerFactory<>(consumer1Props);
        Consumer<Long, Test1> consumer1 = cf.createConsumer();
        ConsumerRecords<Long, Test1> records = consumer1.poll(Duration.ofSeconds(10));
        System.out.println("records count?");
        System.out.println("" + records.count());
        Test1 fetchedTest1;
        fetchedTest1 = records.iterator().next().value();
        System.out.println("found record");
        Map<String, Object> consumer2Props = KafkaTestUtils.consumerProps("testConsumer2", "false", embeddedKafka.getEmbeddedKafka());
        consumer2Props.put("key.deserializer", StringDeserializer.class);
        consumer2Props.put("value.deserializer", TestAvroDeserializer.class);
        consumer2Props.put("schema.registry.url", "mock://localtest");
        DefaultKafkaConsumerFactory<String, Test2> consumer2Factory = new DefaultKafkaConsumerFactory<>(consumer2Props);
        Consumer<String, Test2> consumer2 = consumer2Factory.createConsumer();
        ConsumerRecords<String, Test2> records2 = consumer2.poll(Duration.ofSeconds(30));
        if (records2.iterator().hasNext()) {
            System.out.println("has next");
        } else {
            System.out.println("has no next");
        }
    }
}
I receive the following exception when trying to consume and deserialize from topic2:
Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro unknown schema for id 0
Caused by: Cannot get schema from schema registry!
at io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient.getSchemaBySubjectAndIdFromRegistry( ~[kafka-schema-registry-client-6.2.0.jar:na]
at io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient.getSchemaBySubjectAndId( ~[kafka-schema-registry-client-6.2.0.jar:na]
at io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient.getSchemaById( ~[kafka-schema-registry-client-6.2.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer$DeserializationContext.schemaFromRegistry( ~[kafka-avro-serializer-6.2.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize( ~[kafka-avro-serializer-6.2.0.jar:na]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize( ~[kafka-avro-serializer-6.2.0.jar:na]
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize( ~[kafka-avro-serializer-6.2.0.jar:na]
at org.apache.kafka.common.serialization.Deserializer.deserialize( ~[kafka-clients-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeKey( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordQueue.updateHead( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.TaskManager.addRecordsToTasks( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.pollPhase( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce( ~[kafka-streams-2.7.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop( ~[kafka-streams-2.7.1.jar:na]
at ~[kafka-streams-2.7.1.jar:na]
No message is consumed.
So I tried to override the SpecificAvroSerde, register the schemas directly, and use this deserializer:
public class TestAvroDeserializer<T extends org.apache.avro.specific.SpecificRecord>
        extends SpecificAvroDeserializer<T> implements Deserializer<T> {
    private final KafkaAvroDeserializer inner;
    public TestAvroDeserializer() throws IOException, RestClientException {
        MockSchemaRegistryClient mockedClient = new MockSchemaRegistryClient();
        Schema.Parser parser = new Schema.Parser();
        Schema test2Schema = parser.parse(new File("./src/main/resources/avro/test2.avsc"));
        mockedClient.register("test2-value", test2Schema, 1, 0);
        inner = new KafkaAvroDeserializer(mockedClient);
    }
    /**
     * For testing purposes only.
     */
    TestAvroDeserializer(final SchemaRegistryClient client) throws IOException, RestClientException {
        MockSchemaRegistryClient mockedClient = new MockSchemaRegistryClient();
        Schema.Parser parser = new Schema.Parser();
        Schema test2Schema = parser.parse(new File("./src/main/resources/avro/test2.avsc"));
        mockedClient.register("test2-value", test2Schema, 1, 0);
        inner = new KafkaAvroDeserializer(mockedClient);
    }
}
It doesn't work with this deserializer either. Does anyone have experience with running such tests using EmbeddedKafka and MockSchemaRegistryClient? Or is there another approach I should use?
I'd be very glad if someone could help. Thank you in advance.
I found an appropriate way of integration testing my topology.
I use the TopologyTestDriver from the kafka-streams-test-utils package.
Add the kafka-streams-test-utils dependency (groupId org.apache.kafka) to your Maven build.
For the application described in the question, setting up the TopologyTestDriver looks like the following. The code is written sequentially just to show how it works.
void test() {
    // Assumed declarations (missing above); types inferred from the default serdes configured below.
    Serde<String> keySerde = new PrimitiveAvroSerde<>();
    Serde<Topic1> valueSerdeTopic1 = new SpecificAvroSerde<>();
    Serde<Topic2> valueSerdeTopic2 = new SpecificAvroSerde<>();
    keySerde.configure(Map.of(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas"), true);
    valueSerdeTopic1.configure(Map.of(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas"), false);
    valueSerdeTopic2.configure(Map.of(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas"), false);
    final StreamsBuilder builder = new StreamsBuilder();
    Configuration config = new Configuration(); // class where you declare your spring cloud stream functions
    KStream<String, Topic1> input = builder.stream("topic1", Consumed.with(keySerde, valueSerdeTopic1));
    KStream<String, Topic2> output = config.stream1().apply(input);
    output.to("topic2");
    Topology topology = builder.build();
    Properties streamsConfig = new Properties();
    streamsConfig.putAll(Map.of(
            org.apache.kafka.streams.StreamsConfig.APPLICATION_ID_CONFIG, "topology-test-driver",
            org.apache.kafka.streams.StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "ignored",
            KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "mock://schemas",
            org.apache.kafka.streams.StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, PrimitiveAvroSerde.class.getName(),
            org.apache.kafka.streams.StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class.getName()
    ));
    TopologyTestDriver testDriver = new TopologyTestDriver(topology, streamsConfig);
    TestInputTopic<String, Topic1> inputTopic = testDriver.createInputTopic("topic1", keySerde.serializer(), valueSerdeTopic1.serializer());
    TestOutputTopic<String, Topic2> outputTopic = testDriver.createOutputTopic("topic2", keySerde.deserializer(), valueSerdeTopic2.deserializer());
    inputTopic.pipeInput("key", topic1AvroModel); // Write to the input topic, which applies the topology processor of your spring-cloud-stream app
    KeyValue<String, Topic2> outputRecord = outputTopic.readKeyValue(); // Read from the output topic
}
If you write more tests, I recommend abstracting the setup code so that you don't repeat yourself in each test.
I highly recommend this example from the spring-cloud-streams-samples repository; it led me to the solution of using TopologyTestDriver.

How to get Kafka Producer messages count

I use the following code to create a producer that produces around 2000 messages.
public class ProducerDemoWithCallback {
    public static void main(String[] args) {
        final Logger logger = LoggerFactory.getLogger(ProducerDemoWithCallback.class);
        String bootstrapServers = "localhost:9092";
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // create the producer
        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 2000; i++) {
            // create a producer record
            ProducerRecord<String, String> record =
                    new ProducerRecord<String, String>("TwitterProducer", "Hello World " + Integer.toString(i));
            // send data - asynchronous
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    // executes every time a record is successfully sent or an exception is thrown
                    if (e == null) {
                        // the record was successfully sent
                        logger.info("Received new metadata. \n" +
                                "Topic:" + recordMetadata.topic() + "\n" +
                                "Partition: " + recordMetadata.partition() + "\n" +
                                "Offset: " + recordMetadata.offset() + "\n" +
                                "Timestamp: " + recordMetadata.timestamp());
                    } else {
                        logger.error("Error while producing", e);
                    }
                }
            });
        }
        // flush data
        producer.flush();
        // flush and close producer
        producer.close();
    }
}
I want to count those messages and get the count as an int value.
I use this command and it works, but I am trying to get this count in code.
"bin/ --broker-list localhost:9092 --topic TwitterProducer --time -1"
and the result is
- TwitterProducer:0:2000
My code to do the same programmatically looks something like this, but I'm not sure if this is the correct way to get the count:
int valueCount = (int) recordMetadata.offset();
System.out.println("Offset value " + valueCount);
Can someone help me get the message count (offset value) of a Kafka topic using code?
You can have a look at the implementation details of GetOffsetShell.
Here is a simplified version rewritten in Java:
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.util.*;
import java.util.stream.Collectors;
public class GetOffsetCommand {
    private static final Set<String> TopicNames = new HashSet<>();
    static {
        // Assumption: the topic from the question; the original entries are missing here.
        TopicNames.add("TwitterProducer");
    }
    public static void main(String[] args) {
        TopicNames.forEach(topicName -> {
            final Map<TopicPartition, Long> offsets = getOffsets(topicName);
            new ArrayList<>(offsets.entrySet()).forEach(System.out::println);
            System.out.println(topicName + ":" + offsets.values().stream().reduce(0L, Long::sum));
        });
    }
    private static Map<TopicPartition, Long> getOffsets(String topicName) {
        // try-with-resources closes the consumer once the end offsets have been fetched
        try (KafkaConsumer<String, String> consumer = makeKafkaConsumer()) {
            final List<TopicPartition> partitions = listTopicPartitions(consumer, topicName);
            return consumer.endOffsets(partitions);
        }
    }
    private static KafkaConsumer<String, String> makeKafkaConsumer() {
        final Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "get-offset-command");
        return new KafkaConsumer<>(props);
    }
    private static List<TopicPartition> listTopicPartitions(KafkaConsumer<String, String> consumer, String topicName) {
        return consumer.listTopics().entrySet().stream()
                .filter(t -> topicName.equals(t.getKey()))
                .flatMap(t -> t.getValue().stream())
                .map(p -> new TopicPartition(p.topic(), p.partition()))
                .collect(Collectors.toList());
    }
}
It prints the end offset of each partition of the topic and their sum (the total number of messages).
Why do you want to get that value? If you share more detail about the purpose, I can give you better advice.
As for your last question, using the offset value from RecordMetadata is not the correct way to get the message count. It only works if your topic has a single partition and there is a single producer. In general you need to take into account that the topic has several partitions.
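To make the multi-partition point concrete, here is a minimal sketch of the arithmetic only, with no Kafka client involved. It assumes you have already fetched the beginning and end offset of every partition (for example with KafkaConsumer.beginningOffsets and KafkaConsumer.endOffsets). The class name OffsetCount, the method totalMessages and the sample offsets are made up for illustration.

```java
import java.util.Map;

// Hypothetical helper: sums (end offset - beginning offset) over all partitions.
class OffsetCount {

    // Both maps are keyed by partition number. A partition that is missing
    // from "beginning" is treated as starting at offset 0.
    static long totalMessages(Map<Integer, Long> beginning, Map<Integer, Long> end) {
        long total = 0;
        for (Map.Entry<Integer, Long> e : end.entrySet()) {
            long start = beginning.getOrDefault(e.getKey(), 0L);
            total += e.getValue() - start;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up offsets for a topic with three partitions.
        Map<Integer, Long> beginning = Map.of(0, 0L, 1, 100L, 2, 0L);
        Map<Integer, Long> end = Map.of(0, 700L, 1, 750L, 2, 650L);
        // 700 + 650 + 650
        System.out.println(totalMessages(beginning, end)); // prints 2000
    }
}
```

On compacted topics, or topics written with transactions, the offset range can be larger than the number of records actually stored, so treat the result as an upper bound in those cases.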
If you want the number of messages sent by each producer, you can count them in the callback method onCompletion().
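A minimal sketch of counting in the callback, assuming one counter object shared by all sends. SendCounter is a hypothetical helper, and its onCompletion method only mirrors the shape of Kafka's Callback.onCompletion(RecordMetadata, Exception) so that the example runs without the Kafka client. In real code you would call it from the producer callback.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the counting logic of a producer callback.
class SendCounter {
    // Atomic counters, because producer callbacks do not run on the calling thread.
    private final AtomicInteger succeeded = new AtomicInteger();
    private final AtomicInteger failed = new AtomicInteger();

    // Same contract as the Kafka callback: exception == null means the send was acknowledged.
    public void onCompletion(Object metadata, Exception exception) {
        if (exception == null) {
            succeeded.incrementAndGet();
        } else {
            failed.incrementAndGet();
        }
    }

    public int succeeded() {
        return succeeded.get();
    }

    public int failed() {
        return failed.get();
    }

    public static void main(String[] args) {
        SendCounter counter = new SendCounter();
        // Simulate 2000 acknowledged sends and one failed send.
        for (int i = 0; i < 2000; i++) {
            counter.onCompletion("metadata-" + i, null);
        }
        counter.onCompletion(null, new RuntimeException("simulated failure"));
        System.out.println("succeeded=" + counter.succeeded() + " failed=" + counter.failed());
        // prints succeeded=2000 failed=1
    }
}
```

With a real producer the callbacks run asynchronously, so call producer.flush() before reading the counters.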
Or you can get the last offset of each partition using the consumer client like this:
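Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "your-brokers");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
try (Consumer<String, String> consumer = new KafkaConsumer<>(props)) {
    // assignment() is empty until partitions are assigned, so assign them explicitly ("your-topic" is a placeholder)
    List<TopicPartition> partitions = consumer.partitionsFor("your-topic").stream()
            .map(p -> new TopicPartition(p.topic(), p.partition()))
            .collect(Collectors.toList());
    consumer.assign(partitions);
    consumer.seekToEnd(partitions);
    for (TopicPartition tp : partitions) {
        long offsetPosition = consumer.position(tp); // end offset of this partition
    }
}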

Accessing Kafka Topic with two process

I have a Kafka producer class which works fine. The producer fills the Kafka topic. Its code is as follows:
public class kafka_test {
    private final static String TOPIC = "flinkTopic";
    private final static String BOOTSTRAP_SERVERS = ",,";
    public FlinkKafkaConsumer<String> createStringConsumerForTopic(
            String topic, String kafkaAddress, String kafkaGroup) {
        // ************************** KAFKA Properties ******
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", kafkaAddress);
        props.setProperty("group.id", kafkaGroup);
        FlinkKafkaConsumer<String> myconsumer = new FlinkKafkaConsumer<>(
                topic, new SimpleStringSchema(), props);
        return myconsumer;
    }
    private static Producer<Long, String> createProducer() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "MyKafkaProducer");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        return new KafkaProducer<>(props);
    }
    public void runProducer(String msg) throws Exception {
        final Producer<Long, String> producer = createProducer();
        try {
            final ProducerRecord<Long, String> record = new ProducerRecord<>(TOPIC, msg);
            RecordMetadata metadata = producer.send(record).get();
            System.out.printf("sent record(key=%s value='%s')" + " metadata(partition=%d, offset=%d)\n",
                    record.key(), record.value(), metadata.partition(), metadata.offset());
        } finally {
            producer.flush();
            producer.close();
        }
    }
}
public class producerTest {
    public static void main(String[] args) throws Exception {
        kafka_test objKafka = new kafka_test();
        String pathFile = "/home/cfms11/IdeaProjects/pooyaflink2/KafkaTest/quickstart/lastDay4.csv";
        String delimiter = "\n";
        Scanner scanner = new Scanner(new File(pathFile));
        int i = 0;
        // if (i==0) ... (the rest of the loop that reads the file and sends each record is truncated)
    }
}
Because I want to provide data for my Flink program, I use Kafka. I have this code to consume data from the Kafka topic:
Properties props = new Properties();
props.setProperty("group.id", kafkaGroup);
FlinkKafkaConsumer<String> myconsumer = new FlinkKafkaConsumer<>(
        "flinkTopic", new SimpleStringSchema(), props);
DataStream<String> text = env.addSource(myconsumer.setStartFromEarliest());
I want to run the producer code while my Flink program is running. My goal is that the producer sends one record to the topic and the consumer can poll that record from the topic at the same time.
Would you please tell me how this is possible and how to manage it?
I think you need to create two class files: one for the producer and one for the consumer. Create the topic first and then run the consumer, or run the producer directly.

I am trying to put avro message using kafka streams but it is putting as a binary data type-java

I am trying to read JSON input from a topic with a JsonSerde, process each record, and write it as an Avro message to a different topic using Kafka Streams. The output looks binary and the data is not in actual JSON format. It looks like it is using the default ByteArraySerde for key and value. I don't know why, since I'm providing SpecificAvroSerde as the serializer.
private final static JsonSerde<JsonNode> jsonSerde = new JsonSerde<JsonNode>(JsonNode.class);
private static Map<String, Object> props;
//Serde of specific record
private static SpecificAvroSerde<SpecificRecord> productValueSerde;
@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
public StreamsConfig kafkaStreamsConfig()
        throws UnknownHostException {
    props = new HashMap<>();
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:29092,localhost:19092,localhost:39092");
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, "org.apache.kafka.clients"
            + ".consumer.RoundRobinAssignor");
    productValueSerde = new SpecificAvroSerde<SpecificRecord>();
    return new StreamsConfig(props);
}
public KStream<JsonNode,JsonNode> KStream(StreamsBuilder kStreamBuilder){
    KStream<JsonNode,JsonNode> stream = kStreamBuilder.stream("localtest",Consumed.with(jsonSerde, jsonSerde));
    try {
        KStream<JsonNode,SpecificRecord> avroStream = stream.flatMap((K,V)->actNationalPaperHelper.mapToCoreAvro(K, V));
        //avroStream.flatMap((K,V)->System.out.println(V); return avroStream));
        avroStream.through("serdetest16",Produced.with(jsonSerde, productValueSerde));
    }
    catch(Exception e) {
    }
    return stream;
}

Kafka: What is new API for ConsumerConnector in version 0.11

I'm updating Kafka client from to
In my old code, I use ConsumerConnector to get message streams with the createMessageStreams method, and then iterate through the stream for each topic. However, it seems that ConsumerConnector has been deprecated in the new API.
package kafka.consumer
import ...
/**
 * Main interface for consumer
 */
@deprecated("This trait has been deprecated and will be removed in a future release.", "")
trait ConsumerConnector {
  def createMessageStreams[K,V](topicCountMap: Map[String,Int],
                                keyDecoder: Decoder[K],
                                valueDecoder: Decoder[V]): Map[String,List[KafkaStream[K,V]]]
}
I looked up the new API and found two candidates:
Client API in org.apache.kafka.clients.consumer
Stream API in org.apache.kafka.streams
Which one should I use? And, how can I achieve the same thing in the new Kafka API?
An example of the new consumer is shown below:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records)
        System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
}
See for further details.
