Azure Event Hubs: Kafka consumer polls nothing in Java

I'm trying to read messages from Azure Event Hubs in my consumer.
These are my consumer configurations:
private final KafkaConsumer<Integer, String> consumer;
private final String topic;

public SampleConsumer(String topic) throws IOException {
    Properties props = new Properties();
    props.load(new FileReader("src/main/resources/consumer.config"));
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "MyNamespace.servicebus.windows.net:9093");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "GROUP_ID");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
    props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.IntegerDeserializer");
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringDeserializer");
    consumer = new KafkaConsumer<>(props);
    this.topic = topic;
}
and this is my consumer.config file:
bootstrap.servers=<MyNamespace>.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required
username="$**********" password="*******";
and finally in this method I try to read the records:
@Override
public void doWork() {
    consumer.subscribe(Collections.singletonList(this.topic));
    ConsumerRecords<Integer, String> records = consumer.poll(Duration.ofMillis(1000));
    System.out.println(records.count());
    for (ConsumerRecord<Integer, String> record : records) {
        System.out.println("Received message: (" + record.value());
    }
}
It connects to Azure and the event hub correctly.
It then tries to read the records, but every poll returns 0 without any error messages. I searched the internet and tried a lot of different solutions, but nothing has worked so far. There should be at least some records in the event hub.
Could someone please help me figure out what the problem is?

Are you actively producing records? If not, you may need to consume from the beginning of the topic by setting auto.offset.reset=earliest for a new group id.
The default behavior is to start at the end of the topic and only poll events that arrive after the consumer starts.
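For example, here is a rough sketch of that idea (String keys and the topic and group names are placeholders of mine, not from the question; the SASL_SSL settings are still loaded from the same consumer.config as in the question). Note that auto.offset.reset only applies when the group has no committed offsets, hence the fresh group id:

import java.io.FileReader;
import java.io.IOException;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EarliestOffsetCheck {
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new FileReader("src/main/resources/consumer.config"));  // SASL_SSL settings, as in the question
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "MyNamespace.servicebus.windows.net:9093");
        // auto.offset.reset only takes effect when the group has no committed offsets,
        // so use a group id that has never consumed before.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "debug-group-" + System.currentTimeMillis());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-event-hub"));  // placeholder topic name
            // Poll a few times: the first poll often only joins the group and
            // receives the partition assignment before any records arrive.
            for (int i = 0; i < 10; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                System.out.println("Polled " + records.count() + " records");
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println("Received: " + record.value());
                }
            }
        }
    }
}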

Related

Kafka Consumer getting already processed/duplicate records | Java | Spring Kafka

I am new to Kafka and am writing a cron job in Spring Boot that validates some records in SQL against a Kafka topic. The job needs to run once a day in the morning. I set the job to run every 15 minutes for my testing and it works as expected. But as soon as I updated the cron to run the job every 2 hours, I started getting records from the topic that the consumer had already read/processed, i.e. duplicates. I am committing the offset manually with commitAsync. For example, I sent 3 records to the topic but the consumer received more than 5k records, mostly duplicates.
Following is the code of the consumer and its properties.
public Map<String, Object> getKafkaConsumerProps() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "10000");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "cronKafkaConsumer");
    return props;
}
public List<CostKafkaModel> consumeCosts() {
    KafkaConsumer<String, CostKafkaModel> consumer = new KafkaConsumer<>(
            getKafkaConsumerProps(), new StringDeserializer(),
            new JsonDeserializer<>(CostKafkaModel.class));
    List<CostKafkaModel> kafkaModelList = new ArrayList<>();
    try {
        consumer.subscribe(Arrays.asList("deltaCosts"));
        ConsumerRecords<String, CostKafkaModel> records = consumer.poll(1000);
        for (ConsumerRecord<String, CostKafkaModel> record : records) {
            kafkaModelList.add(record.value());
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        consumer.commitSync();
        consumer.close();
    }
    return kafkaModelList;
}
Any help would be appreciated.

Unable to send GenericRecord data from Kafka Producer in AVRO format

Using confluent-oss-5.0.0-2.11
My Kafka Producer code is
public class AvroProducer {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("ZOOKEEPER_HOST", "localhost");
        //props.put("acks", "all");
        props.put("retries", 0);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");
        String topic = "confluent-new";
        Schema.Parser parser = new Schema.Parser();
        // I will get below schema string from SCHEMA REGISTRY
        Schema schema = parser.parse("{\"type\":\"record\",\"name\":\"User\",\"fields\":[{\"name\":\"userName\",\"type\":\"string\"},{\"name\":\"uID\",\"type\":\"string\"},{\"name\":\"company\",\"type\":\"string\",\"default\":\"ABC\"},{\"name\":\"age\",\"type\":\"int\",\"default\":0},{\"name\":\"location\",\"type\":\"string\",\"default\":\"Noida\"}]}");
        Producer<String, GenericRecord> producer = new KafkaProducer<String, GenericRecord>(props);
        GenericRecord record = new GenericData.Record(schema);
        record.put("uID", "06080000");
        record.put("userName", "User data10");
        record.put("company", "User data10");
        record.put("age", 12);
        record.put("location", "User data10");
        ProducerRecord<String, GenericRecord> recordData = new ProducerRecord<String, GenericRecord>(topic, "ip", record);
        producer.send(recordData);
        System.out.println("Message Sent");
    }
}
The producer code seems to be OK, and I can see "Message Sent" on the console.
The Kafka consumer code is:
public class AvroConsumer {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("ZOOKEEPER_HOST", "localhost");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("group.id", "consumer1");
        props.put("auto.offset.reset", "latest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");
        String topic = "confluent-new";
        KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<String, GenericRecord>(props);
        consumer.subscribe(Arrays.asList(topic));
        while (true) {
            ConsumerRecords<String, GenericRecord> recs = consumer.poll(10000);
            for (ConsumerRecord<String, GenericRecord> rec : recs) {
                System.out.printf("{AvroUtilsConsumerUser}: Received [key= %s, value= %s]\n", rec.key(), rec.value());
            }
        }
    }
}
I am unable to see the message (data) on the Kafka consumer end. I also checked the offset count/status for the confluent-new topic and it is not updating. It seems like the producer code has some problem.
Any pointer would be helpful.
Meanwhile, the producer code below is working; here the POJO, i.e. User, is an avro-tools generated POJO.
public class AvroProducer {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        /* kafkaParams.put("auto.offset.reset", "smallest");
        kafkaParams.put("ZOOKEEPER_HOST", "bihdp01"); */
        props.put("bootstrap.servers", "localhost:9092");
        props.put("ZOOKEEPER_HOST", "localhost");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");
        String topic = "confluent-new";
        Producer<String, User> producer = new KafkaProducer<String, User>(props);
        User user = new User();
        user.setUID("0908");
        user.setUserName("User data10");
        user.setCompany("HCL");
        user.setAge(20);
        user.setLocation("Noida");
        ProducerRecord<String, User> record = new ProducerRecord<String, User>(topic, (String) user.getUID(), user);
        producer.send(record).get();
        System.out.println("Sent");
    }
}
P.S. My requirement is to send the JSON data received from a source Kafka topic to a destination Kafka topic in AVRO format. First I infer the AVRO schema from the received JSON data using AVRO4S and register the schema with the SCHEMA REGISTRY. Next I pull the data out of the received JSON, populate a GenericRecord instance, and send this GenericRecord instance to the Kafka topic using KafkaAvroSerializer. At the consumer end I will use KafkaAvroDeserializer to deserialize the received AVRO data.
In the course of finding a solution I tried Thread.sleep(1000) and it fixed my problem. I also tried producer.send(record).get(), and this fixed the problem too. After going through the documentation I came across the code snippet below, which hints at the solution.
// When you're finished producing records, you can flush the producer
// to ensure it has all been written to Kafka and then close the
// producer to free its resources.
finally {
    producer.flush();
    producer.close();
}
This is the best way to fix this problem.
Please try adding get() to the send call in the first producer:
producer.send(recordData).get();
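Putting those two fixes together, a minimal sketch of how the send path of the first producer could look (this reuses the producer and recordData variables from the question; the try/finally placement is my own suggestion, not from the original code):

try {
    // get() blocks until the broker acknowledges the record (or throws),
    // so the JVM cannot exit before the message has actually been sent.
    producer.send(recordData).get();
    System.out.println("Message Sent");
} finally {
    producer.flush();   // push any buffered records to the broker
    producer.close();   // free the producer's resources
}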

Kafka consumer.poll returning empty

The producer code reads an .mp4 video file from disk and sends it to Kafka. It apparently works, since it prints "Message sent to the Kafka Topic java_in_use_topic Successfully", but consumer.poll returns nothing:
@RestController
@RequestMapping(value = "/javainuse-kafka/")
public class ApacheKafkaWebController {

    @GetMapping(value = "/producer")
    public String producer(@RequestParam("message") String message) {
        Map<String, Object> props = new HashMap<>();
        // list of host:port pairs used for establishing the initial connections to the Kafka cluster
        props.put(org.apache.kafka.clients.producer.ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(org.apache.kafka.clients.producer.ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(org.apache.kafka.clients.producer.ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        Producer<String, byte[]> producer = new KafkaProducer<>(props);
        Path path = Paths.get("C:/kafka-picture-consumer/SampleVideo_1280x720_1mb.mp4");
        ProducerRecord<String, byte[]> record = null;
        try {
            record = new ProducerRecord<>("topiccc", "keyyyyy", Files.readAllBytes(path));
        } catch (IOException e) {
            e.printStackTrace();
        }
        producer.send(record);
        producer.close();
        return "Message sent to the Kafka Topic java_in_use_topic Successfully";
    }
}
The consumer code which will be used in a servlet:
public class ConsumerService {
    public byte[] consumer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("topiccc"));
        ConsumerRecords<String, byte[]> records = consumer.poll(100);
        System.out.println("ISSSSSSSSSSSSSSSSSSSSSSSSSSSSS EMPTYYYYYYYYYY:" + String.valueOf(records.isEmpty()));
        return records.iterator().next().value();
    }
}
There can be many possible reasons for this:
Your producer is not sending the message. If this is the case, you can check it by adding a callback to your producer and printing the exception. If the exception is null, then the send() was successful.
producer.send(record, (recordMetadata, exception) -> {
    System.err.println(exception);
});
Since you are sending an mp4 file, I suppose you might not have set your Kafka broker and/or topic configurations to support such a large message.
Check the max.message.bytes (topic) and message.max.bytes (broker) configurations; in that case you will get a RecordTooLargeException.
You will have to wait until the producer has completely produced the message.
You will need to set auto.offset.reset to earliest in your consumer configuration. This ensures that if no committed offset exists for that group, the consumer starts from the first message; otherwise it only waits for the next message.
Your poll duration is short; you may need to increase it (see the sketch after this list).
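As a rough sketch of the last two points combined (the class name, 10-second deadline, and one-second poll timeout are my own placeholders, and this assumes a client version that supports poll(Duration)):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class VideoConsumerSketch {
    public byte[] consume() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("auto.offset.reset", "earliest"); // start from the beginning if the group has no committed offset
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        byte[] payload = null;
        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("topiccc"));
            // Poll repeatedly: the first poll usually only joins the group and
            // receives the partition assignment, so records tend to arrive later.
            long deadline = System.currentTimeMillis() + 10_000;
            while (payload == null && System.currentTimeMillis() < deadline) {
                ConsumerRecords<String, byte[]> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, byte[]> record : records) {
                    payload = record.value();
                }
            }
        }
        return payload;
    }
}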

poll the data from kafka as and when topic gets updated with new data

I have a use case where I am streaming Twitter data and storing it in a Kafka topic.
I want to show this data on a UI as and when the topic gets updated; I am using a websocket for this.
I am facing a problem while fetching the data from the Kafka topic. I am new to Kafka and am not sure how to get only the newly added records from the topic.
Does setting "auto.offset.reset" to "latest" solve my problem?
Please let me know if more information is needed for my question.
Adding my code:
public class ConsumerTest {

    public static void main(String[] args) throws JsonParseException, JsonMappingException, IOException {
        ConsumerTest test = new ConsumerTest();
        test.startTwitterConsumer();
    }

    void startTwitterConsumer() throws JsonParseException, JsonMappingException, IOException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.11.208:9092");
        props.put("group.id", "group-04691");
        props.put("enable.auto.commit", "false");
        props.put("auto.commit.interval.ms", "500");
        props.put("auto.offset.reset", "earliest");
        props.put("session.timeout.ms", "30000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("client.id", "w3l");
        props.put("max.poll.records", "500");
        KafkaConsumer<String, String> kafkaConsumer = new KafkaConsumer<>(props);
        List<ConsumerRecord<String, String>> li;
        kafkaConsumer.subscribe(Arrays.asList("dem"));
        ConsumerRecords<String, String> records;
        do {
            records = kafkaConsumer.poll(1000);
            if (records.count() > 0) {
                //li = new ArrayList<>();
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.offset());
                }
                kafkaConsumer.commitAsync();
                break;
            } else {
                System.out.println("No records");
            }
        } while (records.count() == 0);
    }
}
After executing this code, each time I get the same offset values, i.e. from 5 to 321.

Failed to read one message with Java-based Kafka consumer

I am trying to implement a Java Kafka consumer. I use Kafka server version 0.9.
It's for test purposes, so all I have to do is read one message.
public static ConsumerRecords<String, String> readFromKafka() {
    ConsumerRecords<String, String> records = null;
    try {
        Properties kafkaProps = new Properties();
        kafkaProps.put("bootstrap.servers", "<KAFKA_SERVER_HOST>:9092");
        kafkaProps.put("auto.commit.enable", "false");
        kafkaProps.put("value.deserializer", StringDeserializer.class.getName());
        kafkaProps.put("key.deserializer", StringDeserializer.class.getName());
        kafkaProps.put("client.id", "testScore0");
        kafkaProps.put("group.id", "testScore1");
        kafkaProps.put("auto.offset.reset", "latest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(kafkaProps);
        consumer.subscribe(Arrays.asList("my_topic"));
        records = consumer.poll(0);
    } catch (Exception e) {
        logger.error("Can not read from kafka", e);
    }
    return records;
}
The returned records object is empty.
When I run a command-line Kafka consumer on my local machine that connects to the same KAFKA_SERVER_HOST, I do get messages.
Change the poll time in
records = consumer.poll(0);
to something bigger than 0; try 100:
records = consumer.poll(100);
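If a single 100 ms poll still comes back empty, another common pattern is to poll in a short loop, since the first poll after subscribe() often only joins the group and receives the partition assignment. A rough sketch, reusing the consumer from the question (the retry count is arbitrary):

ConsumerRecords<String, String> records = null;
// Retry a few times: the first poll typically performs the group join and
// partition assignment, and records usually arrive on a later poll.
for (int attempt = 0; attempt < 10 && (records == null || records.isEmpty()); attempt++) {
    records = consumer.poll(100);   // old poll(long) signature, matching the 0.9-era client in the question
}
return records;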
