Need to reduce time consumed by consumer to display messages - java

I am able to get messages from a Kafka topic, but it takes too long to read them. Is there any way to reduce the time this takes? I also need to know whether it is possible to read only the last message from a Kafka topic. Please help me.
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) throws Exception {
        // Kafka consumer configuration settings
        String topicName = "you";
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.174.132:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("session.timeout.ms", "60000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");
        @SuppressWarnings("resource")
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        // The consumer subscribes to a list of topics here.
        consumer.subscribe(Arrays.asList(topicName));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                // Print the offset, key, and value for each consumer record.
                System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
            }
        }
    }
}
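On the speed question: with auto.offset.reset set to earliest and no committed offsets, the consumer replays the topic's entire history, which is usually why the first reads feel slow. On the second question, one way to read only the latest message is to assign the partition manually, seek to one offset before the end, and poll once. Below is a minimal sketch of that idea, assuming a single-partition topic and a client version that supports poll(Duration); the broker address and topic name are copied from the question.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LastMessageReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.174.132:9092");
        props.put("enable.auto.commit", "false"); // no group management needed with assign()
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("you", 0); // assumes a single partition
            consumer.assign(Collections.singletonList(partition));
            consumer.seekToEnd(Collections.singletonList(partition));
            long end = consumer.position(partition); // forces the seek and returns the end offset
            if (end > 0) {
                consumer.seek(partition, end - 1);   // step back to the last record
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset = %d, key = %s, value = %s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}

With multiple partitions, you would repeat the seekToEnd/seek step for each partition and compare timestamps to decide which record is truly the latest.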


How to make kafka consumer stop retrying and crash when broker is not available?

Currently, when the broker is not available, the Java consumer tries to reconnect to the broker indefinitely (or at least I was not able to wait until it stopped).
What I would like is for poll() to throw an exception when the broker is not available. Or perhaps some other mechanism; I just want my app to crash when the broker is unavailable. It seems like something that should be easy to configure, but I cannot find information about it anywhere.
Here is an example. It always returns 0 records from poll and retries the connection to the broker forever.
package org.example;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class Consumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9094");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "test-consumer-group");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("game.journal"));
            while (true) {
                ConsumerRecords<String, String> recs = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> rec : recs) {
                    System.out.printf("Received %s: %s%n", rec.key(), rec.value());
                }
            }
        }
    }
}
So how can I make consumer crash when broker is not available?
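As far as I know there is no consumer setting that makes the client give up by itself, but one workaround, offered here as a sketch rather than a guaranteed recipe, is to probe the cluster with a bounded-timeout metadata call before entering the poll loop. partitionsFor(topic, timeout) throws org.apache.kafka.common.errors.TimeoutException when the broker cannot be reached within the timeout, and since that exception is unchecked, letting it propagate crashes the app as desired:

// Inside the try-with-resources block, before subscribing:
// fail fast if the broker cannot be reached within five seconds.
// TimeoutException is unchecked, so it simply propagates out of main.
consumer.partitionsFor("game.journal", Duration.ofSeconds(5));
consumer.subscribe(List.of("game.journal"));

The same probe can be repeated inside the loop (for example, whenever poll returns empty for too long) if you also want to detect a broker that goes away after startup.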

Error using AvroDeserializer in consumer code - apache kafka

I am using the Confluent open source version of Kafka, with Avro for serialization and deserialization. My producer Java code is able to push records to the broker, but when my Java consumer tries to read the data, I get the error below.
Exception in thread "main" org.apache.kafka.common.errors.RecordDeserializationException: Error deserializing key/value for partition clickRecordsEvents-0 at offset 1. If needed, please seek past the record to continue consumption.
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:1429)
at org.apache.kafka.clients.consumer.internals.Fetcher.access$3400(Fetcher.java:134)
at org.apache.kafka.clients.consumer.internals.Fetcher$CompletedFetch.fetchRecords(Fetcher.java:1652)
at org.apache.kafka.clients.consumer.internals.Fetcher$CompletedFetch.access$1800(Fetcher.java:1488)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:721)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:672)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1304)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1238)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1211)
at com.ru.kafka.consumer.deserializer.avro.AvroConsumerDemo.main(AvroConsumerDemo.java:31)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 1
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:156)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:79)
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:55)
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:60)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:1420)
... 9 more
Caused by: org.apache.kafka.common.errors.SerializationException: Could not find class ClickRecord specified in writer's schema whilst finding reader's schema for a SpecificRecord.
I am able to read the records with the console consumer from the command prompt.
Here's my consumer code:
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import com.ru.kafka.avro.pojo.ClickRecord;

public class AvroConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "clicksCG");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("specific.avro.reader", "true");
        props.put("schema.registry.url", "http://localhost:8083");
        String topic = "clickRecordsEvents";
        KafkaConsumer<String, ClickRecord> consumer = new KafkaConsumer<String, ClickRecord>(props);
        consumer.subscribe(Collections.singletonList(topic));
        System.out.println("Reading topic: " + topic);
        while (true) {
            ConsumerRecords<String, ClickRecord> records = consumer.poll(Duration.ofMillis(1000));
            for (ConsumerRecord<String, ClickRecord> record : records) {
                System.out.println("Session ID read " + record.value().getSessionId());
                System.out.println("Browser read " + record.value().getBrowser());
                System.out.println("Campaign read " + record.value().getCompaign());
                System.out.println("IP read " + record.value().getIp());
                System.out.println("Channel read " + record.value().getChannel());
                System.out.println("Referrer read " + record.value().getRefferer());
            }
            consumer.commitSync();
        }
    }
}
Note that I generated the ClickRecord Java class from the Avro schema file using the Avro tools plugin; it is not a plain Java POJO.
Here's my producer code:
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

import com.ru.kafka.avro.pojo.ClickRecord;

public class AvroProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8083"); // URL pointing to the Schema Registry
        String topic = "clickRecordsEvents";
        Producer<String, ClickRecord> producer = new KafkaProducer<String, ClickRecord>(props);
        ClickRecord clickRecord = new ClickRecord();
        clickRecord.setSessionId("ABC1245");
        clickRecord.setBrowser("Chrome");
        clickRecord.setIp("192.168.32.56");
        clickRecord.setChannel("HomePage");
        System.out.println("Generated clickRecord " + clickRecord.toString());
        ProducerRecord<String, ClickRecord> record = new ProducerRecord<String, ClickRecord>(topic,
                clickRecord.getSessionId().toString(), clickRecord);
        try {
            RecordMetadata metadata = producer.send(record).get();
            System.out.println("Record written to partition " + metadata.partition());
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        } finally {
            producer.close();
        }
    }
}
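A hedged diagnostic rather than a definitive fix: "Could not find class ClickRecord specified in writer's schema" generally means the specific-record deserializer looks up a class by the full name (namespace plus name) recorded in the writer's schema, and no class with that exact fully qualified name is on the consumer's classpath. Here the generated class lives in com.ru.kafka.avro.pojo, so the schema the producer registered likely has no namespace, or a different one. A quick check, assuming the class was generated by Avro tools and therefore exposes the standard getClassSchema() accessor:

import org.apache.avro.Schema;

import com.ru.kafka.avro.pojo.ClickRecord;

public class SchemaNameCheck {
    public static void main(String[] args) {
        // Avro-generated classes carry their schema. The full name printed
        // below must match the name recorded in the writer's schema exactly,
        // e.g. "com.ru.kafka.avro.pojo.ClickRecord".
        Schema schema = ClickRecord.getClassSchema();
        System.out.println("Generated class schema full name: " + schema.getFullName());
    }
}

If the printed full name differs from the name in the registered writer's schema (visible in the Schema Registry), aligning the schema's namespace with the Java package and regenerating the class is the usual remedy.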

Kafka consumer showing numbers in unreadable format

I am trying out Kafka Streams. I am reading messages from one topic, doing a groupByKey, and then counting each group. The problem is that the counts show up as unreadable "boxes".
If I run the console consumer, they come through as empty strings.
This is the WordCount code I wrote:
package streams;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Arrays;
import java.util.Properties;

public class WordCount {
    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        properties.setProperty(StreamsConfig.APPLICATION_ID_CONFIG, "streams-demo-2");
        properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        properties.setProperty(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.StringSerde.class.getName());
        properties.setProperty(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.StringSerde.class.getName());

        // Topology
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("temp-in");
        KStream<String, Long> fil = input
                .flatMapValues(val -> Arrays.asList(val.split(" "))) // turn each text line into a stream of words
                .selectKey((k, v) -> v)                              // re-key by word
                .groupByKey().count().toStream();                    // count per word
        fil.to("temp-out");

        KafkaStreams streams = new KafkaStreams(builder.build(), properties);
        streams.start();
        System.out.println(streams.toString());
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
This is the output I am getting in the consumer (the original post shows a screenshot of the garbled values).
I tried casting the Long to long again to see if it would help, but it did not work.
I am attaching the consumer code too, in case it helps.
package tutorial;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class Consumer {
    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        properties.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Once the consumer starts running, it keeps running even after we stop it in the console.
        // We should create a new consumer to read from earliest, because the previous one has already consumed up to a certain offset.
        // When we run the same consumer in two consoles, Kafka detects it and rebalances:
        // the consoles split the partitions they consume, forming a consumer group.
        properties.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "consumer-application-1"); // consumer group id
        properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // from when the consumer gets data
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(properties);
        consumer.subscribe(Collections.singleton("temp-out"));
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(Duration.ofMillis(1000));
            for (ConsumerRecord<String, String> record : consumerRecords) {
                System.out.println(record.key() + " " + record.value());
                System.out.println(record.partition() + " " + record.offset());
            }
        }
    }
}
Any help is appreciated. Thanks in advance.
The message value you're writing with Kafka Streams is a Long, and you're consuming it as a String.
If you make the following changes to your Consumer class, you'll be able to see the count printed correctly to stdout:
// Change this from StringDeserializer to LongDeserializer.
properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class.getName());
...
// The value you're consuming here is a Long, not a String.
KafkaConsumer<String, Long> consumer = new KafkaConsumer<>(properties);
consumer.subscribe(Collections.singleton("temp-out"));
while (true) {
    ConsumerRecords<String, Long> consumerRecords = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String, Long> record : consumerRecords) {
        System.out.println(record.key() + " " + record.value());
        System.out.println(record.partition() + " " + record.offset());
    }
}
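A hedged follow-up to the answer above: you can also make the write-side serdes explicit in the Streams topology, so the output topic's format is obvious to anyone reading the code instead of relying on serde propagation from count(). Kafka Streams provides the Produced helper for this (import org.apache.kafka.streams.kstream.Produced):

// In the WordCount topology, state the output serdes explicitly:
fil.to("temp-out", Produced.with(Serdes.String(), Serdes.Long()));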

Writing Java Kafka consumer application without `while` loop

We recently started to use Kafka, and I am writing a Kafka consumer application using Kafka's native Java consumer API.
However, most of the examples I have seen use a while loop and call the poll method on a consumer object inside it, like below:
while (true) {
    final ConsumerRecords<Long, String> consumerRecords = consumer.poll(1000);
    if (consumerRecords.count() == 0) {
        noRecordsCount++;
        if (noRecordsCount > giveUp) break;
        else continue;
    }
    consumerRecords.forEach(record -> {
        System.out.printf("Consumer Record:(%d, %s, %d, %d)\n",
                record.key(), record.value(),
                record.partition(), record.offset());
    });
    consumer.commitAsync();
}
I am looking for a better way of doing this without a loop, using the native Java consumer API. I know that with Spring Kafka you don't need to write the loop yourself; what about the native API? Is there a good approach or best practice?
I tried using a scheduler, and the code works:
package com.kafka;

import java.util.Arrays;
import java.util.Date;
import java.util.Properties;
import java.util.TimerTask;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ScheduledTask extends TimerTask {
    Date now; // to display the current time

    public void run() {
        now = new Date();
        System.out.println("Time is: " + now);
        String alarmString = null;
        Properties props = new Properties();
        props.put("bootstrap.servers", "10.*.*.*:9092");
        props.put("group.id", "grp-1");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("session.timeout.ms", "30000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("RAW_MH_RAN_SAM"));
        ConsumerRecords<String, String> records = consumer.poll(1000);
        //ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord<String, String> record : records) {
            System.out.println("Consumer:=============== partition Id= " + record.partition()
                    + " offset = " + record.offset() + " value = " + record.value() + "=================");
            if (alarmString == null && !record.value().contains("PR ALARM:")) {
                alarmString = record.value();
            } else if (!record.value().contains("PR ALARM:")) {
                //System.out.println("record.value() :::" + record.value());
                alarmString = alarmString + "," + record.value();
            }
        }
        if (consumer != null) {
            System.out.println("Closing Connection");
            consumer.close();
        }
    }
}
//Simple consumer
package com.kafka;

import java.util.Timer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Timer time = new Timer();
        ScheduledTask st = new ScheduledTask(); // instantiate the ScheduledTask class
        time.schedule(st, 0, 60000); // repeat the task every 60 seconds
    }
}
Polling continuously is "just" how the Kafka consumer works, and how the underlying Kafka protocol works.
Every action is initiated by the client (a pull model), not by the broker (a push model), which in the case of consuming messages translates into polling.
Using the Java Kafka consumer API means having a loop, a scheduler, or whatever mechanism Java gives you for executing code continuously; you have to deal with it.
Other frameworks, like Spring or SmallRye Reactive Messaging, do that for you. They hide the poll loop from your application, but in the end there is always a loop ... it's how Kafka works.
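To illustrate the usual way of hiding the loop without a framework, here is a hedged sketch (the class name, topic, and group id are invented for illustration): run the poll loop on a single-thread executor and stop it from outside with consumer.wakeup(), which makes a blocked poll() throw WakeupException. This avoids the scheduler's per-run consumer creation, which is expensive and forces a rebalance every minute.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class BackgroundConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "background-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            try {
                consumer.subscribe(List.of("some-topic"));
                while (true) { // the loop still exists; it is just owned by the executor
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                        System.out.printf("offset = %d, value = %s%n", record.offset(), record.value());
                    }
                }
            } catch (WakeupException e) {
                // expected on shutdown: wakeup() aborts a blocked poll()
            } finally {
                consumer.close();
            }
        });

        // wakeup() is the only consumer method that is safe to call
        // from a thread other than the one doing the polling.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            consumer.wakeup();
            executor.shutdown();
        }));
    }
}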

ClassNotFoundException in Quick Start VM

I am new to Java; apologies for the repetitive question. I am trying to execute a Java jar but am running into a classpath issue. Could someone please help me?
[cloudera@quickstart Desktop]$ ls -rlt Kafka_Consumer.jar
-rw-rw-r-- 1 cloudera cloudera 2227 Mar 9 03:28 Kafka_Consumer.jar
[cloudera@quickstart Desktop]$ java -cp Kafka_Consumer.jar org.abc.cde.KafkaConsumerTest
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/kafka/clients/consumer/KafkaConsumer
at org.abc.cde.KafkaConsumerTest.main(KafkaConsumerTest.java:27)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.clients.consumer.KafkaConsumer
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 1 more
Below is my Java code snippet, which has no errors:
package org.abc.cde;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.util.Arrays;
import java.util.Properties;

public class KafkaConsumerTest {
    public static void main(String[] args) {
        // Consumer properties
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test-group");
        // Using auto commit
        props.put("enable.auto.commit", "true");
        // String inputs and outputs
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Kafka consumer object
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        // Subscribe to the topic
        consumer.subscribe(Arrays.asList("my-topic"));
        // Infinite poll loop
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
        }
    }
}
Please note that there are no compile-time errors.
Thanks in advance.
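A hedged note on the likely fix: a NoClassDefFoundError for org.apache.kafka.clients.consumer.KafkaConsumer at runtime means the kafka-clients jar was on the compile classpath but is missing from the runtime classpath (the 2227-byte Kafka_Consumer.jar clearly holds only the compiled class, not its dependencies). One option is to append the Kafka client jars to -cp; the library path below is an assumption about where the Cloudera Quickstart VM keeps them and may need adjusting:

[cloudera@quickstart Desktop]$ java -cp "Kafka_Consumer.jar:/usr/lib/kafka/libs/*" org.abc.cde.KafkaConsumerTest

Alternatively, build a fat jar (for example with the Maven Shade plugin) so that kafka-clients and its dependencies are bundled into Kafka_Consumer.jar itself.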
