Kafka reading old and new value from topic - java

We have a producer-consumer environment and use Spring Boot for our project.
The Kafka configuration is done in the following class:
@Configuration
@EnableKafka
public class DefaultKafkaConsumerConfig {
    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;
    @Value("${spring.kafka.bootstrap-servers-group}")
    private String bootstrapServersGroup;
    @Bean
    public ConsumerFactory<String, String> consumerDefaultFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, IntegerDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, bootstrapServersGroup);
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
        return new DefaultKafkaConsumerFactory<>(props);
    }
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerDefaultContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerDefaultFactory());
        return factory;
    }
}
SCENARIO: We write values to Kafka topics. One topic carries live data, with a status such as "live:0" for a completed event and "live:1" for a live event. When an event goes live, it is updated and written to the topic, and we process the event based on what we read from this topic.
ISSUE: When an event goes live, I read the record with "live:1" from the topic and process it. Later the event is updated and the new data is written to the topic.
I can read the new data, but along with it I also receive the old data. Because both old and new data arrive at the same time, my event is affected: sometimes it ends up live and sometimes completed.
Can anyone suggest what to do here?
Why am I receiving already committed data together with the newly updated data?
Is there anything I am missing in the configuration?

You may want to check a couple of things:
1. the number of partitions
2. the number of consumers
Does it also mean that you are rewriting the consumed message to the topic again, with the new status?

try {
    ListenableFuture<SendResult<String, String>> futureResult = this.kafkaTemplate.send(topicName, message);
    futureResult.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
        @Override
        public void onSuccess(SendResult<String, String> result) {
            log.info("Message successfully sent to topic {} with offset {} ", result.getRecordMetadata().topic(), result.getRecordMetadata().offset());
        }
        @Override
        public void onFailure(Throwable ex) {
            FAILMESSAGELOGGER.info("{},{}", topicName, message);
            log.info("Unable to send Message to topic {} due to ", topicName, ex);
        }
    });
} catch (Exception e) {
    log.error("Outer Exception occurred while sending message {} to topic {}", new Object[] { message, topicName, e });
    FAILMESSAGELOGGER.info("{},{}", topicName, message);
}
This is what we have.
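
One thing the producer snippet above suggests: kafkaTemplate.send(topicName, message) sends records without a key, so updates for the same event can land on different partitions, and Kafka only guarantees ordering within a partition. If that is what is happening here, a consumer-side guard could drop stale updates. The following is a minimal sketch, not the asker's code. It assumes each message carries an event ID and a monotonically increasing version (neither appears in the original payload), and LatestStatusTracker is a hypothetical helper name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka or Spring: remembers the highest
// version seen per event and rejects anything older or equal.
final class LatestStatusTracker {
    private final Map<String, Long> latestVersion = new HashMap<>();

    // Returns true if the update is newer than anything seen so far for this
    // event and should be processed; false if it is stale or a duplicate.
    synchronized boolean accept(String eventId, long version) {
        Long seen = latestVersion.get(eventId);
        if (seen != null && version <= seen) {
            return false;
        }
        latestVersion.put(eventId, version);
        return true;
    }
}
```

In a listener, you would call accept(eventId, version) before processing and skip the record when it returns false. Sending with the event ID as the record key may also be worth considering, since with the default partitioner records that share a key go to the same partition.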

Related

Apache Kafka transactional producer does not honour atomicity while posting to 2 topics, if one topic goes down

I am using Kafka Transactional producer to post atomically to 2 topics on a broker. My code looks similar to this:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("transactional.id", "my-transactional-id");
Producer<String, String> producer = new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());
// The (String) null key selects the (topic, partition, key, value, headers) constructor.
ProducerRecord<String, String> record1 = new ProducerRecord<>("topic-1", null, (String) null, payload, headerList);
ProducerRecord<String, String> record2 = new ProducerRecord<>("topic-2", null, (String) null, payload, headerList);
List<ProducerRecord<String, String>> recordList = Arrays.asList(record1, record2);
producer.initTransactions();
try {
    producer.beginTransaction();
    for (ProducerRecord<String, String> record : recordList) {
        producer.send(record);
    }
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // We can't recover from these exceptions, so our only option is to close the producer and exit.
    producer.close();
} catch (KafkaException e) {
    // For all other exceptions, just abort the transaction and try again.
    producer.abortTransaction();
}
producer.close();
To test the atomicity of posting to both topics, I deleted "topic-2". I expected the transaction to fail completely, but strangely, after several retries, it commits the transaction successfully to "topic-1".
Also, I am seeing continuous error logs with messages:
Error while fetching metadata with correlation id 123 :
{topic-2=UNKNOWN_TOPIC_OR_PARTITION}
But eventually it says
Transition from state IN_TRANSACTION to COMMITTING_TRANSACTION
and then posts successfully to "topic-1".
I am not sure why I am seeing this behaviour. What could be going wrong, and is this behaviour expected?

Azure Event Hubs: Kafka Consumer polls nothing in java

I'm trying to read messages in my consumer from Azure Event Hubs.
This is my consumer configuration:
private final KafkaConsumer<Integer, String> consumer;
private final String topic;
public SampleConsumer(String topic) throws IOException {
    Properties props = new Properties();
    props.load(new FileReader("src/main/resources/consumer.config"));
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
            "MyNamespace.servicebus.windows.net:9093");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "GROUP_ID");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
    props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.IntegerDeserializer");
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringDeserializer");
    consumer = new KafkaConsumer<>(props);
    this.topic = topic;
}
and this is my consumer.config file:
bootstrap.servers=<MyNamespace>.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$**********" password="*******";
and finally in this method I try to read the records:
@Override
public void doWork() {
    consumer.subscribe(Collections.singletonList(this.topic));
    ConsumerRecords<Integer, String> records = consumer.poll(Duration.ofMillis(1000));
    System.out.println(records.count());
    for (ConsumerRecord<Integer, String> record : records) {
        System.out.println("Received message: (" + record.value());
    }
}
It connects correctly to Azure and the event hub.
It then tries to read the records, but every time it returns 0 without any error message. I searched the internet and tried a lot of different solutions, but nothing has worked so far. There should be at least some records in the event hub.
Could someone please help me figure out what the problem is?
Are you actively producing records? If not, you may want to consume from the beginning of the topic by setting auto.offset.reset=earliest for a new group ID.
The default behavior is to start at the end of the topic and poll only the events that arrive after the consumer starts.
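
As a rough illustration of that suggestion, the sketch below builds consumer properties with a fresh group ID and auto.offset.reset=earliest. A group that has no committed offsets is the case where auto.offset.reset takes effect. The class name FreshGroupConfig and the random group suffix are assumptions made for this example, intended for debugging rather than production use; only the three property keys are real Kafka consumer settings.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical helper for debugging: every call yields a new group.id,
// so the consumer has no committed offsets and auto.offset.reset applies.
final class FreshGroupConfig {
    static Properties fromBeginning(String bootstrapServers, String groupPrefix) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Random suffix guarantees a group that has never committed offsets.
        props.setProperty("group.id", groupPrefix + "-" + UUID.randomUUID());
        // With no committed offsets, start from the first available record.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }
}
```

The returned Properties would still need deserializers and any security settings before being passed to a KafkaConsumer.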

Kafka Consumer getting already processed/duplicate records | Java | Spring Kafka

I am new to Kafka and am writing a cron job in Spring Boot that validates some records in SQL against a Kafka topic. The job needs to run once a day in the morning. For testing I scheduled the job to run every 15 minutes, and it works as expected. But as soon as I changed the cron expression to run the job every 2 hours, I started receiving records from the topic that the consumer had already read and processed, as well as duplicates. I am committing the offset manually with commitAsync. For example, I sent 3 records to the topic, but the consumer receives more than 5k records, mostly duplicates.
Following is the code of the consumer and its properties.
public Map<String, Object> getKafkaConsumerProps() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "10000");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "cronKafkaConsumer");
    return props;
}
public List<CostKafkaModel> consumeCosts() {
    KafkaConsumer<String, CostKafkaModel> consumer = new KafkaConsumer<>(
            getKafkaConsumerProps(), new StringDeserializer(),
            new JsonDeserializer<>(CostKafkaModel.class));
    List<CostKafkaModel> kafkaModelList = new ArrayList<>();
    try {
        consumer.subscribe(Arrays.asList("deltaCosts"));
        ConsumerRecords<String, CostKafkaModel> records = consumer.poll(1000);
        for (ConsumerRecord<String, CostKafkaModel> record : records) {
            kafkaModelList.add(record.value());
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        consumer.commitSync();
        consumer.close();
    }
    return kafkaModelList;
}
Any help would be appreciated.

Kafka consumer.poll returning empty

The producer code below reads an .mp4 video file from disk and sends it to Kafka. It apparently works, since it prints "Message sent to the Kafka Topic java_in_use_topic Successfully", but consumer.poll returns nothing:
@RestController
@RequestMapping(value = "/javainuse-kafka/")
public class ApacheKafkaWebController {
    @GetMapping(value = "/producer")
    public String producer(@RequestParam("message") String message) {
        Map<String, Object> props = new HashMap<>();
        // list of host:port pairs used for establishing the initial connections to the Kafka cluster
        props.put(org.apache.kafka.clients.producer.ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "localhost:9092");
        props.put(org.apache.kafka.clients.producer.ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(org.apache.kafka.clients.producer.ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        Producer<String, byte[]> producer = new KafkaProducer<>(props);
        Path path = Paths.get("C:/kafka-picture-consumer/SampleVideo_1280x720_1mb.mp4");
        ProducerRecord<String, byte[]> record = null;
        try {
            record = new ProducerRecord<>("topiccc", "keyyyyy", Files.readAllBytes(path));
        } catch (IOException e) {
            e.printStackTrace();
        }
        producer.send(record);
        producer.close();
        return "Message sent to the Kafka Topic java_in_use_topic Successfully";
    }
}
The consumer code which will be used in a servlet:
public class ConsumerService {
    public byte[] consumer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("topiccc"));
        ConsumerRecords<String, byte[]> records = consumer.poll(100);
        System.out.println("ISSSSSSSSSSSSSSSSSSSSSSSSSSSSS EMPTYYYYYYYYYY:"+String.valueOf(records.isEmpty()));
        return records.iterator().next().value();
    }
}
There can be many possible reasons for this:
1. Your producer is not sending the message. You can check this by adding a callback to your producer and printing the exception. If the exception is null, the send() was successful.
    producer.send(record, (recordMetadata, exception) -> {
        System.err.println(exception);
    });
2. Since you are sending an mp4 file, your Kafka broker and/or topic configuration might not allow such a large message. Check max.message.bytes (topic) and message.max.bytes (broker). If the message exceeds these limits, you will get a RecordTooLargeException.
3. You have to wait until the producer has completely produced the message.
4. You need to set auto.offset.reset to earliest in your consumer configuration. This ensures that if there is no committed offset for that topic, the consumer starts from the first message; without this setting it waits for the next message.
5. Your poll duration is short; you may need to increase it.
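
Related to the last point: the consumer in the question polls once and then calls records.iterator().next(), which throws NoSuchElementException when the batch is empty. A single short poll may well return nothing while the consumer is still joining its group. Below is a minimal sketch of polling a few times before giving up. PollRetry and pollUntilNonEmpty are hypothetical names, and the Supplier stands in for a call such as consumer.poll(...) whose records have been copied into a list; it is not part of the Kafka API.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: retries a polling function a bounded number of times.
final class PollRetry {
    // Calls pollOnce up to maxAttempts times and returns the first non-empty
    // batch, or an empty list if every attempt came back empty.
    static <T> List<T> pollUntilNonEmpty(Supplier<List<T>> pollOnce, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<T> batch = pollOnce.get();
            if (batch != null && !batch.isEmpty()) {
                return batch;
            }
        }
        return Collections.emptyList();
    }
}
```

The caller should still check the result with isEmpty() before reading the first element, since all attempts can come back empty.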

java Kafka Producer- send large amount of messages

I am trying to send 300,000 messages to a Kafka topic, each about 1,500 bytes in size.
The first 299,996 messages are sent in less than a minute, but then Kafka hangs for a while and doesn't send the rest of the messages. Very strange.
My Kafka producer configuration:
private void initKafka() {
    Properties configProperties = new Properties();
    configProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaLocation);
    configProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
    configProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
    kafkaProducer = new KafkaProducer<String, String>(configProperties);
}
My code:
for (ResponseDocument responseDocument : documents.getDocuments()) {
    try {
        LinkedHashMap<String, Collection<? extends Object>> fields = convertToMap(responseDocument);
        String jsonDoc = objectMapper.writeValueAsString(fields);
        String docId = responseDocument.getFirstValueAsString(".id");
        ProducerRecord<String, String> record = new ProducerRecord<String, String>(kafkaTopic, docId, jsonDoc);
        kafkaProducer.send(record);
    } catch (Exception e) {
        LOGGER.error(ErrorCode.DATA_ACCESS_ERROR, "Failed to send document with .id {0}: {1}", responseDocument.getId(), e.getMessage());
    }
}
I tried playing with the configuration, the number of messages sent (changed to 250,000), and a few other things.
Any ideas?
Thanks in advance.
